I want to parse an XML file with the following structure:
    <?xml version="1.0" encoding="utf-8"?>
    <toplevel fileformat = "config">
      <objectlist objecttype = "type1">
        <o><a>value111</a><b>value121</b><c>value131</c></o>
        <o><a>value112</a><b>value122</b><c>value132</c></o>
        ...
      </objectlist>
      <objectlist objecttype = "type2">
        <o><a>value21</a><b>value22</b><c>value23</c></o>
        ...
      </objectlist>
      <objectlist objecttype = "type3">
        <o><a>value31</a><b>value32</b><c>value33</c></o>
        ...
      </objectlist>
      ...
      <objectlist objecttype = "typen">
        <o><a>valuen1</a><b>valuen2</b><c>valuen3</c></o>
        ...
      </objectlist>
    </toplevel>

I need the data from one node only, e.g. 'objectlist objecttype = "type3"'. It may not be the node in the 3rd position, so I have to select it based on its name. Finally, the children of each node (a, b, c) should be stored in a data frame.
- How can I retrieve the node?
- How can I extract the child data into a data frame?

Any ideas? Thanks in advance!
Use the XML package to parse the XML:

    library(XML)
    ### load XML
    d <- xmlTreeParse("test.xml")
    top <- xmlRoot(d)

Use an XPath query to get what you need, e.g. the objectlist nodes with the objecttype='type3' attribute:
    n <- getNodeSet(top, "//objectlist[@objecttype='type3']")
    > n
    [[1]]
    <objectlist objecttype="type3">
     <o>
      <a>value31</a>
      <b>value32</b>
      <c>value33</c>
     </o>
    </objectlist>

Convert the structure inside each object into a matrix:
    m <- lapply(n, function(o) t(sapply(xmlChildren(o), function(x) xmlSApply(x, xmlValue))))
    > m
    [[1]]
      a         b         c
    o "value31" "value32" "value33"

You can combine all of them (i.e. if you have multiple matching objectlist objects) into a data frame:
    d <- as.data.frame(do.call("rbind", m))
    > d
            a       b       c
    o value31 value32 value33
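
For anyone who wants to try this without a test.xml on disk, here is a minimal self-contained sketch of the same idea. It is a variation, not the code above: it parses an inline string with xmlParse (internal nodes) instead of xmlTreeParse, and its XPath selects the o records directly rather than the enclosing objectlist. The sample document is a cut-down copy of the one in the question with the closing c tags corrected, and the names xml_text, records, rows and df are made up for illustration.

```r
library(XML)

# Cut-down sample in the shape of the question's file (illustrative only).
xml_text <- '<toplevel fileformat="config">
  <objectlist objecttype="type1">
    <o><a>value111</a><b>value121</b><c>value131</c></o>
    <o><a>value112</a><b>value122</b><c>value132</c></o>
  </objectlist>
  <objectlist objecttype="type3">
    <o><a>value31</a><b>value32</b><c>value33</c></o>
  </objectlist>
</toplevel>'

# Parse the string into an internal document so XPath can be run on it.
doc <- xmlParse(xml_text, asText = TRUE)

# Select the <o> records by the objecttype attribute, not by position.
records <- getNodeSet(doc, "//objectlist[@objecttype='type3']/o")

# Each record becomes a named character vector (a, b, c); rbind stacks
# them into a matrix with one row per <o>.
rows <- lapply(records, function(o) xmlSApply(o, xmlValue))
df <- as.data.frame(do.call(rbind, rows), stringsAsFactors = FALSE)

print(df)
```

Selecting the o nodes directly gives one row per record without the nested lapply/sapply. The sketch assumes every o has the same children in the same order; if they differ, the rows would need to be aligned by name before binding.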