Great article on speeding up the XML conversion in R
I noted in a previous post that I have been using the XML package in R to process an XML export of our database. I used xmlToDataFrame to convert the XML to a data.frame, and I have found it to be remarkably slow. After some Googling, I found a link where the author states that xmlToDataFrame is a generic function and that, if you know the structure of the data, you can leverage that to speed up the conversion.
So, that’s what I did for my data. I think this structure is applicable to similar data structures in XML, so I thought I’d share.
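To make the idea concrete, here is a minimal sketch of a structure-aware conversion. It is not the code from the original post: the XML below, the element names (export, record, id, name, score), and the variable names are all made up for illustration. It assumes a flat layout in which every record carries every field exactly once; if a field can be missing, the columns would no longer line up and this approach would need adjusting.

```r
library(XML)

# Hypothetical export: flat records that all share the same child fields.
xml_text <- '<?xml version="1.0"?>
<export>
  <record><id>1</id><name>alpha</name><score>3.5</score></record>
  <record><id>2</id><name>beta</name><score>4.0</score></record>
</export>'

doc <- xmlParse(xml_text, asText = TRUE)

# Because the field names are known in advance, pull each one out with a
# single XPath query instead of letting a generic converter discover them.
fields <- c("id", "name", "score")
cols <- lapply(fields, function(f) {
  xpathSApply(doc, paste0("//record/", f), xmlValue)
})
names(cols) <- fields

# Every column comes back as character; convert types explicitly afterwards.
fast <- as.data.frame(cols, stringsAsFactors = FALSE)
fast$score <- as.numeric(fast$score)
```

Whether this beats xmlToDataFrame on a given file depends on the size and shape of the export, so it is worth timing both with system.time() on real data before committing to either.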
Let’s look at the data structure. For my data, an example XML would be:
which tells me a few things:
- I’m XML (first line). There are other pieces of information which can be extracted as tags, but…