Python ElementTree XML Tutorial
After a lot of experimenting, I think I’ve found the fastest and simplest module for writing and parsing XML with Python. The module is called ElementTree (etree), and it comes standard with every Python version from 2.5 on. The parser is lightning fast and the API is extremely clean. You won’t have to compromise between simplicity and speed; ElementTree has it all. The only strike against it is that the documentation is scattered all over the place. That’s why I wrote this.
I am using XML as an intermediate file format between my level editor (built on top of the 3D software package Maya) and my game engine. Initially, I went with the minidom XML module because that’s what everyone used and it was well documented. It does work well, but I suspected that I could find something faster. And boy, did I. ElementTree is an order of magnitude faster than minidom: I managed to shave almost 10 seconds off exporting my levels. This is a HUGE benefit for me and is the reason I felt compelled to share this information.
There are two objects you need to worry about when using the ElementTree API:
Element – These are XML elements, so each one holds a label (tag), a dictionary of attributes and a list of child elements (forming the XML tree hierarchy). This is the object you deal with in 95% of your code.
ElementTree – These are ‘wrappers’ around Element objects that provide facilities to output an Element (along with all its children, recursively) to an .xml file. You can also read an XML file into an ElementTree and then get access to the Elements within it.
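To make the split between the two objects concrete, here is a minimal sketch. The tag and attribute names ('level', 'enemy', 'type') are made up for illustration and are not from my exporter; only the ElementTree calls themselves are real.

```python
import xml.etree.ElementTree as xml

# Element: a tag, a dict of attributes, and a list-like sequence of children.
# The tag and attribute names here are hypothetical.
level = xml.Element("level", {"name": "intro"})
xml.SubElement(level, "enemy", {"type": "grunt"})
xml.SubElement(level, "enemy", {"type": "boss"})

print(level.tag)                                  # level
print(level.attrib)                               # {'name': 'intro'}
print(len(level))                                 # 2 (number of direct children)
print([child.attrib["type"] for child in level])  # ['grunt', 'boss']

# ElementTree: a thin wrapper around a root Element, used for file I/O.
tree = xml.ElementTree(level)
print(tree.getroot() is level)                    # True
```

SubElement() creates the child and appends it to the parent in one call, so it can stand in for the separate Element() and append() steps when you build larger trees.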
Get on With It
OK, to get started, import the module. It’s part of Python’s ‘xml’ package:
import xml.etree.ElementTree as xml
Now let’s create an XML file from scratch and write it to disk:
# Create the root element
root = xml.Element('root')

# Create a child element and attach it to the root
child = xml.Element('child')
root.append(child)

# This is how you set an attribute on an element
child.attrib['name'] = "Charlie"

# Now let's write it to an .xml file on the hard drive.
# Open the file in binary mode: write() emits encoded bytes by default,
# and a text-mode file would reject them on Python 3.
with open("c:/test.xml", 'wb') as out_file:
    # Create an ElementTree object from the root element and write it out
    xml.ElementTree(root).write(out_file)
# The with block closes the file for you, like a good programmer
Run the code above and open test.xml. You will see:
<root><child name="Charlie" /></root>
That bit of code represents everything you need to write an XML exporter. There are additional functions for working with existing XML files (recursive searching, inserting elements, etc.). Here is the basic pattern for reading one:
# Parse XML directly from the file path
tree = xml.parse("c:/testFile.xml")

# Get the root node
rootElement = tree.getroot()

# Get a list of child elements with tag == "Books"
# (a bare tag name only matches direct children of the element)
bookList = rootElement.findall("Books")

# findall() returns an empty list, never None, when nothing matches,
# so the loop simply does not run if no "Books" were found
for book in bookList:
    # Do something with your book!
    print(book.tag)
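The recursive searching and inserting mentioned above can be sketched as follows. The XML document is a made-up example, parsed from a string so the snippet runs without a file on disk; treat it as an illustration under those assumptions rather than the one right way to do it.

```python
import xml.etree.ElementTree as xml

# Hypothetical document, parsed from a string so no file is needed.
root = xml.fromstring(
    "<Library>"
    "<Books><Book title='Dune'/></Books>"
    "<Shelf><Books><Book title='Emma'/></Books></Shelf>"
    "</Library>"
)

# findall() with a bare tag only looks at direct children.
direct = root.findall("Books")
print(len(direct))                               # 1

# iter() walks the whole subtree in document order.
titles = [book.get("title") for book in root.iter("Book")]
print(titles)                                    # ['Dune', 'Emma']

# The ".//" path prefix makes findall() search all descendants too.
nested = root.findall(".//Books")
print(len(nested))                               # 2

# insert() places a new element at a given child index.
first_books = direct[0]
first_books.insert(0, xml.Element("Book", {"title": "Ulysses"}))
print([b.get("title") for b in first_books])     # ['Ulysses', 'Dune']
```

Use get() rather than attrib[...] when an attribute might be missing; it returns None instead of raising KeyError.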
There’s some more esoteric information in the API docs found here:
And you can get more detailed information here:
Here’s the lesson: Use ElementTree as your XML parser in Python (assuming you aren’t already). It’s faster, easier and you already have it as part of the Python distribution. No need to touch minidom (and its wacky API) ever again. Enjoy!
Any XML tricks or tips you want to share can go in the comments below.
This entry was posted by kiaran on April 5, 2010 at 1:44 pm, and is filed under Developers, Software Development.