After a lot of experimenting, I think I’ve found the fastest and simplest module for writing and parsing XML with Python. The module is called ElementTree (etree), and it comes standard with Python 2.5 and later. The parser is lightning fast and the API is extremely clean. You won’t have to compromise between simplicity and speed; ElementTree has it all. The only strike against it is that the documentation is scattered all over the place. That’s why I wrote this.

A Wee Story

I am using XML as an intermediate file format between my level editor (built on top of the 3D software package Maya) and my game engine. Initially, I went with the minidom XML module because that’s what everyone used and it was well documented. It does work well, but I suspected that I could find something faster. And boy did I. In my exports, ElementTree turned out to be an order of magnitude faster than minidom, and I managed to shave almost 10 seconds off the time it takes to export my levels. This is a HUGE benefit for me and is the reason I felt compelled to share this information.

There are two objects you need to worry about when using the ElementTree API:

Element – These are XML elements, so they contain the label (tag), a dictionary of attributes and a list of child elements (forming the XML tree hierarchy). This is the object you deal with in 95% of your code.
ElementTree – These are ‘wrappers’ around Element objects that provide facilities to output the Element to an .xml file (along with all its children, recursively). You can also read an XML file into an ElementTree and then get access to the Elements within it.
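
As a quick preview of how the two objects relate, here is a minimal sketch. The tag and attribute names ("level", "prop", "kind") are made up for illustration, and an in-memory buffer stands in for a real file so nothing touches the disk:

```python
import io
import xml.etree.ElementTree as xml

# Element: a tag, a dictionary of attributes and a list of child Elements.
# The "level" and "prop" tags are invented for this example.
level = xml.Element("level", {"name": "demo"})
prop = xml.SubElement(level, "prop")  # creates the child and appends it in one step
prop.set("kind", "crate")             # same effect as prop.attrib["kind"] = "crate"

print(level.tag)     # the label
print(level.attrib)  # the attribute dictionary
print(len(level))    # number of direct children

# ElementTree: wraps a root Element and writes the whole tree out.
tree = xml.ElementTree(level)
buffer = io.BytesIO()  # in-memory stand-in for a file opened in binary mode
tree.write(buffer)
xml_bytes = buffer.getvalue()
print(xml_bytes)
```

So the Element is where the data lives, and the ElementTree is only needed at the point where you read or write a whole document.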

Get on With It

OK, to get started, import the module. It’s part of Python’s ‘xml’ package, so the import looks like this:

import xml.etree.ElementTree as xml

Now let’s create an XML file from scratch and write it to disk:

root = xml.Element('root')

#Create a child element
child = xml.Element('child')
root.append(child)

#This is how you set an attribute on an element
child.attrib['name'] = "Charlie"

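#Now let's write it to an .xml file on the hard drive

#Open the file in binary mode ('wb'); on Python 3, write() sends encoded bytes by default
#The with block closes the file for us, like a good programmer
with open("c:/test.xml", 'wb') as outFile:
    #Create an ElementTree object from the root element and write it out
    xml.ElementTree(root).write(outFile)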

Run the code above and open test.xml. You will see:

<root><child name="Charlie" /></root>

That bit of code represents everything you need to write an XML exporter. There are additional functions for working with existing XML files (recursive searching, inserting elements, etc.). Here is how to parse a file and loop over some of its elements:

#Parse XML directly from the file path
tree = xml.parse("c:/testFile.xml")

#Get the root node
rootElement = tree.getroot()

#Get a list of child elements with tag == "Books"
bookList = rootElement.findall("Books")

#findall() always returns a list (empty if nothing matched), never None
if bookList:
    for book in bookList:
        #Do something with your book!
        print(book.tag, book.attrib)
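
Note that findall() with a plain tag name only checks an element’s direct children. For the recursive searching and inserting mentioned earlier, here is a minimal sketch. It assumes a made-up document parsed from a string with xml.fromstring() instead of a file; the Library and Shelf tags and the book titles are invented for illustration:

```python
import xml.etree.ElementTree as xml

# Hypothetical sample document; tag names and titles are invented.
SAMPLE = """
<Library>
    <Books title="Dune"/>
    <Shelf>
        <Books title="Neuromancer"/>
    </Shelf>
</Library>
"""

rootElement = xml.fromstring(SAMPLE)

# findall("Books") only looks at direct children of rootElement...
direct = rootElement.findall("Books")

# ...while iter("Books") walks the whole subtree recursively.
everywhere = list(rootElement.iter("Books"))
print(len(direct), len(everywhere))

# find() returns the first matching child, or None if nothing matches.
shelf = rootElement.find("Shelf")

# insert() places a new element at a given index among the children.
newBook = xml.Element("Books", {"title": "Snow Crash"})
shelf.insert(0, newBook)

titles = [book.get("title") for book in rootElement.iter("Books")]
print(titles)
```

Unlike findall(), find() really can return None, so that is the call where a None check is worth having.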

There’s some more esoteric information in the API docs found here:
http://docs.python.org/library/xml.etree.elementtree.html
And you can get more detailed information here:
http://effbot.org/zone/element.htm

Conclusion

Here’s the lesson: use ElementTree as your XML parser in Python (assuming you aren’t already). It’s faster, it’s easier and you already have it as part of the Python distribution. No need to touch minidom (and its wacky API) ever again. Enjoy :)

Any XML tricks or tips you want to share can go in the comments below.