Today I want to share my thoughts about XML parsing on Android. XML parsing really is one of the most frequent tasks in Android development, since a lot of web services return their data in XML format.
So, what do we have in Android to parse XML data?
Out of the box we have DOM, SAX and XMLPull parsers. The DOM (Document Object Model) parser is the best known (especially among web developers), but in fact it is worse than SAX and Pull: it's slower and has an awful, verbose (in my opinion) usage syntax.
Just a short example (part of the RSS parser):
try {
    // javax.xml.parsers.DocumentBuilderFactory creates the DOM parser
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = factory.newDocumentBuilder();
    // The whole document is loaded into memory here
    Document dom = builder.parse(this.getInputStream());
    Element root = dom.getDocumentElement();
    NodeList items = root.getElementsByTagName(ITEM);
    for (int i = 0; i < items.getLength(); i++) {
        Message message = new Message();
        Node item = items.item(i);
        NodeList properties = item.getChildNodes();
        for (int j = 0; j < properties.getLength(); j++) {
            Node property = properties.item(j);
            String name = property.getNodeName();
            if (name.equalsIgnoreCase(TITLE)) {
                message.setTitle(property.getFirstChild().getNodeValue());
            } else if (name.equalsIgnoreCase(LINK)) {
                message.setLink(property.getFirstChild().getNodeValue());
            } else if (name.equalsIgnoreCase(DESCRIPTION)) {
                // The description may be split into several text nodes
                StringBuilder text = new StringBuilder();
                NodeList chars = property.getChildNodes();
                for (int k = 0; k < chars.getLength(); k++) {
                    text.append(chars.item(k).getNodeValue());
                }
                message.setDescription(text.toString());
            } else if (name.equalsIgnoreCase(PUB_DATE)) {
                message.setDate(property.getFirstChild().getNodeValue());
            }
        }
        messages.add(message);
    }
} catch (Exception e) {
    // place any exception handling code here
}
As you can see, the structure of this code is not as simple as you would expect for parsing XML as simple as RSS data. In real life you can have a more complex XML structure, and then the DOM parser code will have a lot of weird if/else constructions that turn the entire code into a mess!
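To try the DOM approach outside of a full RSS parser, here is a minimal self-contained sketch. The class name `DomRssSketch`, the method `itemTitles` and the tiny sample feed are made up for illustration; only the `javax.xml.parsers` and `org.w3c.dom` calls are standard API.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper class, not part of the RSS parser shown in this post.
class DomRssSketch {

    // Returns the text of every <title> that is a direct child of an <item>.
    static List<String> itemTitles(InputStream in) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document dom = builder.parse(in); // builds the whole tree in memory
        Element root = dom.getDocumentElement();

        List<String> titles = new ArrayList<>();
        NodeList items = root.getElementsByTagName("item");
        for (int i = 0; i < items.getLength(); i++) {
            NodeList properties = items.item(i).getChildNodes();
            for (int j = 0; j < properties.getLength(); j++) {
                Node property = properties.item(j);
                // Child nodes can also be whitespace text nodes, so check the type.
                if (property.getNodeType() == Node.ELEMENT_NODE
                        && "title".equalsIgnoreCase(property.getNodeName())) {
                    titles.add(property.getTextContent());
                }
            }
        }
        return titles;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample feed, only for this sketch.
        String xml = "<rss><channel><title>Feed</title>"
                + "<item><title>First</title><link>http://example.com/1</link></item>"
                + "<item><title>Second</title><link>http://example.com/2</link></item>"
                + "</channel></rss>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(itemTitles(in)); // prints [First, Second]
    }
}
```

Note that even this reduced version needs two nested loops and a node type check just to collect the titles.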
So, let's forget about the DOM parser and move on to the better alternatives! ;-)
The XML Pull Parser is an interface that defines the parsing functionality provided by the XMLPULL V1 API.
It's faster than DOM and uses less memory.
The same RSS parsing example:
try {
    // android.util.Xml provides a ready-to-use pull parser
    XmlPullParser parser = Xml.newPullParser();
    // null encoding lets the parser detect it from the input
    parser.setInput(this.getInputStream(), null);
    int eventType = parser.getEventType();
    Message currentMessage = null;
    boolean done = false;
    while (eventType != XmlPullParser.END_DOCUMENT && !done) {
        String name = null;
        switch (eventType) {
            case XmlPullParser.START_DOCUMENT:
                messages = new ArrayList<Message>();
                break;
            case XmlPullParser.START_TAG:
                name = parser.getName();
                if (name.equalsIgnoreCase(ITEM)) {
                    currentMessage = new Message();
                } else if (currentMessage != null) {
                    if (name.equalsIgnoreCase(LINK)) {
                        currentMessage.setLink(parser.nextText());
                    } else if (name.equalsIgnoreCase(DESCRIPTION)) {
                        currentMessage.setDescription(parser.nextText());
                    } else if (name.equalsIgnoreCase(PUB_DATE)) {
                        currentMessage.setDate(parser.nextText());
                    } else if (name.equalsIgnoreCase(TITLE)) {
                        currentMessage.setTitle(parser.nextText());
                    }
                }
                break;
            case XmlPullParser.END_TAG:
                name = parser.getName();
                if (name.equalsIgnoreCase(ITEM) && currentMessage != null) {
                    messages.add(currentMessage);
                } else if (name.equalsIgnoreCase(CHANNEL)) {
                    // Nothing of interest after the channel closes
                    done = true;
                }
                break;
        }
        eventType = parser.next();
    }
} catch (Exception e) {
    // place any exception handling code here
}
While it's better than the DOM parser, it still has boilerplate if/else statements that can turn the parser code for complex XML data into a mess.
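If you want to experiment with the pull style on a desktop JVM, the closest standard API is StAX (`javax.xml.stream`). To be clear, this is a different API from `XmlPullParser` and, as far as I know, it is not shipped with the Android SDK; the sketch below only illustrates the same "ask for the next event" loop. The class name `PullStyleSketch` and the method `itemTitles` are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class: a pull-style loop written with StAX,
// used here as a desktop stand-in for XmlPullParser (NOT the same API).
class PullStyleSketch {

    // Returns the text of every <title> found inside an <item>.
    static List<String> itemTitles(String xml) throws Exception {
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        List<String> titles = new ArrayList<>();
        boolean inItem = false;
        // The caller drives the parser: each next() call pulls one event.
        while (reader.hasNext()) {
            int event = reader.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                String name = reader.getLocalName();
                if ("item".equalsIgnoreCase(name)) {
                    inItem = true;
                } else if (inItem && "title".equalsIgnoreCase(name)) {
                    // Reads the element's text and moves to its end tag
                    titles.add(reader.getElementText());
                }
            } else if (event == XMLStreamConstants.END_ELEMENT
                    && "item".equalsIgnoreCase(reader.getLocalName())) {
                inItem = false;
            }
        }
        reader.close();
        return titles;
    }
}
```

The shape is the same as in the Android code above: one loop, one event at a time, and a growing chain of name checks.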
Finally, there is the SAX parser. SAX is an event-based sequential access parser API. It provides a mechanism for reading data from an XML document that is an alternative to the one provided by the Document Object Model (DOM). A big advantage of SAX is lower memory consumption: SAX parsers operate on each piece of the XML document sequentially, while the DOM parser operates on the document as a whole.
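To make the event-based model concrete, here is a minimal sketch that uses the standard `org.xml.sax` callbacks directly. The class name `SaxRssSketch`, the method `itemTitles` and the assumption that the feed has no XML namespaces are mine, for illustration only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class showing raw SAX callbacks.
class SaxRssSketch {

    // Returns the text of every <title> found inside an <item>.
    static List<String> itemTitles(String xml) throws Exception {
        final List<String> titles = new ArrayList<>();

        DefaultHandler handler = new DefaultHandler() {
            private boolean inItem = false;
            private StringBuilder text = null; // non-null only inside an item's <title>

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                if ("item".equalsIgnoreCase(qName)) {
                    inItem = true;
                } else if (inItem && "title".equalsIgnoreCase(qName)) {
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Text may arrive in several chunks, so accumulate it.
                if (text != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("item".equalsIgnoreCase(qName)) {
                    inItem = false;
                } else if (text != null && "title".equalsIgnoreCase(qName)) {
                    titles.add(text.toString());
                    text = null;
                }
            }
        };

        // The parser pushes events to the handler as it reads the input;
        // no document tree is ever built.
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return titles;
    }
}
```

Only the current title text is kept in memory at any moment, which is where the lower memory consumption comes from.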
The Android SDK has two built-in SAX parser implementations. One of them (from the android.sax package) is pretty good: it's faster than DOM, has a better (though still verbose) usage syntax and (its main advantage!) makes it really easy to define an XML structure.
The same sample of the RSS parser could look like this:
public void parse(BufferedReader in) {
    // Note: android.sax expects RootElement to match the document's root tag.
    // This simplified sample treats ITEM as the root; for a full RSS feed you
    // would start from the "rss" element and reach the items via getChild().
    RootElement root = new RootElement(ITEM);
    root.setStartElementListener(new StartElementListener() {
        public void start(Attributes attributes) {
            currentMessage = new Message();
        }
    });
    root.setEndElementListener(new EndElementListener() {
        public void end() {
            // The item is complete, so store it
            result.add(currentMessage);
        }
    });
    root.getChild(TITLE).setEndTextElementListener(new EndTextElementListener() {
        public void end(String body) {
            currentMessage.title = body;
        }
    });
    root.getChild(LINK).setEndTextElementListener(new EndTextElementListener() {
        public void end(String body) {
            currentMessage.link = body;
        }
    });
    root.getChild(DESCRIPTION).setEndTextElementListener(new EndTextElementListener() {
        public void end(String body) {
            currentMessage.description = body;
        }
    });
    root.getChild(PUB_DATE).setEndTextElementListener(new EndTextElementListener() {
        public void end(String body) {
            currentMessage.publicationDate = body;
        }
    });
    try {
        // android.util.Xml feeds the input to the handlers defined above
        Xml.parse(in, root.getContentHandler());
    } catch (Exception e) {
        // place any exception handling code here
    }
}
While the SAX parser code has a few more lines (mainly due to curly brackets and parentheses), it is more readable and can easily be extended to handle complex XML data! For example, imagine that the link node has nested nodes:
Element linkNode = root.getChild(LINK);
linkNode.getChild(LINK_CHILD_1).setEndTextElementListener(
    new EndTextElementListener() {
        public void end(String body) {
            currentMessage.linkChild1 = body;
        }
    });
linkNode.getChild(LINK_CHILD_2).setEndTextElementListener(
    new EndTextElementListener() {
        public void end(String body) {
            currentMessage.linkChild2 = body;
        }
    });
...
As you can see it's easy to define and handle any XML structure without chains of if/else statements.
So, the android.sax parser is my favorite one for Android! :)
A few words about performance: DOM is the slowest one. SAX is slightly faster than Pull.
Android XML parsers performance chart (lower is better):
Big thanks to Shane Conder for composing the chart.
P.S. To keep this article at a readable size, I decided to split my post into two parts. The second part is coming soon. ;-)