XML processing still hard in Groovy

Posted Friday, 22-Feb-2013 by Ingo Karkat

Okay, I'm still a beginner in Groovy. Let's find some resources on the Internet. "Processing XML" on the Groovy site is a great starting place (too bad I found it only later). IBM developerWorks has an excellent introduction (by Scott Davis; kudos to IBM's commitment to software development!). However, it treats XML creation and XML manipulation separately, contrasting each with Java, whereas I need to combine the two. Same on the Groovy site: it's either "Creating XML using Groovy's MarkupBuilder" or "Updating XML with XmlSlurper". So finally, I found my way to the class documentation for MarkupBuilder, XmlSlurper, and the related GPathResult.
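For illustration, the combination I was after might look roughly like the following sketch: build a document with MarkupBuilder, parse it with XmlSlurper, modify it, and serialize it again with XmlUtil. The element and attribute names (order, item, status) are made up for this example. The imports assume Groovy 3 or later, where XmlSlurper lives in groovy.xml; on the Groovy 2.x used in this post it is in groovy.util and needs no import.

```groovy
import groovy.xml.MarkupBuilder
import groovy.xml.XmlSlurper
import groovy.xml.XmlUtil

// Step 1: create a small document (hypothetical element names).
def writer = new StringWriter()
new MarkupBuilder(writer).order(id: '1') {
    item(name: 'apple')
}

// Step 2: parse the generated markup into a GPathResult.
def order = new XmlSlurper().parseText(writer.toString())

// Step 3: manipulate it. XmlSlurper applies changes lazily, so they
// only become visible once the tree is serialized (or re-parsed).
order.@status = 'shipped'
order.appendNode {
    item(name: 'cherry')
}

// Step 4: turn the modified tree back into text.
def result = XmlUtil.serialize(order)
println result
```

Note that querying order.item.size() right after appendNode would presumably still report the original count, because of the lazy evaluation mentioned in the comment.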

Unfortunately, the documentation is thin: just a brief usage example, and most methods have only a one-sentence description. That may be sufficient for Groovy's inner circle, but a beginner needs links to the overall concepts and an explanation of how the classes fit together, as well as detailed specifications of behavior. Because none of the "here's how you do this, here's how you do that" examples fit my use case, I tried to tackle my problem in groovysh (as even the build & execute cycle of our tests was prohibitively slow, on the order of several minutes).

One example of the artificial non-problems created by thin documentation: I misinterpreted the following output:

> groovysh
Groovy Shell (2.1.1, JVM: 1.7.0_09)
Type 'help' or '\h' for help.
groovy:000> new XmlSlurper().parseText('<foo><bar/></foo>')

Huh? Nothing returned? Where's my GPathResult object? It turns out (after much research and head-scratching) that I did get one, but its string representation is the text between the XML markup, and in my XML-as-pure-data-structure there is none, hence the empty string. Who would have guessed that from the documentation of GPathResult.toString():

public java.lang.String toString()

Returns the text of this GPathResult.

    Overrides:
        toString in class java.lang.Object
    Returns:
        the GPathResult, converted to a String
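A minimal sketch of how one can convince oneself that the parse did return something: inspect the result through methods other than toString(). As far as I can tell, name(), children(), and text() are all part of GPathResult; the import assumes Groovy 3 or later (on 2.x, XmlSlurper is in groovy.util and needs no import).

```groovy
import groovy.xml.XmlSlurper
import groovy.xml.XmlUtil

def root = new XmlSlurper().parseText('<foo><bar/></foo>')

// The string form is just the text content, which is empty here.
assert root.toString() == ''
assert root.text() == ''

// The structure is there, though.
assert root.name() == 'foo'
assert root.children().size() == 1
assert root.bar.size() == 1

// Serializing shows the markup that toString() hides.
def serialized = XmlUtil.serialize(root)
println serialized
```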

At least there is some documentation. Scroll over to GPathResult.equals(), which has no documentation at all! This leads poor fellas like the one quoted below to ask for help (without response so far, unfortunately):

odd equals method of GPathResult

The equals method in groovy.util.slurpersupport.GPathResult is :
return text().equals(obj.toString());
Is there any reason for this implementation ?
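To make the consequence of that implementation concrete, here is a small sketch. It assumes equals() still behaves as quoted above (comparing text() against the other object's toString()); the element names are invented, and the import assumes Groovy 3 or later.

```groovy
import groovy.xml.XmlSlurper

def doc = new XmlSlurper().parseText('<r><a>x</a><b>x</b><c/></r>')

// Two different elements count as equal because their text matches.
def sameText = doc.a.equals(doc.b)

// A GPathResult even equals a plain String with the same content.
def equalsString = doc.a.equals('x')

// An empty element equals the empty string.
def emptyEqualsEmpty = doc.c.equals('')

// Only differing text makes two results unequal.
def differs = doc.a.equals(doc.c)

println([sameText, equalsString, emptyEqualsEmpty, differs])
```

So equality here says nothing about element names, attributes, or position in the tree, which is presumably what puzzled the poster.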

And wait, there's more. Anyone care to explain the difference between XmlUtil.serialize(GPathResult) and XmlNodePrinter.print(Node)?!
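As far as I understand it, the two belong to different parsers: XmlSlurper produces a GPathResult, whereas XmlParser produces a groovy.util.Node tree, which is what XmlNodePrinter prints. The sketch below runs both on the same input so the outputs can be compared. Bringing in XmlParser is my assumption about how one gets hold of a Node, and the imports assume Groovy 3 or later (on 2.x, XmlParser, XmlSlurper, and XmlNodePrinter are in groovy.util).

```groovy
import groovy.xml.XmlNodePrinter
import groovy.xml.XmlParser
import groovy.xml.XmlSlurper
import groovy.xml.XmlUtil

def xml = '<foo><bar/></foo>'

// XmlSlurper yields a lazily evaluated GPathResult.
def slurped = new XmlSlurper().parseText(xml)
def viaXmlUtil = XmlUtil.serialize(slurped)

// XmlParser yields a groovy.util.Node tree.
def parsed = new XmlParser().parseText(xml)
def buffer = new StringWriter()
new XmlNodePrinter(new PrintWriter(buffer)).print(parsed)
def viaNodePrinter = buffer.toString()

println viaXmlUtil       // includes an XML declaration
println viaNodePrinter   // just the element tree
```

XmlUtil.serialize() also has an overload that accepts a Node, so the choice of serializer is not strictly tied to the parser.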

I don't mean to bash Groovy here, and I can accept that there's no nice unified object model with which I can both create and manipulate XML (as with the HTML DOM in JavaScript). I guess my conclusion is that although Groovy the language undoubtedly has some nice features that make it far more attractive than Java, the overall ecosystem still matters. I'm a bit shocked to see that core Groovy libraries (ones that ship with Groovy itself; we're not talking about some obscure library that someone open-sourced) are in such an incomplete state. This makes beginners' first steps frustrating and eventually hampers further adoption of the language. With cool languages like Scala and Clojure competing for developers on the JVM, good documentation and clean APIs matter a lot.

Ingo Karkat, 22-Feb-2013
