Thoughts on Udell's XML Adventures
I just found these two fascinating articles by Jon Udell about using XML as the primary data container for web-based applications: The Document is the Database and XSLT Recipes for Interacting with XML Data. This is *exactly* what I've been playing with, and it's interesting to see what Jon has come up with using his tools (Python and XSLT) as opposed to mine (Java and JSTL XML tags).
I'm not totally against XSLT; I just think that it's not a programming language, which is what many apps - like Cocoon - make it into. In my opinion it's best for transforms only; anything beyond that makes it impossible to develop and maintain. Complex XSLT is akin to complex RegEx: it's like write-only code, because there's no way you'll go back and work on it later once you've stepped out of the process of writing it.
Right at the top of the second article Jon has some interesting thoughts:
In last month's column, "The Document is the Database", I sketched out an approach to building a web-based application backed by pure XML (and as a matter of fact, XHTML) data. I've continued to develop the idea, and this month I'll explore some of the XSLT-related recipes that have emerged.
Oracle's Sandeepan Banerjee, director of product management for Oracle Server Technologies, made a fascinating comment when I interviewed him recently. "It's possible," he said, "that developers will want to stay within an XML abstraction for all their data sources". I suppose my continuing (some might say obsessive) experimentation with XPath and XSLT is an effort to find out what that would be like.
It's true that these technologies are still somewhat primitive and rough around the edges. Some argue that we've got to leapfrog over them to XQuery or to some XML-aware programming language in order to colonize the world of XML data. But it seems to me that we can't know where we need to go until we fully understand where we are.
Jon's right about the current state of XML processing tools from a standards point of view. For example, we definitely need the XQuery programming language to take off so we can get away from doing complex transforms in XSLT and move to doing XML logic in XQuery. Yes, it's *yet another language* to learn, but the benefits will be worth it. For now, though, we're stuck with the tools at hand (though there is a lot of movement on XQuery - I just saw another OSS library the other day...).
As you might have read, all my development now works like this: all the data that comes into my web app is formatted as XML. Whether it's from a file or a DB, by the time it arrives in my Struts Actions (the Controller in MVC) it's all XML. If I need to do something with that data, such as extract the title from a weblog post, I use JDOM's XPath support (which is backed by Jaxen) to grab the title from the XML quickly and cleanly and throw it into the request context:
    // Parse the XML string into a JDOM document
    SAXBuilder saxBuilder = new SAXBuilder();
    Document doc = saxBuilder.build(new StringReader(xml));

    // Pull the post title out with XPath and expose it to the view
    XPath xpath = XPath.newInstance("/document/entry/title");
    String title = xpath.valueOf(doc);
    request.setAttribute("title", title);
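If you don't have JDOM on the classpath, roughly the same lookup can be sketched with the XPath API that ships with the JDK (javax.xml.xpath). This is only an illustration, not the code from the app: the class name TitleExtractor and the sample XML are made up, and it assumes the document follows the same /document/entry/title layout as above.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper, named for illustration only.
class TitleExtractor {

    // Evaluates the XPath against the XML string and returns the string
    // value of the first matching node (an empty string if nothing matches).
    static String extractTitle(String xml) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate("/document/entry/title",
                new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample document in the assumed layout.
        String xml = "<document><entry><title>Hello XML</title></entry></document>";
        System.out.println(extractTitle(xml)); // prints "Hello XML"
    }
}
```

In a Struts Action the returned string would be placed on the request exactly as in the JDOM version, with request.setAttribute("title", title).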
Then I proceed to pass the XML on to my view, implemented as .jsp pages. If it were a simple XML transform, like for an RSS page, I could do the transformation right there in the Action and write the results to the response directly (returning null for the ActionForward). As it is, I pass the entire XML down to my JSP, which uses the JSTL XML tags to logically format the data based on what that web page needs. Mostly it's simple stuff: for example, if you're logged in, you get an additional link to edit or delete that page. But it's still *logic* that would be a bitch in XSLT to work out correctly and maintain. Using the JSTL tags, this is easy and clean - and is *exactly* where I see XQuery being used in the future.
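To make the "simple transform in the Action" case concrete, here is a rough sketch using the JDK's javax.xml.transform API. Everything specific in it is an assumption for illustration: the class name RssTransform, the inline stylesheet, and the /document/entry/title layout carried over from the XPath example. The output is deliberately minimal and leaves out the channel elements a real RSS feed needs.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, named for illustration only.
class RssTransform {

    // Made-up stylesheet: turns each /document/entry into an RSS <item>.
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match='/document'>"
      + "<rss version='2.0'><channel>"
      + "<xsl:for-each select='entry'>"
      + "<item><title><xsl:value-of select='title'/></title></item>"
      + "</xsl:for-each>"
      + "</channel></rss>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to the XML string and returns the result as text.
    static String toRss(String xml) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                new StreamResult(out));
        return out.toString();
    }
}
```

Inside an Action, the string from toRss(xml) would be written to the response and the method would return null, so Struts does no further forwarding. A transform this small stays readable; it's the conditional, per-user logic that I'd rather keep out of XSLT.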
Anyways, that's how I'm doing this sort of development using the same concepts, and it's working *really* well, so I personally think that Jon is on the right track and that the guy from Oracle is spot on. Once we get the tools standardized, living in XML within our apps is going to be the only way to do things.