Using RDBs and XML-DBs

I just saw this link to Berkeley DB XML Getting Started Guide for Java and it made me think again about using an XML repository instead of a relational db for my server-side data which is mostly document centric. From the introduction:

DBXML is an embedded database specifically designed for the storage and retrieval of modestly sized XML-formatted documents. Built on the award-winning Berkeley DB, DBXML provides for efficient queries against millions of XML documents using XPath. XPath is a query language designed for the examination and retrieval of portions of XML documents.

I think this is pretty freakin' cool, Berkeley DB + XML = DBXML. That's pretty awesome. I probably wouldn't have given it much thought a week ago, but I recently ran across this new weblogging tool called Syncato, written in Python, which uses SleepyCat's DB as its base. Syncato is especially interesting because it allows XPath queries in the URL, in a sort of REST-meets-XPath way, and it's really neat. For example: http://www.xmldatabases.org/WK/blog/item[starts-with(pubDate/text(), '2003-08-14')]. Isn't that a mind bomb? Very cool.
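To make that URL a bit more concrete, here is a minimal sketch of what the blog/item part of the query selects. It is not Syncato's code: the sample document, the title element and the function name are all made up for illustration, and it uses Python's standard ElementTree, whose XPath subset has no starts-with(), so the date-prefix test is done in plain Python instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical blog document; only blog/item/pubDate come from the URL above,
# the <title> element and all of the values are invented for this sketch.
BLOG_XML = """
<blog>
  <item><title>First post</title><pubDate>2003-08-14T09:30:00</pubDate></item>
  <item><title>Second post</title><pubDate>2003-08-14T17:05:00</pubDate></item>
  <item><title>Later post</title><pubDate>2003-08-15T08:00:00</pubDate></item>
</blog>
"""


def items_published_on(blog, date_prefix):
    """Rough equivalent of blog/item[starts-with(pubDate/text(), date_prefix)]."""
    # ElementTree's XPath subset lacks starts-with(), so select the items
    # with a path expression and apply the prefix test in Python.
    return [
        item
        for item in blog.findall("item")
        if (item.findtext("pubDate") or "").startswith(date_prefix)
    ]


blog = ET.fromstring(BLOG_XML)
titles = [item.findtext("title") for item in items_published_on(blog, "2003-08-14")]
print(titles)
```

A real XML DB would evaluate the whole expression itself, ideally against an index on pubDate; the point here is just that the URL is nothing more than a filter over item elements.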

The whole web app is really an XML Web Service since access to the XML DB is so transparent. It makes perfect sense, it's *all* XML data now, so why are your presentations trapped in HTML? That's just one type of transform for the data. This is how all server-side development is going to evolve... in my opinion, it's the reality to match the hype of XML.

I actually played with something similar to this using JSTL. It wasn't nearly as flexible as Kimbro's app above because I didn't see that leap of logic to put the XPath in the URL, but it was still pretty neat. What I did was do all my queries to the DB using JSTL SQL tags and draw out the XML by hand. It's actually very nice to do this - easier than HTML because *you* get to make up the tag names. :-) Then what I did in another half of the app was to do the transforms of that XML data when I wanted it using the JSTL XML tags. I would do an import - and point it at the JSP pages, and then either apply an XSL transform, or use the XML tags' logic to loop. Later I decided it was more flexible to generate the XML in an XAO class. The conceptual part, though, was that I was creating an XML application the same way I used to create a normal JSP app, and applying transforms later. It was very educational.
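The shape of that experiment (query the relational DB, write the rows out as XML with made-up tag names, transform later) can be sketched roughly like this. It is not the JSTL code: Python's sqlite3 and ElementTree stand in for the JSTL SQL and XML tags, a plain function stands in for the XSL transform, and the posts table, its columns and the element names are all hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET


def posts_as_xml(conn):
    """Query the relational store and emit the rows in a made-up XML vocabulary."""
    root = ET.Element("posts")
    query = "SELECT id, title, body FROM posts ORDER BY id"
    for post_id, title, body in conn.execute(query):
        post = ET.SubElement(root, "post", id=str(post_id))
        ET.SubElement(post, "title").text = title
        ET.SubElement(post, "body").text = body
    return root


def render_html(posts):
    """One possible transform of the XML; a feed or plain-text view would be another."""
    ul = ET.Element("ul")
    for post in posts.findall("post"):
        ET.SubElement(ul, "li").text = post.findtext("title")
    return ET.tostring(ul, encoding="unicode")


# Hypothetical schema and sample rows, held in memory for the sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO posts (title, body) VALUES (?, ?)",
    [("Hello", "First body"), ("XML DBs", "Second body")],
)

xml_doc = posts_as_xml(conn)
html = render_html(xml_doc)
print(ET.tostring(xml_doc, encoding="unicode"))
print(html)
```

The XML step is the only one that knows about the database, and the HTML step is the only one that knows about presentation, which is what makes it easy to bolt on other transforms later.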

Anyways, seeing these Java bindings for SleepyCat's DBXML made me think about using it for my projects since much of the data is being transformed into XML immediately anyways. I mean, storing up to millions of moderately sized documents accessed via XPath sounds damn useful to me. Add in Lucene and I'm sure it'll be damn fast. I've had these sort of thoughts about Apache's Xindice before, but something about this being based on Berkeley's DB gave me a warm fuzzy.

Now, here are the thoughts I had about this. I'm not exactly sure if XML DBs can live on their own. I mean, I *like* relational DBs and I've been working with SQL for 10 years now, so I'm very comfortable with it. I can see how XML DBs would be great for document-centric data such as weblog posts, comments, wiki pages, etc. But what about user data? Access rights? Lists of things like categories, bookmarks and menu items? What about all that miscellaneous little data? Does it all get stored in individual XML documents, and then get stored in an XML DB, or is this stuff best kept in an RDB? One part of me wants to standardize on one type of database, the other wants to use the best tools for the best jobs.

Conceptually, I look at XML DBs as akin to Lotus Notes databases. If you haven't used Notes, then you probably have no idea, but the idea is that a "note" in Notes is a free-form container of fields. It's like each row in a db can have an arbitrary number of columns, depending on the type of data being stored. In Notes, it's very easy to grok because you have Forms which define the data which can be stored (analogous to a DTD or, better, an XML Schema doc), and you can define Views which can display multiple types of form data, but in rows. Since Lotus Notes was the first development tool I learned how to use, I actually think of *everything* in this way, so going back to XML DBs, which are similar in their storage of free-form data, is like coming home.
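Here is a toy version of that idea, just to show what I mean by free-form notes and a View that lines different kinds of them up in rows. It has nothing to do with the real Notes API; the field names, the sample notes and the view function are invented for the sketch.

```python
# Each "note" is a free-form bag of fields; the "form" field names its shape.
# All of the sample data below is made up.
notes = [
    {"form": "Post", "title": "Using RDBs and XML-DBs", "author": "Russ"},
    {"form": "Comment", "author": "A reader", "body": "Nice post"},
    {"form": "Bookmark", "title": "An example link", "url": "http://example.com/"},
]


def view(notes, columns, forms=None):
    """Flatten notes into rows; a field that a note lacks becomes an empty cell."""
    rows = []
    for note in notes:
        # Optionally restrict the view to certain forms, as a Notes view
        # selection would.
        if forms is not None and note["form"] not in forms:
            continue
        rows.append([note.get(column, "") for column in columns])
    return rows


rows = view(notes, ["form", "title", "author"])
for row in rows:
    print(row)
```

The comment has no title and the bookmark has no author, but both still show up as rows. An XML DB gives you the same looseness, with the DTD or schema playing the part of the Form.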

But still, I can't imagine we're going to see banking data stored in XML DBs any time soon. There's just so much data which requires relational capabilities, like being sure that the data is transactional, or stored efficiently. XML DBs are really a by-product of our ever growing hard drive space. We don't CARE that it's insanely inefficiently stored because we've all got GB to spare. Hell, all the posts and comments on this site and all those on Mobi stored in MySQL come to a whopping 8MB of data. I can store all that on my freakin' PHONE and still have 120MB left over for Java games and MP3s.

Anyways, interesting to see more options for XML Data. It used to be that these sort of DBs were only available on the high-end, like RDBs. SleepyCat's DB is like the MySQL of XML storage, which is a really good thing.

Once again, there's toooo much cool tech to learn. If you have any thoughts on the best way to use XML DBs or RDB/XML DB combos or any links, please give me a clue. Thanks!

-Russ
