XML parse/build metaphor mismatch



I'm wondering if there's an approach to writing consistent code to read/write
XML data in arbitrary order that I'm simply missing.

It seems to be easy getting stuff out of a DOM via XPath, but it's much
tougher building a DOM document in arbitrary order. Yes - I can get the
parent context element first, using XPath, but then I build custom wrappers
and helpers to simplify building and adding fragments in the correct
namespace, etc. Adding attributes is especially weird and unnecessarily
cumbersome.
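
To make the asymmetry concrete, here is a minimal sketch using Python's
standard xml.etree.ElementTree, which supports only a subset of XPath. The
sample document and the second employee are invented for illustration; the
element names follow the example later in this post.

```python
import xml.etree.ElementTree as ET

# Invented sample document for illustration.
doc = ET.fromstring(
    "<Document><Body><Employees>"
    "<Employee id='123'><Person><Name first='Jane' last='Doe'/></Person></Employee>"
    "</Employees></Body></Document>"
)

# Reading: a single path expression reaches the node we want.
name = doc.find("Body/Employees/Employee[@id='123']/Person/Name")
first = name.get("first")

# Writing the same shape: one call per element, plus one call per attribute.
employees = doc.find("Body/Employees")
employee = ET.SubElement(employees, "Employee")
employee.set("id", "124")
person = ET.SubElement(employee, "Person")
new_name = ET.SubElement(person, "Name")
new_name.set("first", "John")
new_name.set("last", "Smith")
```

The read side is one declarative expression; the write side is procedural
and has to be spelled out node by node.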

I can reuse my own libraries across applications, of course, but I've
essentially invented my own private code island at this point for something
everyone has to do when building XML documents using the DOM. What I end up
with doesn't nicely mirror how we use XPath to read content out, so I have
more duplication than I'd like between the code that reads and writes any
particular document type.

I keep wishing that XPath or XQuery would be more like SQL inasmuch as a
simple SQL expression can add or update a record in a table in a syntax
independent of the database library employed (e.g. "UPDATE employee SET
first_name='Jane', last_name='Doe' WHERE id=123"). Correct me if I'm wrong,
but I don't see any evidence that XQuery or any other standard XML library has
data updating capability like that.

Since I'm still fairly new to XML, I'm presuming there's at least a 50/50
chance that I'm simply making mountains out of molehills, but I see evidence
that perhaps I am not. For instance, see "Lesson 5: Use DOM Wrapper Objects"
at http://www.developer.com/xml/article.php/2194491.

Now - imagine something like this...

document.selectSingleNode("/Document/Body/Employees").modify("insert
Employee[@id:='123'][ Person[ Name[@first:='Jane'][@last:='Doe']
][Notes/self::text():='Hi there'] ]");

The format for insertion roughly mirrors the XPath predicate format, but each
expected node that did not already exist would be created, and any ":="
operators perform value assignment, not comparison (other ideas are possible,
like reusing plain "=" for assignment). Any item we assign a value to that already has
a value (including empty-string attributes) should cause an error because we
said we were inserting. There would be a similar 'update' method for creating
new nodes as required, and also unconditionally replacing any existing node
values.
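
The insert/update distinction can be sketched in a few lines of Python. This
is only an approximation of the idea under stated assumptions: ensure_path
and assign are hypothetical helper names, paths are plain "A/B/C" steps with
no predicates, and only attribute assignment is handled.

```python
import xml.etree.ElementTree as ET


def ensure_path(parent, path):
    """Walk a simple 'A/B/C' path, creating any element that is missing."""
    node = parent
    for tag in path.split("/"):
        child = node.find(tag)
        if child is None:
            child = ET.SubElement(node, tag)
        node = child
    return node


def assign(parent, path, attrs, mode="insert"):
    """Set attributes on the element at `path`, creating nodes as needed.

    mode="insert": raise if any attribute already exists (even if empty).
    mode="update": overwrite existing values unconditionally.
    """
    if mode not in ("insert", "update"):
        raise ValueError(f"unknown mode: {mode!r}")
    node = ensure_path(parent, path)
    if mode == "insert":
        clashes = [key for key in attrs if key in node.attrib]
        if clashes:
            raise ValueError(f"{path} already has values for {clashes}")
    for key, value in attrs.items():
        node.set(key, value)
    return node


employees = ET.fromstring("<Employees/>")
assign(employees, "Employee/Person/Name", {"first": "Jane", "last": "Doe"})
assign(employees, "Employee/Person/Name", {"first": "Janet"}, mode="update")
```

Because there are no predicates here, the helper always reuses the first
matching child at each step; picking out Employee[@id='123'] specifically is
exactly the part the proposed syntax would have to add.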

There are lots more details to work out for a full spec, of course, like
whether to support position predicates, and whether an insert should fail or
push the existing element forward in that case, etc., but it's the start of an
idea, anyway.

Now, let's add to this that the elements would be added to the DOM pretty much
as if you had typed the expanded XML text right into that point in the
document. For instance, inserted elements with no prefixes would be treated
the same as child elements with no prefixes, regardless of what namespace that
implies that they should be in (but allow overriding the default with a
defaultNamespaceURI parameter). For node names with prefixes, the namespaces
should be assigned the same way they would be interpreted by a select,
depending on the settings for the DOM object.
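
The namespace behaviour for unprefixed names might look roughly like this in
Python's xml.etree.ElementTree. The helper names and the urn:example URIs are
hypothetical, and the sketch makes one simplifying assumption: since
ElementTree stores namespaces only as part of each tag, the parent element's
own namespace stands in for the in-scope default namespace.

```python
import xml.etree.ElementTree as ET


def namespace_of(element):
    """Return the namespace URI from an ElementTree '{uri}local' tag, or None."""
    tag = element.tag
    if tag.startswith("{"):
        return tag[1:].split("}", 1)[0]
    return None


def add_child(parent, local_name, default_namespace_uri=None):
    """Add an unprefixed child in the namespace it would get if typed as text.

    Assumption: the parent's namespace approximates the default namespace,
    unless default_namespace_uri overrides it.
    """
    if default_namespace_uri is not None:
        uri = default_namespace_uri
    else:
        uri = namespace_of(parent)
    tag = f"{{{uri}}}{local_name}" if uri else local_name
    return ET.SubElement(parent, tag)


root = ET.fromstring('<Employees xmlns="urn:example:hr"/>')
employee = add_child(root, "Employee")
note = add_child(root, "Note", default_namespace_uri="urn:example:notes")
```

Here employee ends up in the same namespace as its parent without the caller
naming it, which is the "as if typed into the document" behaviour; note shows
the override.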

What say y'all? Am I inventing a solution spec for what would be a non-problem
if I just knew the best DOM coding practices, or is this somewhat on target,
and something we should all be clamoring for?