Re: XML Schema: inheritance with variable order of childs
- From: Pavel Lepin <p.lepin@xxxxxxxxxxx>
- Date: Mon, 05 Nov 2007 10:49:51 +0200
Sven <sven@xxxxxxxxxx> wrote in
<1194019945.748742.281520@xxxxxxxxxxxxxxxxxxxxxxxxxxx>:
The standard recommendations are:
1. Stop wanting that.
Ok, then I have to deal with more complex code for the
xml-export. And the human editors of the xml files are
more restricted.
If your editors are techies, it's perfectly possible to
explain to them that there are certain rules they should
observe. If they are not, they have no business editing raw
XML documents, and you should provide them with an
application-specific editor instead.
If your editors are not techies, and you let them edit raw
XML documents, and try to design a 'loose' schema for their
convenience, they'll break your system six ways till
Thursday. Of course, it's your blood pressure, so feel free
to jump off that particular cliff.
2. Use a more powerful schema definition language.
What do you mean by this?
I mean using a more powerful schema definition language.
Google "XML schema definition languages". There are four
commonly used ones (for some values of 'commonly'): DTDs,
W3C XML Schemata, RELAX NG and Schematron.
DTDs are a holdover from SGML, and the original preferred
method for XML document validation. They're somewhat
similar in expressive power to XML Schemata, but unlike
those use a separate syntax, are untyped and not
namespace-aware, which limits their usefulness.
RELAX NG is an oft-used alternative to W3C's schemata. It is
more powerful in some areas, but reportedly lacking in
others. It's an ISO standard if memory serves, but not
a W3C recommendation.
Schematron is a powerful rule-based constraint checking
language that addresses some of the XML Schema/RELAX NG
shortcomings. It is commonly used in combination with
either of them. It's not a W3C recommendation, but an ISO
standard as well.
Note that *all* of the schema definition languages are aimed
at providing validation for well-structured documents. The
looser your schema is, the more likely you'll need a
general-purpose language to check whether all the gizmos
are in place and all the thingamajigs bound together.
3. Validate on application side, not on parser side.
The application already parses the files, but for offline
usage I would like to have the possibility to check
against a schema, perhaps with a simple xmllint.
So decide what's more important to you: readily available
validation, or DWYM (do-what-you-mean) processing.
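
For what application-side checking might look like, here is a minimal
Python sketch. It assumes a layout this thread only hints at: TextItem
elements that must each hold exactly one Name and one Content child, in
either order. The function names and the rule set are made up for
illustration, not taken from the original application.

```python
import xml.etree.ElementTree as ET

# Assumed rule: each TextItem holds exactly one Name and one Content
# child, in any order. Illustrative only, not the poster's real schema.
REQUIRED_CHILDREN = {"Name", "Content"}


def check_text_item(item):
    """Return a list of problems found in one TextItem element."""
    problems = []
    tags = [child.tag for child in item]
    for tag in sorted(REQUIRED_CHILDREN):
        count = tags.count(tag)
        if count != 1:
            problems.append(f"expected exactly one {tag}, found {count}")
    for tag in tags:
        if tag not in REQUIRED_CHILDREN:
            problems.append(f"unexpected element {tag}")
    return problems


def check_document(xml_text):
    """Parse xml_text and collect problems from every TextItem."""
    root = ET.fromstring(xml_text)
    problems = []
    for index, item in enumerate(root.iter("TextItem"), start=1):
        for problem in check_text_item(item):
            problems.append(f"TextItem {index}: {problem}")
    return problems
```

Child order never enters into it, which is exactly the looseness a
grammar-based schema has trouble expressing. The price is that the rules
now live in code, so xmllint cannot check them offline.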
4. Design a well-structured document:
<temperature scale="celsius">27</temperature>
<sky>cloudy</sky>
The sample provided above is not the original document. It
is just a simplified example to describe the problem.
I think I described a quite well-structured schema already
:-)
You didn't, and my example demonstrates why.
Firstly, in your schema, you're using TextItem elements for
heterogeneous data; therefore, you cannot determine the
expected content model of a TextItem element... without
*looking* at its content. That's bad, and that alone rules
out XML Schemata as your schema definition language.
Secondly, it's impossible to determine the expected content
of a Content element without looking at *surrounding*
content. For validation purposes, that's beyond bad; it's
godawful.
And thirdly, just to round things off nicely, in case of
temperature your Content element contains both the value
and the units used for that value. One might argue it's
workable. I would scream bloody murder the moment I saw
that in our documents.
Another interesting way of dealing with modestly crippled
XML documents is transforming them into something sane
using XSLT. Make the transformation scream and swear if
it runs into something that shouldn't be there, and
you're golden.
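
The same idea, sketched in plain Python rather than XSLT so it runs with
nothing but the standard library. Everything about the input shape is an
assumption: TextItem elements with Name and Content children, a
temperature written as a number followed by C or F, and a made-up
"weather" root for the output. Anything outside those assumptions raises
an error instead of being passed through.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical mapping from a unit suffix to the scale attribute value.
SCALES = {"C": "celsius", "F": "fahrenheit"}
TEMPERATURE = re.compile(r"^(-?\d+(?:\.\d+)?)\s*([CF])$")


def normalise_item(item):
    """Turn one loose TextItem into a strictly structured element."""
    name = item.findtext("Name")
    content = item.findtext("Content")
    if name is None or content is None:
        raise ValueError("TextItem needs both Name and Content")
    name, content = name.strip(), content.strip()
    if name == "Temperature":
        match = TEMPERATURE.match(content)
        if match is None:
            raise ValueError(f"cannot parse temperature: {content!r}")
        result = ET.Element("temperature", scale=SCALES[match.group(2)])
        result.text = match.group(1)
        return result
    if name == "Sky":
        result = ET.Element("sky")
        result.text = content
        return result
    raise ValueError(f"unknown item name: {name!r}")


def normalise(xml_text):
    """Rewrite every TextItem under the root; fail loudly on anything else."""
    source = ET.fromstring(xml_text)
    target = ET.Element("weather")  # hypothetical output root element
    for child in source:
        if child.tag != "TextItem":
            raise ValueError(f"unexpected element: {child.tag}")
        target.append(normalise_item(child))
    return ET.tostring(target, encoding="unicode")
```

The output can then be validated against a strict schema, while the loose
input format stays whatever the human editors are comfortable with.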
The documents will be generated by my application, but can
be modified by a human user. The schema should be a
guideline which modifications are allowed.
Prepare for a world of fun. If you're worrying about your
users being unable to comprehend that you always want Name
element before Content element... you should be worrying
about well-formedness constraints inherent in any kind of
XML processing first. How do you expect them to understand
that XML is case-sensitive, that 'tags' must be properly
closed and nested, that certain characters are off-limits
and others should always be escaped as entities or
character references?
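
To make those well-formedness rules concrete, here is a small Python
sketch that feeds a few hand-written samples to the standard library
parser. The sample strings and the helper name are mine, chosen only to
illustrate the kinds of mistakes listed above.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# Illustrative samples: one mistake of each kind, plus two valid ones.
SAMPLES = {
    "fine": "<sky>cloudy</sky>",
    "case mismatch": "<Sky>cloudy</sky>",
    "bad nesting": "<a><b>text</a></b>",
    "unclosed tag": "<sky>cloudy",
    "bare ampersand": "<sky>sun & rain</sky>",
    "escaped ampersand": "<sky>sun &amp; rain</sky>",
}

for label, text in SAMPLES.items():
    print(f"{label}: {is_well_formed(text)}")
```

Only the first and last samples parse. The other four are rejected by the
parser itself, before any schema gets a chance to look at the document.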
--
"I can't help but wonder if you... don't know a hell of a
lot more about practically every subject than Solomon ever
did."