Re: What is the logic of storing XML in a Database?



On 2007-03-28, Cimode <cimode@xxxxxxxxxxx> wrote:

No it doesn't. As I said, there's nothing that you can do with XML that you
can't do with CSV. You can validate the data by checking that it obeys the
constraints defined in the schema.
Validate from what perspective? To send it? What if your data is already
validated at table level...(supposing the right constraints are in
place)?

I don't know or care about what constraints they have in their database. I
only care about the data they send me.


Validation against a schema
will trap most major errors. It will trap most of the minor errors that
would normally require action by an expensive and extremely bored human being.
In that sense, doesn't a header constitute a schema?

Not really. A schema is an external standard that the sender and receiver
agree on. In theory you could put a copy of the file's syntax in
machine-readable form in each data file. I haven't seen that done anywhere.
So a schema is the standard structure by which the user sending a file
gets in agreement with the receiver right? Can't they do that even
for CSV file? I mean what is the real added value in using XML as
opposed to CSV?

Yes, and in practice CSV formats are usually negotiated between sender and
receiver, in much the same way that XML standards are. But in practice
standards are not usually strictly defined. For instance a lot of the specs
that I've seen haven't specified whether the file uses DOS, Mac or UNIX line
endings.
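
When a spec leaves the line-ending question open, the receiver can accept all three conventions. A minimal Python sketch, assuming the whole file fits in memory and that no quoted field contains an embedded newline; the function name and sample data are hypothetical:

```python
import csv

def parse_csv_any_newline(raw):
    """Parse CSV text whether it uses DOS (\\r\\n), old Mac (\\r) or UNIX (\\n) line endings."""
    # str.splitlines() breaks on \r\n, \r and \n (and a few rarer separators),
    # so the csv module only ever sees one record per string.
    # Assumption: no quoted field contains an embedded line break.
    lines = raw.splitlines()
    return [row for row in csv.reader(lines) if row]

dos = "id,name\r\n1,Acme\r\n"
mac = "id,name\r1,Acme\r"
unix = "id,name\n1,Acme\n"

for sample in (dos, mac, unix):
    print(parse_csv_any_newline(sample))
```

All three samples yield the same rows, so the ambiguity in the spec no longer matters to the receiver.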



Therefore it reduces processing costs and staff turnover.

Errors are rejected by a machine. That usually makes it the sender's
responsibility to check and correct the data. Making that unambiguous saves
a lot of time and endless arguments between business partners.
How does that differ from a CSV with a header?

It doesn't necessarily. As before, it's theoretically possible but I've
never heard of anyone doing it. In essence an XML file is a delimited file
with a header, so if someone set out to design your hypothetical file
structure they could easily end up with XML.
So do you agree that a CSV file with a header can perfectly replace an
XML file with the same usage?

In theory it was possible, but the work to make that happen would have had
to have been done before XML was adopted.

Code to handle XML is standardised and therefore doesn't need to be
rewritten for each individual application. This makes it more reliable and
cheaper to develop and maintain.
How is it standardized? What is a standard for coding XML?

It's standardised in that the code is delivered as part of the operating
system or the development environment. Because everyone is using the same
code it gets more thoroughly tested. If you decided instead to produce an
open standard for CSV files with headers everyone could provide standard
libraries for that too. But they haven't, and don't need to because we
already have XML as specified in the W3C standards.
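
As one illustration of parsing code that ships with the development environment, Python's standard library includes `xml.etree.ElementTree`. The sketch below only checks that a document is well-formed; it does not validate against a schema, which would need an additional library not shown here. The element names and the helper name are hypothetical:

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the standard-library parser accepts the document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # The parser rejects the file; no application-specific code is needed.
        return False

good = "<order><id>42</id></order>"
bad = "<order><id>42</order>"  # <id> is never closed

print(is_well_formed(good))
print(is_well_formed(bad))
```
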
I could see how the structure is validated by W3C standards. But what
about *correctness* of data? Besides, I still have a hard time seeing
how a hierarchical structure would be easier to validate than a table
structure.

It's very difficult to test for correctness of the data but it is possible.
You can for instance send both the value of an invoice and the value for
each line-item. I'm not sure that XML alone can deal with checking that the
sums match. Most systems can be made more reliable by adding the appropriate
redundancy.
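
The invoice redundancy check described above can be sketched in application code that runs after parsing. This is a minimal Python illustration, not a feature of XML itself; the element names (`invoice`, `total`, `line`, `amount`) and the function name are assumptions made for the example:

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical invoice layout: one declared total plus one amount per line item.
INVOICE = """\
<invoice>
  <total>30.00</total>
  <line><description>Widget</description><amount>10.00</amount></line>
  <line><description>Gadget</description><amount>20.00</amount></line>
</invoice>
"""

def totals_match(xml_text):
    """Compare the declared invoice total with the sum of the line-item amounts."""
    root = ET.fromstring(xml_text)
    declared = Decimal(root.findtext("total"))
    # Decimal avoids binary floating-point rounding on currency values.
    computed = sum((Decimal(a.text) for a in root.iter("amount")), Decimal("0"))
    return declared == computed

print(totals_match(INVOICE))
```

A file whose line items do not add up to the declared total fails the check and can be sent back to the sender.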

XML can't detect every error, but can find most.


It is difficult to extend CSV systems beyond the simple flat-file system
with a single record type. Traditionally, at least in the systems I've
worked with, the solution is to denormalise the data from more than one
table. Therefore CSV is usually more verbose than XML and can take up much
So what you are saying is that an XML file takes less space (less
verbose) than a flat CSV file?
But you said the opposite. Just trying to understand what you are
saying...

Usually the XML file is bigger, but I have seen situations where a flat-file
repeats a lot of data, and is therefore bloated.


Besides, could you explain what you
mean by *denormalize data from more than one table*?



One CSV file that I replaced included a 20 character field for the name of
the company sending it in every line. It was always identical in every line
because it was created by joining one record in a name table with multiple
records in another. The worst case situation is that a flat file might be
created from the cartesian product of more than one table.
You can create a structured file that has multiple record structures in it.
So for instance a line with a 0 as the first character represents an order
header, and a line with a 1 as the first character is an order detail. I
have seen files structured this way. But each file type requires its own
schema and processing to write and to read. You could create a generic syntax
and provide standard libraries to process it, and it might look a lot like
XML.
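
A reader for the multi-record layout described above might look like the following Python sketch. Only the leading `0` (order header) and `1` (order detail) markers come from the description; the comma-separated field layout, the sample data and the function name are assumptions:

```python
def parse_orders(lines):
    """Group detail records (prefix '1') under the preceding header record (prefix '0')."""
    orders = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue
        # Assumption: the fields after the type character are comma-separated.
        record_type, fields = line[0], line[1:].split(",")
        if record_type == "0":
            orders.append({"header": fields, "details": []})
        elif record_type == "1":
            if not orders:
                raise ValueError("detail record before any order header")
            orders[-1]["details"].append(fields)
        else:
            raise ValueError(f"unknown record type {record_type!r}")
    return orders

sample = [
    "0ORD001,Acme Ltd",
    "1widget,2",
    "1gadget,5",
    "0ORD002,Bolt plc",
    "1sprocket,1",
]

for order in parse_orders(sample):
    print(order["header"], order["details"])
```

Every such file type needs its own version of this code, which is the maintenance cost the paragraph above points at.
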
I do not quite see how grouping and query correctness, or the
cartesian product, explain how XML is superior to CSV...

XML can transmit the data from two tables in the same file, without having
to repeat data values on each line. You can do that with CSV files, but it
can get messy.
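
To make the comparison concrete, here is a small Python sketch in which a company and its orders travel in one XML document with the company name stated once, and a flat export of the same data repeats that name on every row. The element and attribute names and the function name are hypothetical:

```python
import xml.etree.ElementTree as ET

# Hypothetical document: one row of a company table, three rows of an order table.
DOC = """\
<company name="Acme Ltd">
  <order id="1"/>
  <order id="2"/>
  <order id="3"/>
</company>
"""

def flatten(xml_text):
    """Join the company to each of its orders, as a denormalised flat file would."""
    root = ET.fromstring(xml_text)
    return [(root.get("name"), order.get("id")) for order in root.iter("order")]

for row in flatten(DOC):
    print(",".join(row))
```

The XML form carries the name once; the flattened rows carry it three times, and the repetition grows with the number of orders.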



--
bap@xxxxxxxxxx
In search of cognoscenti