Re: GEDCOM as a database format



Tony Proctor wrote:
"Dennis Lee Bieber" <wlfraed@xxxxxxxxxxxxx> wrote in message
news:13mrvtst5hh523c@xxxxxxxxxxxxxxxxxxxxx

On Sun, 23 Dec 2007 04:33:37 GMT, JD <jd4x4@<del.this>verizon.net>
declaimed the following in soc.genealogy.computing:


I personally wouldn't want software that ignored something and didn't at
least provide me with a means of dealing with it. But if it did, as you
say it can be transformed with xslt, which is exactly what I did with my
publishing source data. I had several sources each with slightly

XSLT still requires a known source and destination format; it can't
take an unknown source tag and create a known destination tag with
meaning... Maybe it can produce some sort of blanket output for unknown
tags, but that will quite likely not be a reversible transformation.


differing source schemas, but all of the data was related and I created
my own "final" schema from them, for the use I required. And, the

You "created"... The software didn't derive a consistent schema...

Who "creates" the schema and transforms for all the many programs
that currently exist?


See the above. The "standardization" you refer to doesn't mean that the
data has been changed, only reordered.. to YOUR schema. Software doesn't
have to do that alone. YOUR schema can be YOUR ordering of the data.


To me, said ordering requires prior knowledge of what the meaning of
various tags IS... What if "my" data considers "fourth flood of the
river <x> in the reign of <y>" to be acceptable as a date (okay, even
TMG would consider that a very irregular date). How would your software
treat a program that output something such as "<date>....</date>", vs
"<date><month>...</month><day>...</day><year>...</year></date>"?
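To make the date question concrete, here is a minimal Python sketch. Both element layouts and the read_date helper are hypothetical, invented for illustration; they are not taken from TMG or any real product. It shows that both forms are well-formed XML, yet a reader written for the structured form can only fall back to opaque text for the other.

```python
import xml.etree.ElementTree as ET

# Two hypothetical exports of "the same" field (assumed layouts, not a real schema).
FREE_TEXT = "<date>fourth flood of the river X in the reign of Y</date>"
STRUCTURED = "<date><month>12</month><day>23</day><year>2007</year></date>"

def read_date(xml_text):
    """Return (year, month, day) if the structured children exist, else the raw text."""
    node = ET.fromstring(xml_text)
    parts = [node.findtext(tag) for tag in ("year", "month", "day")]
    if all(p is not None for p in parts):
        return tuple(int(p) for p in parts)
    # No structured children: all we can do is hand back the text unchanged.
    return (node.text or "").strip()

structured_result = read_date(STRUCTURED)
free_text_result = read_date(FREE_TEXT)
print(structured_result)
print(free_text_result)
```

Both inputs parse without error; only the structured one yields values the software can sort or compare.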

Besides... I'm buying the software to handle the genealogical data
and reporting... I'm not writing my own package in which I have the
option of defining transforms into what I think should be used... Unless
all the producers of said software all agree on what is valid data,
commercial software will not be able to /losslessly/ accept the data of
others.


And should you be in a position (as I was) to need a third application
for the data that would have a somewhat different output... you could
then define your own schema and validate against it.


How many weekend genealogists are going to even know what an XML
transform is, much less write one to handle one source of data?


So we don't need to perpetuate that by not providing a mechanism that
could facilitate automation, imo.

Well, we could insist that all extant genealogy programs be modified
to refuse to accept any data entry that doesn't have some sort of source
citation, even if it is nothing more than "personal knowledge of <xyz>"
--
Wulfraed Dennis Lee Bieber KD6MOG
wlfraed@xxxxxxxxxxxxx wulfraed@xxxxxxxxxxxxx
HTTP://wlfraed.home.netcom.com/
(Bestiaria Support Staff: web-asst@xxxxxxxxxxxxx)
HTTP://www.bestiaria.com/


XML is touted as some sort of panacea. It is an improvement on the plethora
of data formats (in all IT areas) that existed previously, but it has to be
understood for what it is. It is merely a standardised syntax for
representing hierarchical data. That standardisation therefore only applies
to the syntax, not to the semantics. What this means, in layman's terms, is
that any XML file is instantly recognisable as "XML" but it doesn't make the
content any more understandable.
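A small Python sketch of the syntax-versus-semantics point, under stated assumptions: the two documents below are made-up exports from two imaginary programs, and the tag names are not from any real schema. A generic parser reads both equally well, but a reader that was only taught the tag name "birth" learns nothing from the second file.

```python
import xml.etree.ElementTree as ET

# Hypothetical exports describing the same person and event.
PROGRAM_A = "<person><name>Ann Lee</name><birth>1901</birth></person>"
PROGRAM_B = "<indi><label>Ann Lee</label><evt kind='b'>1901</evt></indi>"

def tag_names(xml_text):
    """List every element name in document order; this needs syntax only."""
    return [el.tag for el in ET.fromstring(xml_text).iter()]

def birth_year(xml_text):
    """Look up a child called <birth>, the only spelling this reader knows."""
    return ET.fromstring(xml_text).findtext("birth")

tags_a = tag_names(PROGRAM_A)
tags_b = tag_names(PROGRAM_B)
year_a = birth_year(PROGRAM_A)
year_b = birth_year(PROGRAM_B)  # None: the parse succeeded, the meaning did not carry over
print(tags_a, year_a)
print(tags_b, year_b)
```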


As examples of the obvious in Tony's last sentence: I may _recognize_ that a letter is written in Hindi or Russian without having ANY clue what the letter is about. I may even recognize that the paper upon which the letter is written is high-ticket paper but I don't necessarily know from which company or how high the ticket was. Notes from some of Einstein's (or Hawking's) research log are instantly recognizable as being written in plain-text, but not too many of us actually _understand_ most of the work being logged.



Sure, there are lots of tools for loading/viewing/manipulating XML but they
only know about the syntax, not the semantics. Yes, you can write your own
XSLT (which I have to say is an awful language) but all those
transformations would be doing is manipulating the syntax, e.g. removing
stuff, moving stuff around, extracting stuff, etc. In principle, this would
all be possible with any documented data format, including GEDCOM, at the
expense (& risk) of having to write a little more of the necessary software
yourself.
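As a rough illustration of "a little more of the necessary software", here is a Python sketch that splits GEDCOM-style lines into fields and then extracts and removes things, much as an XSLT transform would for XML. It assumes the simple "level [xref] tag [value]" line layout and ignores continuation lines, character sets and everything else a real reader must handle. The sample record and the _GPS tag are invented for the example.

```python
# Invented sample; _GPS stands in for a program-specific custom tag.
SAMPLE = """\
0 @I1@ INDI
1 NAME Ann /Lee/
1 BIRT
2 DATE 23 DEC 1901
1 _GPS 51.5,-0.1
0 TRLR
"""

def parse_line(line):
    """Split one line into (level, xref, tag, value); xref is None if absent."""
    parts = line.split(" ", 2)
    level = int(parts[0])
    if parts[1].startswith("@"):
        xref = parts[1]
        rest = parts[2].split(" ", 1)
    else:
        xref = None
        rest = " ".join(parts[1:]).split(" ", 1)
    tag = rest[0]
    value = rest[1] if len(rest) > 1 else ""
    return level, xref, tag, value

records = [parse_line(l) for l in SAMPLE.splitlines() if l.strip()]

# "Extracting stuff": pull out every DATE value.
dates = [value for _, _, tag, value in records if tag == "DATE"]

# "Removing stuff": drop lines whose tag starts with an underscore.
without_custom = [r for r in records if not r[2].startswith("_")]

print(dates)
print(len(records), len(without_custom))
```

Nothing here knows what a birth or a date means; it only moves text around, which is the same limitation the XML tools have.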

I firmly believe that a "data model" has to be defined and accepted first.
This subject has come up several times in this group, and links have been
posted here about ongoing projects striving to achieve this. Once such a
data model specification exists then representation of it in any data format
(XML, GEDCOM, some other) is almost a mechanical operation.
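A minimal sketch of why the representation step is "almost mechanical" once a model is agreed. The Person class below is a deliberately tiny, hypothetical data model, not any published standard, and the two writers emit an assumed XML layout and GEDCOM-style lines respectively.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET

@dataclass
class Person:
    # Hypothetical model: the hard part is agreeing on these fields and their meaning.
    ident: str
    name: str
    birth_date: str

def to_xml(p):
    """Serialise the model to an assumed XML layout."""
    root = ET.Element("person", id=p.ident)
    ET.SubElement(root, "name").text = p.name
    ET.SubElement(root, "birth_date").text = p.birth_date
    return ET.tostring(root, encoding="unicode")

def to_gedcom_like(p):
    """Serialise the same model to GEDCOM-style level/tag lines."""
    return "\n".join([
        f"0 @{p.ident}@ INDI",
        f"1 NAME {p.name}",
        "1 BIRT",
        f"2 DATE {p.birth_date}",
    ])

ann = Person("I1", "Ann Lee", "23 DEC 1901")
xml_text = to_xml(ann)
ged_text = to_gedcom_like(ann)
print(xml_text)
print(ged_text)
```

Each writer is a few lines of string assembly; neither one had to decide what a person record should contain.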

Total agreement on the data-model will be achieved when If And ONLY IF (IFF) only one person has to be pleased by it. I don't care to record the names of 200 wedding guests, others want that information. Some people want to include the GPS data on the precise place of burial; I figure the name of the cemetery and a place-name is all the precision I need 90% of the time (and the other 10%, I put in notes to myself).

But so long as there is "room" in the market for conflicting views on whether the GPS data is necessary or whether the names of all witnesses (as opposed to only the official witnesses as opposed to only the names of the participants) are necessary ... there's gonna be a need for a data-model so flexible it may as well not exist (see also: Gedcom standard).

IME, YMMV, and so on.

Cheryl