Re: GEDCOM as a database format



First off... Merry Christmas, Happy Holidays, and a good winter solstice to
all.

"Tony Proctor" <tony_proctor@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx> wrote:

"Dennis Lee Bieber" <wlfraed@xxxxxxxxxxxxx> wrote in message
news:13mrvtst5hh523c@xxxxxxxxxxxxxxxxxxxxx
On Sun, 23 Dec 2007 04:33:37 GMT, JD <jd4x4@<del.this>verizon.net>
declaimed the following in soc.genealogy.computing:


I personally wouldn't want software that ignored something and
didn't at least provide me with a means of dealing with it. But if
it did, as you say it can be transformed with xslt, which is
exactly what I did with my publishing source data. I had several
sources each with slightly

XSLT still requires a known source and destination format; it can't
take an unknown source tag and create a known destination tag with
meaning... Maybe it can produce some sort of blanket output for
unknown tags, but that will quite likely not be a reversible
transformation.
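
As a rough illustration of the "blanket output" problem, here is a minimal
Python sketch. It uses the standard library's ElementTree rather than XSLT
itself, and the tag names (birt, xbap, xcon) and the mapping table are
invented for the example. Known tags are renamed; everything unknown falls
into a generic <note>, and the original tag name is lost on the way.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from one program's tags to a destination schema.
KNOWN_TAGS = {"birt": "birth", "deat": "death"}

def transform(source_xml):
    """Rename known tags; dump unknown ones into a generic <note>."""
    src = ET.fromstring(source_xml)
    dst = ET.Element("person")
    for child in src:
        if child.tag in KNOWN_TAGS:
            out = ET.SubElement(dst, KNOWN_TAGS[child.tag])
        else:
            # Blanket fallback: the original tag name is not preserved.
            out = ET.SubElement(dst, "note")
        out.text = child.text
    return ET.tostring(dst, encoding="unicode")

result = transform(
    "<indi><birt>1850</birt><xbap>1851</xbap><xcon>1865</xcon></indi>"
)
print(result)
```

Both unknown tags come out as <note>, so nothing in the output says which
was which, and the transformation cannot be run backwards.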

differing source schemas, but all of the data was related and I
created my own "final" schema from them, for the use I required.
And, the

You "created"... The software didn't derive a consistent schema...

Who "creates" the schema and transforms for all the many programs
that currently exist?


Whoever finds it relevant. Really though, Tony's point about the "data
model" is where it has to start, imo. But more on that later...


See the above. The "standardization" you refer to doesn't mean that
the data has been changed, only reordered... to YOUR schema.
Software doesn't have to do that alone. YOUR schema can be YOUR
ordering of the data.

To me, said ordering requires prior knowledge of what the meaning of
various tags IS... What if "my" data considers "fourth flood of the
river <x> in the reign of <y>" to be acceptable as a date (okay, even
TMG would consider that a very irregular date). How would your
software treat a source that outputs "<date>....</date>", vs one that
outputs "<date><month>...</month><day>...</day><year>...</year></date>"?

Besides... I'm buying the software to handle the genealogical data
and reporting... I'm not writing my own package in which I have the
option of defining transforms into what I think should be used...
Unless all the producers of said software all agree on what is valid
data, commercial software will not be able to /losslessly/ accept the
data of others.
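
To make the date question above concrete, here is a small Python sketch of
a reader that meets both shapes of <date>. The function name and the
fallback policy are assumptions made for the example, not any product's
behaviour. A structured date can be turned into numbers; a free-text date
can only be carried along as a string that the software does not
understand.

```python
import xml.etree.ElementTree as ET

def read_date(xml_text):
    """Return (year, month, day) for a structured date, else the raw text."""
    elem = ET.fromstring(xml_text)
    parts = [elem.findtext(name) for name in ("year", "month", "day")]
    if all(p is not None for p in parts):
        return tuple(int(p) for p in parts)
    # Free-text date: nothing to interpret, so keep the text unchanged.
    return (elem.text or "").strip()

structured = read_date(
    "<date><month>12</month><day>23</day><year>2007</year></date>"
)
free_text = read_date(
    "<date>fourth flood of the river X in the reign of Y</date>"
)
print(structured)
print(free_text)
```

Both inputs are well-formed XML, yet only the first can be sorted or
compared without prior agreement on what the content means.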


I really don't think there is much disagreement about the actual "core"
data... but that means different things to different people, mainly those
who are stuck on "names" for the data, etc. A person is a person, a date
is a date. It all depends on where you are starting in your "relevancy"
model as to what "extra" bits are attached to the core data.


And should you be in a position (as I was) to need a third
application for the data that would have a somewhat different
output... you could then define your own schema and validate
against it.

How many weekend genealogists are going to even know what an XML
transform is, much less write one to handle one source of data?


They shouldn't have to. That's part of my point for having the software
deal with it in a meaningful & useful context.

So we don't need to perpetuate that by not providing a mechanism
that could facilitate automation, imo.

Well, we could insist that all extant genealogy programs be modified
to refuse to accept any data entry that doesn't have some sort of
source citation, even if it is nothing more than "personal knowledge
of <xyz>" --
Wulfraed Dennis Lee Bieber KD6MOG
wlfraed@xxxxxxxxxxxxx wulfraed@xxxxxxxxxxxxx
HTTP://wlfraed.home.netcom.com/
(Bestiaria Support Staff: web-asst@xxxxxxxxxxxxx)
HTTP://www.bestiaria.com/


Refuse is pretty harsh/final. Giving you an overview of the usage in the
context of the sending schema, and allowing you and/or the software to
automate (via xslt, etc.) its transformation, is preferred.

XML is touted as some sort of panacea. It is an improvement on the
plethora of data formats (in all IT areas) that existed previously,
but it has to be understood for what it is. It is merely a
standardised syntax for representing hierarchical data. That
standardisation therefore only applies to the syntax, not to the
semantics. What this means, in layman speak, is that any XML file is
instantly recognisable as "XML" but it doesn't make the content any
more understandable.

Sure, there are lots of tools for loading/viewing/manipulating XML but
they only know about the syntax, not the semantics. Yes, you can write
your own XSLT (which I have to say is an awful language) but all
those transformations would be doing is manipulating the syntax, e.g.
removing stuff, moving stuff around, extracting stuff, etc. In
principle, this would all be possible with any documented data format,
including GEDCOM, at the expense (& risk) of having to write a little
more of the necessary software yourself.
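
As a sketch of what "a little more software" might look like for GEDCOM,
the Python below reads lines of the form "level [@xref@] TAG [value]" into
a nested structure. The function name and the dict layout are choices made
for this example, and it ignores details such as CONC/CONT continuation
lines. Like an XML parser, it recovers the hierarchy without knowing what
any tag means.

```python
def parse_gedcom(text):
    """Parse GEDCOM lines ("level [@xref@] TAG [value]") into nested dicts."""
    root = {"tag": "ROOT", "value": "", "xref": None, "children": []}
    stack = [root]  # stack[n] is the parent for a level-n line
    for line in text.splitlines():
        if not line.strip():
            continue
        parts = line.strip().split(" ", 2)
        level = int(parts[0])
        if len(parts) > 1 and parts[1].startswith("@"):
            # Line carries a cross-reference id, e.g. "0 @I1@ INDI"
            xref = parts[1]
            rest = parts[2].split(" ", 1) if len(parts) > 2 else [""]
            tag = rest[0]
            value = rest[1] if len(rest) > 1 else ""
        else:
            xref = None
            tag = parts[1]
            value = parts[2] if len(parts) > 2 else ""
        node = {"tag": tag, "value": value, "xref": xref, "children": []}
        del stack[level + 1:]            # drop deeper levels
        stack[level]["children"].append(node)
        stack.append(node)
    return root

SAMPLE = """0 @I1@ INDI
1 NAME John /Smith/
1 BIRT
2 DATE 23 DEC 1850
"""

tree = parse_gedcom(SAMPLE)
indi = tree["children"][0]
birth_date = indi["children"][1]["children"][0]["value"]
print(indi["xref"], indi["children"][0]["value"], birth_date)
```

Moving, removing or extracting nodes from this structure is the same kind
of syntactic manipulation an XSLT would perform on the XML equivalent.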

I firmly believe that a "data model" has to be defined and accepted
first. This subject has come up several times in this group, and links
have been posted here about ongoing projects striving to achieve this.
Once such a data model specification exists then representation of it
in any data format (XML, GEDCOM, some other) is almost a mechanical
operation.
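
A minimal Python sketch of that last step, assuming a deliberately tiny
data model: the Person class, its three fields and the XML element names
are all invented for illustration, not taken from any proposed standard.
Once the model is fixed, each output format is just another serializer
over the same object.

```python
import xml.etree.ElementTree as ET

class Person:
    """Toy data model: one identifier, one name, one birth date."""
    def __init__(self, ident, name, birth_date):
        self.ident = ident
        self.name = name
        self.birth_date = birth_date

def to_xml(p):
    """Serialize the model as XML (element names are hypothetical)."""
    elem = ET.Element("person", id=p.ident)
    ET.SubElement(elem, "name").text = p.name
    ET.SubElement(elem, "birth").text = p.birth_date
    return ET.tostring(elem, encoding="unicode")

def to_gedcom(p):
    """Serialize the same model as GEDCOM-style lines."""
    return "\n".join([
        f"0 @{p.ident}@ INDI",
        f"1 NAME {p.name}",
        "1 BIRT",
        f"2 DATE {p.birth_date}",
    ])

person = Person("I1", "John /Smith/", "23 DEC 1850")
xml_out = to_xml(person)
ged_out = to_gedcom(person)
print(xml_out)
print(ged_out)
```

The hard part is agreeing on what goes into Person; the two functions
underneath it are routine.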


I agree totally. Where I seem to differ with everyone else is that I
think the "data model" has been discussed with a confused mixture of the
actual data element definitions and the use of them in various
hierarchies, as well as with differing attributes and expansions.

I think once XML is used simply as a transport (meaning at least a
basic expandable core data set and (maybe) even a rough hierarchy
defined), then extensions of the data set and differing hierarchies would
emerge naturally.

Tony Proctor




I haven't really begun to evaluate it yet, but on the surface I think
GenoPro may be on the right track. It lacks some traditional display
concepts and possibly needs work in the research data organization and
entry areas, but it appears to be a very flexible, extendable start.
It's at http://www.genopro.com

Its strength, I think, is in its use of remapping data through "reports"
into XML files as well as displays. I don't think it helps much with the
schema & transforms, but I've only just installed the trial.
