Re: GEDCOM as a database format



Tony Proctor wrote:
I'm not saying that genealogy (and all aspects of organised historical data)
isn't complicated Peter, but that doesn't make it impossible, and it doesn't
prevent the possibility of a bit of "sideways thinking" delivering an
innovative approach to organising such data.

I've previously been involved at the core of a specialist database product
coping with tens of terabytes of data, and several hundred thousand field
types. Issues of reliability, performance, ease of use, were all just as
applicable there.

I think everyone agrees that their data must be safe at all costs. This
applies to avoiding data corruption or data loss on a power cut.

It also includes the operating system crashing for no apparent reason and also the disk on which your database resides going up the swanee.


However it also applies to data outliving any particular database product.
Unfortunately, if I'm using product A, and I take advantage of some of its
extra-nice features, but then the vendor goes bust or I just want to
transfer to product B, I could find that GEDCOM loses some of my extended
data attributes, or that the new product can't represent them properly.

Don't you agree that it could be good for the industry to have a generic
object model that implicitly acted as a spec for new generations of
software? By this, I mean that if the object model addressed all the
intricacies and flexibilities inherent in genealogical, historical, and
event-based data then it would mean your data has extended longevity and we
all have a common vocabulary and frame of reference.

Tony Proctor

I agree that "it could be good", but I believe you are expecting too much, even for the sake of discussion, if you are seeking a total and final definition. I am also suspicious of the word "object". It implies that there would be stuff hidden under the hood, and that this stuff would get in the way of portability etc. This is why I shy away from the idea of an API as the standard and prefer that the raw data fields are visible. So at present we have Gedcom, which sort of does a basic job. My scheme for something better is taking shape in Gendatam, which is also defined in terms of raw data fields. If you look at the Gendatam record formats you will see that most fields are not limited in length (barring some index fields) and most fields do not even have to be present (again barring some index fields). In addition, the scheme can be extended without disturbing existing fields.
The standard I want to see is something not constrained by application program, operating system, or hardware, and which could even, at a pinch, be implementable as a paper-based manual system. This all seems to require a lowest-common-denominator approach in terms of data representation, basically plain text, but one that should not constrain the data itself. In this respect a compliant application program would be required to be able to read and write files of the standard, but what that program does internally, even as a database program, is irrelevant. In that sense a standard API is not required.
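To make the idea of raw, visible fields a bit more concrete, here is a minimal Python sketch. The layout is purely hypothetical (one "FIELD: value" line per field, a blank line between records) and is not the actual Gendatam or Gedcom format; the field names ID, NAME, NOTE and X-COLOUR and the functions parse_records and write_records are invented for illustration. It only shows that a program can read fields of any length, tolerate missing fields, and carry fields it does not know about through to the output unchanged.

```python
def parse_records(text):
    """Parse blank-line-separated records made of 'FIELD: value' lines."""
    records = []
    current = {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the record being built, if any.
            if current:
                records.append(current)
                current = {}
            continue
        # Split on the first colon only, so values may contain colons.
        name, _, value = line.partition(":")
        current[name.strip()] = value.strip()
    if current:
        records.append(current)
    return records


def write_records(records):
    """Write records back out in the same hypothetical plain-text layout."""
    blocks = []
    for record in records:
        lines = [f"{name}: {value}" for name, value in record.items()]
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks) + "\n"


# Hypothetical sample: the second record omits NAME and NOTE and carries
# an extension field (X-COLOUR) that this program knows nothing about.
SAMPLE = """\
ID: I1
NAME: John Smith
NOTE: A note of any length; nothing here limits how long a value may be.

ID: I2
X-COLOUR: red
"""

records = parse_records(SAMPLE)
round_tripped = write_records(records)
```

What the program does with the records in between reading and writing is its own business; the only thing checked here is that the text going out matches the text that came in.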
In terms of history Gendatam is a step back from Gedcom and draws some inspiration from the Gentech project. It is also fairly explicitly "not a database" as far as the definition is concerned, because that immediately imposes problems of portability etc.
That's enough for now
Peter