Re: Modelling objects with variable number of properties in an RDBMS




Roy Hann wrote:
> "vc" <boston103@xxxxxxxxxxx> wrote in message
> news:1130939293.784399.319040@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
> >
> > I gave you a reason why someone might be forced into using EAV to which
> > you gave a smart-alecky response.
>
> I had intended it to be a mildly witty way of expressing an important point.
> I didn't realize it might seem to lack proper gravity.
>
> > That was tried and rejected primarily for performance reasons. The EAV
> > approach turned out to be faster and various SQL queries turned out to
> > be simpler in comparison to trying to combine results from multiple
> > tables representing a single entity.
>
> We weren't discussing performance, but I'll let that pass.
>
> I find both claims quite astonishing. That is entirely contrary to what I
> would expect. Evidently it was contrary to what they expected too since you
> say they tried one of my suggestions. Since it seems we agree that wasn't
> the expected outcome that makes it an extraordinary claim.

OK, a SQL Server row can be 8K long maximum. Say, a drug formula is a
relation like (some_stable_info_about 2K long,
~5000-6000_lab_data_results). The SQL Server row can be 8K long
maximum which leaves room for 750 double precision columns for storing
trial results. In a multi-table, approach it would amount to 8 tables
partitioned horizontally, by attribute subsets. A simple request like
select formulas for a given date with drug trial parameters in some
given ranges can be expressed as a single SQL statement with an EAV
table and would require 8 unions, or thereabout, with multiple tables.
The optimizer could not build an access plan with acceptable
performance with the multiple table approach. Joins complicated the
problem even more.

Additionally, on a daily basis, about a dozen or more drug trial
parameters were added or dropped from consideration which would mean
potentially modifying all the 9 tables with DDL statements. Clearly,
it creates a severe maintenance headache.

>
> An extraordinary claim always requires extraordinary proof and I think it is
> quite reasonable for me to remain skeptical about this until the evidence
> makes it perverse of me to go on denying it. I can be persuaded by
> evidence. Can you provide full details?


Unfortunately, I cannot provide more detail than above since the design
discussion I mentioned happened about two years ago. However, the
stuff I reproduced to the best of my recollection should show that the
matter is not so black and white as you've tried to paint it.

> Since it was unexpected, I would
> need to be convinced that no ordinary explanation could account for why the
> EAV design was (a) faster, and (b) "various SQL queries turned out to be
> simpler". Examples of the latter would be extremely compelling.

See above.

>
> I am not clear if it it is your opinion that the EAV design is beneficial in
> general.

If you'd carefully read my other contributions to the thread, you'd
have been able to figure out that, in fact, I am strongly opposed to
the EAV approach. Besides, we are talking about a couple of entities
implemented in an EAV manner, the rest of the data model was quite
traditionally relational.

> Is that the claim you you make, or are you just saying that in
> this one time and place it worked better than several other things they
> tried?

The latter. The circumstances I've described are not so rare in
pharmaceutical industry and bioinformatics.
>
> > There is a burden on you because you did not offer a sensible
> > alternative after having rejected EAV in toto.
>
> That is not the way rational discourse proceeds. The burden is never on the
> party that takes the established position (that's what established means).

The implication of your original statement was that EAV is bad in all
the cases. My objection was that the generalization is not fair and I
gave an example when it made sense to represent a table (or a couple of
table) as an EAV set. You've offered first a saying and then a
multitable approach that was not acceptable for the reasons I tried to
describe above. Re: "rational discourse" and the "established
position" see below.

>
> The EAV design has been widely and repeatedly criticised, has numerous clear
> weaknesses with respect to the established approach, and there are plausible
> reasons to doubt it should be intrinsically easier to code for. It is
> contrary to the established position. There is no burden on me to defend
> the established position.

The "established position" means nothing in a logical discourse. Every
situation in a database design world has to be argued on its own merits
-- it's not a perpetuum mobile problem that has been settled once and
for all.

>
> Roy

.



Relevant Pages


Loading