Re: Lucid statement of the MV vs RM position?



JOG wrote:
dawn wrote:
[chop]
Not entities. Entities are things. It is a person, place, thing or
event. Person, place, and event (e.g. transaction) are easy enough to
identify. For things, if you cannot hit it with a stick or print it
out on paper, then think twice about whether it is an entity or a
property of an entity.

No, this is way off base - with ERM Chen wrote extensively about
associative entities for example, which are _exactly_ the sort of thing
that cannot be hit with a stick.

Well aware. "Entity" is an overloaded term, of course. If using an
ERD approach, the E's you start with, not those used to handle
many-to-manys, nor the R's that turn into E's are the basics I'm
referring to as entities. I also said to think twice, not to exclude
the possibility. Starting with the basics, look at your problem space
and identify people, places, events, and "things."

But I don't want to get sidetracked by
this so hey, if the colour green as an abstract entity is too out
there, fair enough. Just substitute in something more physical into
those original statements - fruits for the colours for example, or
perhaps academic papers:

If you are going to substitute something I can throw for something I
cannot, then we have a new entity, not just a property of an entity.
While color is a property of our characters, a paper is not. I would
model it as a separate entity in the conceptual data model (if we are
using UML class diagrams and not ORM, for example, then I would
indicate color as an attribute in the Character class and Paper as a
separate class).

paper( paper1, {Barney, Oscar} ) &&
paper( paper2, {Barney}) &&
paper( paper3, {Oscar})

or

person( Barney, {paper1, paper2} ) &&
person( Oscar, {paper1, paper3} )

Everything here is clearly an entity. Both are strong.

Yes.

Both
representations come from those logically identical propositions - so
which is the right one?

Both and neither. These are fine as derivations, but I would model
these propositions the same way you would for implementation, most
likely. It is "best practice" to model strong entities as separate
classes/files/tables/relations/functions/types... However, if the
problem domain is such that our organization could care less about
papers (today) and sees (recognizing we re not all-seeing) nothing on
the horizon that would suggest that it is a big risk factor -- nothing
suggesting our organization will care about any properties of these
papers, we only want to record that Barney has paper1 and paper2 while
Oscar has paper1 and paper3, then in a move from the conceptual to the
logical, we might decide to treat this strong entity as if it were
merely a property. Properties that really are properties are cheaper
than entities. Entities that can be implemented as properties might be
cheaper than entities implemented as entities if you do not have to
change them down the line. What happens if we change our mind in the
future? We "refactor" our solution. We change it to meet our new
requirements. It rarely makes good business sense to design anything
so that it mitigates all possible changes -- risk assessment is key.

Instinct might be to go for the first - papers have authors after all.
But my application's first job is to produce academic resumes - hence
the second is the more appropriate one for that 'interface'. Whichever
is picked, the choice enforces some form of arbitrary hierarchy.

You are correct now that you have changed the question to two strong
entities.

Unless
I encode both - is that preferable?

It might be preferable that both be views of the data (which SQL-92
would not permit, but perhaps people are now coding NF2 views of their
1NF data?), but I would model both as derivations, not base tables.


Which is correct? Neither
and both,

The first, not the second.

because they are artifices. Who knows which will be
appropriate to the user?

The systems analyst better be able to give it a good shot. If they
cannot tell whether green-ness is an entity or a property of an entity
wrt to the organization's requirements, I would be very surprised.

An XML or MV style prejudges that decision,

No, it makes a distinction that those modeling with the RM do not make.
I am an entity and my shoe size is a property of me. It can change.
It could even go away if my feet were amputated (sheesh, apologies for
the example), but only as long as I am an entity an organization cares
about would that property be relevant. If I am not of interest, then,
by definition, my feet are not of interest either. In other words, if
you are interested in shoe size, you are interested in me because my
shoe size is a property of me.

Yes, a weak entity. This is indeed a good example of something that
does fit hierarchically,

happy, happy, joy, joy!

but it *doesn't* negate the fact that there
are many examples that don't at all.

Of course, of course. You don't want to take your entire problem
domain and start willy-nilly nesting relations. It is frustrating to
look at some XML doc designs. Give people, many of whom have been
taught normalization, an inch and they seem to botch it all up. There
are best practices for modeling data when you can use multivalues. You
need to be able to see the HUGE difference between Oscar's color and
Oscar's paper. If I no longer care about Oscar, I no longer care
about his color either.

E-mail addresses are in the fuzzy area where you need to make a call.
It typically makes sense that if a person is no longer of any interest
to your system, then their e-mail address is not either. So, the
e-mail address can be seen as a property of the person. Of course it
is the case that on person could ditch an e-mail address and another
adopt it, so they do each have a life apart from each other, but there
is a low risk for most orgs that their requirements will change enough
to separate e-mail address from person.

Using colours as universals was a
poor example on my part as I feel its given rise to this red-herring.

No, no, no -- it was a GREAT example. You can see I jumped for joy
above when you could see it as I did -- color as a multi-valued
property of a strong entity. Then changing your example and seeing how
I changed with it, might have further helped.

[I realize that if a company is doing a shoe size survey and has no
interest in collecting other data about people, then the shoe size
might be a property of a survey or some other entity.]

and with large shared data, who's future use is unpredictable, it seems
essential to me to avoid that interface-prison.

I disagree, as you have likely guessed. Yes, there are new
requirements regularly that prompt one change or other, but I would
dare say that there are more changes that would push a property like
color to go from single to multi-valued, prompting a schema redesign
(new table) than there are if you make color a multivalued attribute to
start with.

But, two wrongs don't make a right. Cardinality change is something
i've mentioned before as an interesting area in schema change as a
whole - but it does not mean the theoretical solution should be to
imprison the user into a certain view of the data. all best, J.

I agree. Rule of thumb -- model strong entities as separate, uh,
classes? relations? (pick your favorite collective term, and yes I'm
aware that some think it is a mistake to compare these two terms in any
way). Model properties, whether single or multi-valued, as attributes
of these. Make a call, based on risk assessment, whether you should
turn an entity into a property at the risk of needing to make a change
later. Properties that can stay properties are cheaper than entities.
Now if a property was single-valued and becomes multi-valued or vice
versa you flip a bit on it.

Whacha think? --dawn

.



Relevant Pages

  • Re: Hurricane Bertha
    ... kills a large number, by itself, and is well-documented as a risk ... But rather than paying care givers enough that they will stay ... on their own MRI unit, was willing to do volunteer work in researching ...
    (sci.electronics.design)
  • Re: Hurricane Bertha
    ... kills a large number, by itself, and is well-documented as a risk ... But rather than paying care givers enough that they will stay ... huge volunteer cleaning group where IIRC some union took issue, demanding that the state only use people who get paid xx Dollars. ... on their own MRI unit, was willing to do volunteer work in researching ...
    (sci.electronics.design)
  • Re: "No sports mistake is supposed to be fatal"
    ... The skiing federation has taken that thinking to the Olympics, where you can't compete, even if you're the best in your country, if you're not competitive on a world stage. ... You could say that I don't care that they don't care. ... irrelevant to the risk that the people that are professionals ... Because a series of unsafe events, leading to injury claims, is going to scare away the insurance companies. ...
    (rec.bicycles.racing)
  • Re: another thing that pisses me off about pts....
    ... a fat ass, have high cholesterol, and you're bp is not perfect...maybe ... all you have some risk factors) and billed your insurance company 5k ... care or ED care, but pts are out of line with all the complaints.... ... there are incentives for doctors to give patients the "full ...
    (rec.sport.football.college)
  • Re: Long term care facilities
    ... I am now 55 years old I know that my wife was very hesitant to bring out the possibility of separation and felt very guilty about her wish to live a more normal life. ... One very good thing about the long-term care facility is that you do not feel guilty asking for specifically what you want or would like done different. ... we both agree to separate as soon as we finished paying off our mortgage, which it was about three months off. ...
    (alt.support.mult-sclerosis)