Re: Another view on analysis and ER
- From: dawn <dawnwolthuis@xxxxxxxxx>
- Date: Sat, 8 Dec 2007 10:27:57 -0800 (PST)
On Dec 4, 7:10 pm, JOG <j...@xxxxxxxxxxxxx> wrote:
On Dec 4, 8:33 pm, "David Cressey" <cresse...@xxxxxxxxxxx> wrote:
"JOG" <j...@xxxxxxxxxxxxx> wrote in message
news:58c47eeb-adf0-414a-a1f4-9077039bd7fd@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Genuine question guys. From an E/R perspective (one of the good
variants, that allows relationships to have attributes), if I'm faced
with the following data.
-- Fred married Wilma in Bedrock.
-- Barney and Betty married in Paris.
How do I decide whether I am dealing with a marriage entity or a
marriage relationship?
The literature I'm reading here is telling me that the choice is based
on what 'things' are key to the business. If my business is concerned
with people (tax collection say), 'marriage' is best modelled as a
relationship, whereas if the marriages themselves are my focus
(perhaps I run a church) then its probably better as an entity.
Have I made the right interpretation here, and is there general
agreement here? I am much more comfortable seeing that some variants
allow relationships to themselves have attributes, and that there is
nothing sacred about choices between using relationships or entities,
making it a design decision instead.
Thanks in advance, J.
My answer is that it's subjective. If the subject matter experts all treat
a marriage as a relationship, follow their lead. If the subject matter
experts all treat it as an entity, follow their lead, but insist that
there's a key attroubt that identifies it. (Do the SME's have any such
thing as a "marriage ID" attribute?
Once you switch over from analysis to design, here's what happens: the
attributes that you discovered during analyisis and attached to entities or
relationships will be carried over from your ER model your design model,
which I presume will be a relational model.
The entities and relationships themselves, as such, will all disappear!
Another thing that will carry over is the keys, used to identify instances
of entities in the subject matter (or UoD if you prefer). Sometimes the
keys used by the SME (subject matter experts) are a little too informal and
require "common sense" to disambiguate. That's a special case, and doesn't
affect this discussion.
I'm used to expressing a relational model in terms of tables (relational
tables), but I presume that what follows could be transliterated into terms
of relvars without any difficulty.
Each entity will have a table of its own, with the attributes that pertain
to that entity. The primary key of the table will be the key attribute of
the entity.
Each relationship will have a table of its own, except for a few that can
be piggy backed onto entity tables. The primary key of a relational table
will consists of two or more foreign keys, compound. These tables will
automatically be normalized up to 3NF unless your analysis put an attribute
on the "wrong entity". I'm not sure about normalization beyond 3NF.
The above process is so automatic that you can have software that does it
for you. Indeed, that's what several tools do. They express an ER model
in terms of metadata, and likewise express the relational model in terms of
metadata. And they have a programmed process that will create a relational
model from an ER model. The only tool I know calls the relaional model a
"physical model" and makes it specific to some product like Oracle or DB2,
etc. But that's a trivial detail. The software also turns models into
diagrams and/or create scripts for you.
So where did the entities and relationships go? They disappeared into the
ether! However,
when the application people get around to designing screens and reports,
they can tie each feature back to an original entity or relationship. That
can make the resulting system coherent for the users.
Ok so one might summarize the following steps:
1) initial analysis of business processes and important concepts.
2) Formulation of an initial conceptual model (that is necessarily
slanted to a certain viewpoint of the UoD).
3) Translation into a nicely normalized logical model, that's query
neutral.
4) On demand, extract data back out from the neutral logical model,
shaping it either the original conceptual view, or other conceptual
views as needs arise from new applications.
Great. This all makes perfect sense, and is very clear to boot. A
simple process for creating a thorough yet flexible system. It seems
obvious even, right?
So why on earth would /anyone/ want to drop step 3? I'm at a loss as
to why certain cdt'ers (who are clearly intelligent people) seem to be
advocating this. An absolute loss I tell you.
My answers tend toward pragmatism and evaluating trade-offs. What do
you give up by taking propositions that originate from a certain
perspective and turn those into query-neutral propositions, for
example? This is not just a question of what one might give up in
theory, but about what, practically speaking, typically gets lost.
Order is one aspect of the propositions that is often dropped in an
effort to simplify the logical data model, for example. Also, what
complexities get added when deciding that query neutrality is of
utmost importance? What is the cost of introducing such complexities.
There are at least three significant changes I would like to see
compared to what I have seen with systems designed using #3 above:
1. NF2. nested structures/child tables/relation-valued attributes/
Non-1NF as 1NF was traditionally defined. While keeping most FD best
practices, when there are weak entities wrt the proposition and the
UoD, the benefits of strong coupling of these parts of the proposition
to the rest of the proposition (turning them into only child
structures) can have greater benefits (in productivity and integrity
over time/maintainability) than being able to use the nested structure
as the portal for a query (I know I said that in my way and not a
standard way, so if you have questions, you know where to find me)
2. 2VL. I'm guessing you consider this irrelevant in this context,
because it is not required in theory, but speaking practically,
whereever I have seen these "nicely normalized" structures (or at
least typically), I have also seen a 3VL implementation. Tools that
use 2VL typically also make it easier to query and use propositions
that include nested structures, so those avoiding 3VL are likely to
also ignore that which was formerly known as 1NF with benefits and
without a big downside in so doing.
3. Ordering. Preservation of the ordering of the propositions,
including ordering within child structures can improve understanding
and communication with the end-user. "On your pizza crust put sauce,
cheese, and pepperoni" might or might not be idential in meaning to
the proposition "On your pizza crust put sauce, pepperoni, and
cheese", choosing that example for David, so if we preserve the order,
we might be preserving relevant information even if we are unaware of
that, find it irrelevant, or think it expensive, in our analysis.
Each of these relates to potential improvements in the quality of the
data and the quality of decisions made based on the data, as well as,
potential productivity improvements for developers, and potentially
better quality of the overall software throughout the lifecycle. It
isn't a slam dunk either way, but one should at least recognize that
there are trade-offs.
That's just a start, but I hope it gives you some sense that it might
not be completely, utterly, stupid to build data structures without
what has traditionally been termed a "nicely normalized logical model"
Oh, and best wishes in your upcoming marriage. We are at 31 years, so
don't believe the nay-sayers. smiles. --dawn
I apologize for this repsonse. It's really a lot more than you asked for.
Yes, how dare you respond with such clarity and thoroughness. Shame on
you.
But it's actually easier to use than it is to describe. It may also
oversimplify. The relational model that software constructs may not be the
best relational model that could be designed to deal with the original
problem.- Hide quoted text -
- Show quoted text -- Hide quoted text -
- Show quoted text -
.
- References:
- Another view on analysis and ER
- From: David Cressey
- Re: Another view on analysis and ER
- From: JOG
- Re: Another view on analysis and ER
- From: David Cressey
- Re: Another view on analysis and ER
- From: JOG
- Another view on analysis and ER
- Prev by Date: Re: OT editors (was: Another view on analysis and ER)
- Next by Date: Re: Another view on analysis and ER
- Previous by thread: Re: Another view on analysis and ER
- Next by thread: Re: Another view on analysis and ER
- Index(es):
Relevant Pages
|
|