Re: How to work with m:n relationships?
- From: "vldm10" <vldm10@xxxxxxxxx>
- Date: 20 Feb 2006 17:03:32 -0800
Jonathan Leffler wrote:
vldm10 wrote:
I will use the post which was on comp.database on 25 January 2006,
“Design Query”, as one example for m:n relationships. Here we
should to design a database to track product subscriptions. The
solutions are simplified so that we can concentrate on m:n relationship
and on some dis/adventages. Let following solution be as it is usually
done in the relational databases.
Subscriber ( ssn, name, address, phone, email, login, password,
datefrom, dateto, other-details)
Remind me not to get subscriptions from you :-)
Do NOT use SSN in a DBMS if you can possibly help it, and there is
approximately zero justification for doing so with a set of product
subscriptions - they have nothing to do with government sponsored
retirement schemes, nor anything to do with the IRS. [Internal Revenue
Service in the USA - the governmental organization charged with
collecting federal taxes.]
If you must store the SSN, do not use it as a key - generate a surrogate
and use the surrogate as the key. If necessary, use an MD5 or better
hash of the SSN - though since there are fewer than 1 billion SSNs (they
are a 9-digit number), it is not too hard to precompute the possible
values for the hash values, so it is not a very good idea to use a hash.
If you must store the SSN, do not store it unencrypted.
Product ( product_code, name, price, other_details)
SubscriberProduct ( ssn, product_code, type_of_subsription, date_from,
date_to )
Keys are ssn, product_code and (ssn, product_code) respectively.
Sometime, you'll explain why you don't permit a subscriber to
resubscribe for a second period. There are probably reasons for this,
but it isn't immediately obvious to my why.
I will
compare above solution with the following (which I defined in
www.dbdesign.com):
SubscriberSt (Key1, ssn, name, address, phone, email, login, password,
datefrom, dateto, other_details)
ProductSt (Key2, product_code, name, price, date_from, date_to,
other_details)
SubscriberProductSt ( Key3, Key1, Key2, type_of_subsription, date_from,
date_to )
Here every Key is one column generated key. This solution has suffix
St and I will call it St-solution. So, Keys are only difference between
these two solutions. We will assume that the entities and the
relationships can change their own states.
Well, don't forget to declare the other candidate keys too - the
original keys. You've got two keys per table - the new, single column
ones and the old multi-column ones.
1.
Let Subscriber and SubsriberSt now get values:
Subscriber1 = ( s1, n1, a1, p1, e1, l1, ps1, df1, dt1)
SubscriberSt1 = (k11, s1, n1, a1, p1, e1, l1, ps1, df1, dt1)
Let now Subscriber1 gets a new phone number. We will need new key for
Subscriber.
Subscriber2 = ( s1, n1, a1, p2, e1, l1, ps1, df2, dt2) and the new key
is (ssn, df)
New key? Why? The key is SSN (see your opening statement) and the SSN
has not changed - so you are somewhat confused about what your example
is supposed to be.
SubscriberSt2 = (k12, s1, n1, a1, p1, e2, l1, ps1, df2, dt2)
(St-solution can support this change)
Either there is no need for the change of key from k11 to k12 or you are
attempting to do something that you've not stated you're trying to do,
such as track the history of changes. If you're doing that, you need to
look at Temporal Data and the Relational Model by Date, Darwen and
Lorentzos [TDRM].
You're already off track - I wonder if I should really analyze the rest
of the discussion.
Now we can ask ourselves, what will be if any other attribute (for
example email) change its value at same df2 time and has dateto not =
dt2. Subscriber2 will get the duplicate key, while SubscriberSt2 will
work normally.
This is a question for you to define - there is no inherent problem. If
your dates have a granularity of 1 day, then multiple changes on the
same day can be made, all with the same date from. I'm left wondering
what date to is set to, and whether you ever changed the value in the
previous record to the date when it became invalid.
First solution uses time, as a key’s attribute, compound key etc.
Well, you didn't mention that in the definition - so, please start over,
explaining what you're up to. But don't do that until you've read
[TDRM] or some similar books.
Well, maybe you can buy something for reading. Did you considered maybe
to buy "Teach Yourself Beginning Databases in 24 hours" from the Sams
edtion.
Is
there an explanation which will say when to use compound key or when to
use surrogate key. What are the meanings of these keys, etc?
Can we say that time is property of an entity? And related with this,
we can set another question. What does it mean that two entities are
identical? Is it means that they are one same thing or they are the two
things perfectly similar? In the second case their entity’s
properties are same but their system’s properties can be different
(like time/space/position properties). Here my main objection is
regarding construction of a key. Can we use principle: if for n
attribute’s values, two rows are same, then I can always find n+1
attributes so that rows are different? Can I say that for any n
attributes, I can always set two rows to be with same values?
These issues are dealt with thoroughly in [TDRM] and in 'Databases,
Types and the Relational Model: The Third Manifesto, 3rd Edn by Date and
Darwen [TTM], due out at the end of this week.
2.
Now let St-solution be:
SubscriberSt1 = (k11, s1, n1, a1, p1, e1, l1, ps1, df1, dt1)
SubscriberSt2 = (k12, s1, n1, a1, p1, e2, l1, ps1, df2, dt2)
ProductSt1 = (k21, p_c1, name1, price1, df3, dt3)
SubscriberProductSt1 = (k31, k11, k21, type_of_sub1, df4, dt4)
Here as addition to the referential integrity, it should be added that
PK-FKs should
correspond to each other by their states.
I don't understand what you're saying there.
3.
Microsoft ADO.NET has possibility to work with the disconnected
objects. We can use these objects to connect to a database, now we can
get a data from the database, set them in the disconnected objects and
then we can disconnect from the database. When we are disconnected we
can update these data, for example we can update a key. Now we can
connect to the database again and update it. Problems arise when there
are the two users which are working in disconnected mode with the same
row and one user updates the key, but the other one needs this key.
Yes - it is a bad idea to update keys. But you need to have proper
definitions of your keys, and the properties you are hoping to model.
There is Microsoft’s solution for these kinds of problems but it can
be very complicated and there are many combinations. The St-solution
which is shown above (in part 1 and 2) can handle these problems with
ADO.NET easy. In fact St-solution doesn’t have updates at all. The
difference is that ADO.NET solution is based on programming theory.
St-solution is based on DB theory.
ADO.NET is great step, but maybe DB theory can have more appropriate
solutions related to programming languages.
The St-solution needs to update the old 'date to' columns unless your
system is prescient and can predict at insert time when a given
subscriber is going to change their address, phone number or whatever.
4.
In his book “Writings 1994-1997” on page 554, C.Date writes about
duplicate rows as well as E.F.Code’s opinion about this. C. Date
writes that G.W.Leibniz’s The Principle of Identity of
Indiscernibles is the basic why relations don’t contain duplicate
members. But it seems that results from Quantum Mechanics show that
Leibniz’s principle is false.
Any URLs you'd care to give as pointers to follow?
As an important part of his philosophical theory G.W.Leibniz defined
the law which is about relationship between entities and properties. In
the notation of symbolic logic Leibniz’s law can be written as:
x=y ↔ ∀F(Fx ↔ Fy)
where x, y are objects and Fx is property of an object x.
(For those who are interested in Leibniz’s Law, there is for example
the Princeton University web site, where some ideas related to
Leibniz’s Law are shown with the great imagination.
http://www.princeton.edu/~jimpryor/courses/intro/notes/timetravel.html
http://www.princeton.edu/~jimpryor/courses/intro/notes/personal-id.html
)
Those may be helpful - do they cover the quantum mechanics?
To not be misunderstood, I believe in Relational DB, but maybe some
things can be “updated”, especially the things which are related to
the new technologies. For example the things related to xml, web
services, knowledge representation etc.
--
Jonathan Leffler #include <disclaimer.h>
Email: jleffler@xxxxxxxxxxxxx, jleffler@xxxxxxxxxx
Guardian of DBD::Informix v2005.02 -- http://dbi.perl.org/
Vladimir Odrljin
.
- References:
- How to work with m:n relationships?
- From: vldm10
- Re: How to work with m:n relationships?
- From: Jonathan Leffler
- How to work with m:n relationships?
- Prev by Date: Re: Database design
- Next by Date: Re: Database design
- Previous by thread: Re: How to work with m:n relationships?
- Next by thread: Re: How to work with m:n relationships?
- Index(es):
Relevant Pages
|