Re: Key attributes with list values was Re: What are the differences ...KEY
- From: "Brian Selzer" <brian@xxxxxxxxxxxxxxxxxxx>
- Date: Sun, 26 Feb 2006 21:51:50 GMT
You asked for an example, so here goes.
Consider the following set of propositions:
Jane Harper is married.
Jane Smith is single.
And a constraint that states that single people cannot become divorced.
Now, imagine that you're the database, and that you've been presented with
the following set of values to replace the above set:
Jane Harper is married.
Jane Smith is divorced.
Do you reject the update?
The problem is that there is no way for you to know that Jane Harper in the
first set is the same person in the second set. For example, in the
original set, Jane Harper's maiden name may be Smith, and Jane Smith may
have married Paul Harper. It is impossible for you to know because that
information was not provided. Because the update is set-based (and must be,
since either update taken individually would violate the primary key
constraint), and because the only key can change, it is not possible to
enforce the transition constraint because there isn't any mechanism to
correlate the propositions in the first set with the propositions in the
second.
Also see comments inline.
"Marshall Spight" <marshall.spight@xxxxxxxxx> wrote in message
news:1140978737.494702.103570@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Brian Selzer wrote:
"Marshall Spight" <marshall.spight@xxxxxxxxx> wrote in message
What I'm trying to convey is that from one point of view, a list
has identity, regardless of its contents.
I reject this point of view. The system I am building has values
and variables only. There are no pointers, there are no addresses,
and there is no concept of identity. There is only value.
What about the list of operations and materials that are required to
produce a part? That certainly has identity.
Nope. Just a value. Same as an int.
This is not to say that the concept of identity is not consistent.
It certainly is, and useful programming languages have been
built on top of it. It is foundational to OOP. However, useful
systems have been built without it as well; it is not a necessary
concept.
I disagree. It is a necessary concept, not only externally but also
internally.
There are plenty of useful general purpose programming languages
that don't have any concept of identity in them. Mercury, for
example. (And in the non-general-purpose category, SQL.)
So I don't see how you could describe the concept as "necessary."
Useful, arguable, but not necessary. A turing machine doesn't
have any notion of identity. The lamba calculus has no notion
of identity. Come to that, the lambda calculus has no concept
of equality, either. Huh. So I guess neither one is necessary.
I'm not sure we're using the terms in the same way, though.
Do you speak Java?
Integer i = new Integer(1);
Integer j = new Integer(1);
System.out.println(i==j); // tests for identity
System.out.println(i.equals(j)); // tests for equality
Is that how you're using the terms?
In Java, == on a reference type tests for reference equality,
which is to say identity. If there were no reference types,
as in Prolog or SQL or whatever, then there is no identity.
That's the point: a non-1NF attribute is not scalar, hence the ambiguity.
An entity must have identity, otherwise there's no way for the
database, or users, for that matter, to distinguish between them.
That's the whole point of keys.
One distinguishes between values by equality. If two values
are equal, they are the same value. If they are not equal,
they are not the same value. This is true for key values as
well as nonkey values.
Keys work because one can compare values, not because
one can compare identies. That is the fundamental difference
between keys and pointers.
I disagree, and the example above bears me out. The knowledge represented
in a database is subject to the universe of discourse, and as such only
those attributes that belong to that universe are available to the database.
The example above shows that an entity represented in a proposition in one
database state may not necessarily be the same entity represented by a
proposition in a later database state even if the attribute values are
identical.
Members of a set don't have identity.
Yes they do. Is a roll of quarters worth 25 cents? No, and to elucidate
upon my previous point, the attributes that would uniquely identify each
individual quarter may not be relevant within the universe of discourse, but
that doesn't change the fact that there are still 40 quarters.
Because a key value determines all other attribute
values, it identifies an entity.
Sure. This requires only values and equality.
But a there's a problem: a key value may
change over time, so any given key value's ability to determine what was
or
is known about an entity is limited to a specific interval, bounded by
the
time that its value became known by the database and the time that a new
value became known. This imposes limitations on the types of updates
that
can be performed or the types of constraints that can be enforced.
Um, I don't see how it does. If you want to impose specific semantics
on the data, then that might constain how you allow the data to be
updated. But that is true of any constraint. Constraints are semantic
things; values are logical things. 240 is a perfectly legal int, but it
might not be an allowed age for a person.
Have you tried writing a trigger to audit changes to rows in a table that
has only natural keys in SQL Server? Then you've seen the code that's
required to prevent multi-row updates that also affect a column that is part
of the key.
If all
keys can change, then either updates must be singular, that is, must
affect
only one entity of any given type at a time, or no temporal constraint (a
constraint that involves the state of the database at more than one point
in
time) can be enforced.
I don't see why this should be so. Perhaps I'm just not following your
terminology. And anyway, if your domain wants keys that don't change,
just apply a constraint that enforces that.
That is not always possible for natural keys, or even for composite keys.
This is a significant limitation of the Relational
Model with which I am most familiar, but I suspect that the concept
applies
to all other data models, which may have means to overcome it. In the
Relational Model, all updates are set-based, and if all keys are subject
to
change during an update and if the cardinality of the update is greater
than
one, then there's no way to determine which tuple in a new relation value
corresponds to any given tuple in the original relation value.
I don't see how you can use the word "limitation" to describe
what you appear to consider the ability to update too much.
Assuming you add code to reject these updates you don't
like, would you say that you had "removed a limitation"?
The limitation is that you can't write code to reject these updates without
also rejecting perfectly valid updates, unless you do it in the application.
Also, I think you somewhat overstate the case. If I have
a relation of customers with customerid in the range 1 - 1000,
and I decide I want customer ids to start at 1,000,000, I can
"UPDATE Customers set CustomerId = CustomerId + 1000000;"
and there's an update that changes keys and has
cardinality greater than one, and I can still determine
which tuple in the new relation value corresponds to any
given tuple in the original relation value.
But that's because in this case you can detect a pattern between the values.
That is not always possible, and besides, it requires that the system probe
the right-hand side of the assignment in order to find some means to
correlate tuples. Not only is that not practical, as the example above
shows, it is not always possible.
If you want a system that supports identity, you don't want to
be using set theory. There are plenty to choose from, and
they are well-supported and popular!
Yes, I do want to be using set theory. The recognition of identity does not
alter the sound theoretical foundation that the Relational Model provides.
It strengthens it, or better yet, completes it.
It should be obvious that correlation is necessary to enforce
a constraint that involves more than one database state.
It's not obvious to me.
Obviously.
Every proposition must
necessarily be different from every other proposition, because either
something is known, or it isn't: the knowledge contained in a database is
a
set of propositions, not a collection. Thus every proposition has
identity
with respect to the state of the database at any specific point in time,
and
that identity can be revealed as an attribute.
If what you're saying here is "every relation must have at least one
key"
then I agree. If that's not what you're saying, then I don't
understand.
In order to avoid losing
information over time, every new proposition must have a new identity
value.
In this and in the previous part, it appears you are using the term
"identity value" as a synonym for key. Is that correct?
No. It is not. Every proposition in the entire database must be distinct
from every other proposition and thus has identity with respect to the
database, regardless of whether those propositions are organized into
relations, some multi-valued construct or whatever else. An identity value
is a representation of a proposition's identity with respect to the entire
database. If the database is organized into relations, then an identity
attribute is clearly a candidate key on any relation that reveals it, but
conceptually, its value must be distinct throughout the database, not
limited to the scope of a single relation.
By that I mean that new values exist only for propositions that are
completely new to the database rather than to propositions that have been
changed. In other words, something can become known by the database, and
something that is already known can change. The distinction is subtle, I
know, but necessary--especially in a temporal database, but also in one
that
only requires that transitions be constrained.
Perhaps an example is in order.
Marshall
.
- Follow-Ups:
- References:
- What are the differences between the terms, CANDIDATE KEY, PRIMARY KEY, SUPER KEY, COMPOSITE KEY?
- From: Abhishek
- Re: What are the differences between the terms, CANDIDATE KEY, PRIMARY KEY, SUPER KEY, COMPOSITE KEY?
- From: Mikito Harakiri
- Re: What are the differences between the terms, CANDIDATE KEY, PRIMARY KEY, SUPER KEY, COMPOSITE KEY?
- From: Marshall Spight
- Re: What are the differences between the terms, CANDIDATE KEY, PRIMARY KEY, SUPER KEY, COMPOSITE KEY?
- From: Mikito Harakiri
- Re: What are the differences between the terms, CANDIDATE KEY, PRIMARY KEY, SUPER KEY, COMPOSITE KEY?
- From: Marshall Spight
- Re: What are the differences between the terms, CANDIDATE KEY, PRIMARY KEY, SUPER KEY, COMPOSITE KEY?
- From: Jan Hidders
- Re: What are the differences between the terms, CANDIDATE KEY, PRIMARY KEY, SUPER KEY, COMPOSITE KEY?
- From: Marshall Spight
- Key attributes with list values was Re: What are the differences ...KEY
- From: dawn
- Re: Key attributes with list values was Re: What are the differences ...KEY
- From: Marshall Spight
- Re: Key attributes with list values was Re: What are the differences ...KEY
- From: dawn
- Re: Key attributes with list values was Re: What are the differences ...KEY
- From: Brian Selzer
- Re: Key attributes with list values was Re: What are the differences ...KEY
- From: Marshall Spight
- Re: Key attributes with list values was Re: What are the differences ...KEY
- From: Brian Selzer
- Re: Key attributes with list values was Re: What are the differences ...KEY
- From: Marshall Spight
- What are the differences between the terms, CANDIDATE KEY, PRIMARY KEY, SUPER KEY, COMPOSITE KEY?
- Prev by Date: Re: Declarative constraints in practical terms
- Next by Date: Re: Declarative constraints in practical terms
- Previous by thread: Re: Key attributes with list values was Re: What are the differences ...KEY
- Next by thread: Re: Key attributes with list values was Re: What are the differences ...KEY
- Index(es):
Relevant Pages
|