Re: Surrogate Keys: an Implementation Issue
- From: "Brian Selzer" <brian@xxxxxxxxxxxxxxxxxxx>
- Date: Thu, 27 Jul 2006 09:09:33 GMT
"paul c" <toledobythesea@xxxxxxxx> wrote in message
news:lWXxg.243067$iF6.132287@xxxxxxxxxxx
Brian Selzer wrote:
"Bob Badour" <bbadour@xxxxxxxxxxxxxxxx> wrote in message
news:PPMxg.21411$pu3.353655@xxxxxxxxxxxxxxxxxxxxxxxxxx
Brian Selzer wrote:
"Bob Badour" <bbadour@xxxxxxxxxxxxxxxx> wrote in messageAre you stupid or just ignorant? What you state above can only happen if
news:X1gxg.15744$pu3.340040@xxxxxxxxxxxxxxxxxxxxxxxxxx
Brian Selzer wrote:Of course it's possible. The key that Bob specifies identifies the
"Bob Badour" <bbadour@xxxxxxxxxxxxxxxx> wrote in messageThe only relevant concern is whether the update is correct. If the
news:XQ5xg.14298$pu3.333248@xxxxxxxxxxxxxxxxxxxxxxxxxx
Brian Selzer wrote:He may be overwriting a change made by another user.
"Bernard Peek" <bap@xxxxxxxxxx> wrote in messageSo? Bob issues an update statement to change some attribute
news:xn0ep1e3sn11oo000@xxxxxxxxxxxxxxxxxxxxxx
In comp.databases.theory Paul Mansour wrote:Untrue! Natural keys are often mutable--especially compound keys.
The second is perhaps more profound. Consider a rollbackIf what you think of as the key is mutable it's not a natural key.
database, or
a database that must provide a complete audit trail of every
change.
For example, the database must provide the answer to "who changed
this SS number from X to Y, and when did they change it? As far
as I
can tell, if there is no way to answer this without an immutable
identifier. ( I suppose you could design the DB to handle
specific
cases, but I'm interested in DBMS with native rollback and audit
trail support.)
Natural keys aren't mutable, at all, ever.
The problem here is that there isn't a usable natural key forThere is always a compelling reason to use surrogates: natural keys
identifying people. So all that's left are surrogates of varying
quality. The SSN is a surrogate that works most of the time. If
you
assign someone a payroll number and tell tham that thay won't get
paid
unless they can quote it then you have a close to immutable key.
But
it's still a surrogate.
I think the original post was part correct. Use a natural key when
you
have one. I'd just add to that, don't use a surrogate unless there
is a
compelling reason for it. Efficiency is unlikely to be a
compelling
argument in most cases.
can change. This makes it difficult--if not impossible--to detect
changes to rows. For example, Bob is preparing to update a
row--that is, he has read the row and is in the process of keying
in a change. During that time, another process updated several rows
in the same table. Unless the key is immutable, when Bob issues
the update, there's no guarantee that the row he's updating is the
same one that was read out.
identified by a logical identifier. As long as the data Bob is
entering is correct, why should he care what happens to any other
attributes?
update is correct, it matters not at all that it overwrites something,
which is presumably no longer correct.
Using logical identity and natural keys, that's not possible. The keyThis problem is magnifiedYou're missing the point. The change Bob's making may change the
if there are rows related via a foreign key constraint because it'sOr you can just write better applications that don't update anything
possible for the referenced row to appear unchanged. So you're
left with either maintaining an exclusive lock on the row until Bob
returns
from the golf outing, or adding additional columns and code in
order to
determine with certainty whether or not a change occurred between
the time that a row was read and the time of the update.
that didn't change. By introducing some 'under-the-covers'
identifying attribute, you create a risk that Bob will change some
data identified by a familiar logical identifier and some other
process in the meantime will associate that identifier with a
different surrogate. Your application will then record the updates
against the wrong logical identifier.
row, but because several changes occurred to the table while Bob was
keying in his change, the row he's about to change may represent some
other entity altogether.
that Bob specifies identifies the entity. And if Bob updates the
entire row, the entire row needs updating. I draw your attention, in
particular, to my observation that one can just write better
applications that don't update anything that didn't change.
entity at the time of the read from the database. By the time that the
write occurs, that same key now refers to a different entity.
one assumes one is using a hidden surrogate. If one uses explicit values
that Bob interacts with, then the value identifies a single entity. That
identity does not change.
From Bob's perspective, what he would expect is that the entity that he's
working on, identified by the key values he used to retrieve it, would
not change; however, since the database is no longer the same when Bob
attempts to commit, the entity that he was working on may either no
longer exist, or may have been changed by another user. During that same
interval, a second entity may have been changed so that by the time Bob
attempts to commit, that second entity may be identified with respect to
the new database state by the exact same key values that Bob read out.
...
I think this is misreading the information principle and perhaps it is due
to falling into the trap of thinking typical lock manager behaviour has
something to do with RT. Even if it's not that, it's a mistake. It
doesn't matter if the rows involved (not necessarily the ones updated)
were deleted twenty times and re-inserted. If the invoking transaction
were required to re-iterate the 'rows' it read whose values it considered
to be crucial for a correct update, then there would less chance of
falling into this trap.
The problem is with the theory, not the mechanisms used to get around its
limitations. As I stated before, the Relational Model doesn't take into
account duration. What do I mean by that? The Relational Model takes into
consideration only two successive database states: the current state, which
is always consistent, and the proposed state, which may or may not be
consistent. These states are separated by a single operation, either a
relational assignment or a multiple assignment. If the proposed state
violates the database predicate, then it is rejected, otherwise, upon
completion of the operation, the proposed state becomes the current state.
The operation occurs instantaneously (at least from a theoretical
standpont), meaning that it has no duration; however, there are no
simultaneous operations, and order is important. (In a concurrent
environment, simultaneous assignments are aggregated into a single multiple
assignment.) Relational assignment is set-based--the new relation value
replaces the old relation value; multiple assignment is a set of relational
assignments and is also set-based--the set of new relation values replaces
the set of old relation values. There are no row-based operations. This
poses several problems. I'll use contrived examples for brevity to
illustrate these problems. To be useful, a database is a representation of
some aspect of reality. Each table is a container for a specific type of
entity, and each row represents a real instance of that type of entity.
Consider the following states for a table that contains people:
Current: Jane Jones Married Proposed: Jane Smith Divorced
Do Jane Jones and Jane Smith represent different people? Or did Jane Jones
get Divorced? Because the only key is mutable, there isn't enough
information to answer that. Another problem involves temporal constraints.
Consider the following states for the people table and a transition
constraint, Single people can't become Divorced:
Current: Jane Jones Married Proposed: Jane Jones Married
Current: Jane Smith Single Proposed: Jane Smith Divorced
Should the proposed state be rejected? Or is it possible that Jane Jones
got Divorced becoming Jane Smith and Jane Smith married Bob Jones? Because
the only key is mutable and because there or no row-based operations, there
isn't enough information to answer that. The lack of row-based operations
means that if all keys are mutable, it is not possible to enforce transition
constraints because there is no way to correlate the rows in the Current
state with those in the Proposed state. Another problem involves the
situation I described above. Add the following states for a children table:
Read: Jane Jones Brian Write: Jane Jones Beth
Read: Jane Jones Lynn Write: Jane Smith Brian
Write: Jane Smith Lynn
When Bob reads Jane Jones' row from the people table, Jane Jones has two
children, Brian and Lynn, but when Bob is ready to update the people table,
Jane Jones only has one child, Beth. Clearly these are two different
people, even though the row in the people table is identical. Because the
only key is mutable, Bob will be updating the wrong row.
It should be obvious that surrogates solve these problems.
p
.
- Follow-Ups:
- Re: Surrogate Keys: an Implementation Issue
- From: Keith H Duggar
- Re: Surrogate Keys: an Implementation Issue
- References:
- Surrogate Keys: an Implementation Issue
- From: Paul Mansour
- Re: Surrogate Keys: an Implementation Issue
- From: Roy Hann
- Re: Surrogate Keys: an Implementation Issue
- From: Paul Mansour
- Re: Surrogate Keys: an Implementation Issue
- From: Bernard Peek
- Re: Surrogate Keys: an Implementation Issue
- From: Brian Selzer
- Re: Surrogate Keys: an Implementation Issue
- From: Bob Badour
- Re: Surrogate Keys: an Implementation Issue
- From: Brian Selzer
- Re: Surrogate Keys: an Implementation Issue
- From: Bob Badour
- Re: Surrogate Keys: an Implementation Issue
- From: Brian Selzer
- Re: Surrogate Keys: an Implementation Issue
- From: Bob Badour
- Re: Surrogate Keys: an Implementation Issue
- From: Brian Selzer
- Re: Surrogate Keys: an Implementation Issue
- From: paul c
- Surrogate Keys: an Implementation Issue
- Prev by Date: Re: Surrogate Keys: an Implementation Issue
- Next by Date: Re: Surrogate Keys: an Implementation Issue
- Previous by thread: Re: Surrogate Keys: an Implementation Issue
- Next by thread: Re: Surrogate Keys: an Implementation Issue
- Index(es):
Relevant Pages
|