Re: integer id columns for all tables
- From: "JOG" <jog@xxxxxxxxxxxxx>
- Date: 30 Aug 2006 09:08:12 -0700
Alexandr Savinov wrote:
garhone schrieb:
Hi,
Someone with greater expertise than I recently told me that it is best
to have an integer primary-key for all tables in a database, even if
the table already has some non-integer primary key, or some sort of
composite primary key. If there is already a non-integer key, then
create a new sequence column and set that as the primary key.
Does anyone have an opinion on this and why?
Here are a number of principles for designing identifiers:
- An entity may well exist without properties but not without identifiers.
gibberish surely.
- An identifier is the first thing to be created before actually the
entity itself is created. In other words, we create an identifier and
only after that (and if it is really needed) the corresponding entity is
created.
We create identifiers do we? Not observe them? Unghh.
- Identifiers have to be constant (in contrast to properties). If you
use a kind of system identifier then you are guaranteed then nobody can
change it or otherwise misuse.
No, we identify items via their properties. It seems a common problem
in IT for people to not think about /how we recognise distinctions in
the real world/ (this is where our propositions come from after all)
- Identifiers should not be meaningful because they are intended to be
manipulated by the software rather than by users (with the exception of
some well known identifiers used for bootstrapping). So identifiers
should be hidden from the users.
Sigh. Does this count as trolling now?
- Identifiers should be defined before any properties exist. Such a
structure will have only identifiers for accessing empty entities. After
that we can add meaningful properties. This discipline can ensure the
separation between these two concerns.
Double Sigh.
- If you need uniqueness then use the corresponding constraint.
Frequently the need for uniqueness in a property leads to the conclusion
that it should be used as an identifier.
Not if one realises that uniqueness is not the only quality a good
identifier has. See bob's informative list in his post.
Assume that there are two tables Companies and Products with
many-to-many relationship implemented via CP table. When we create this
join table, it automatically gets some identification means for its
elements. Or this format for its identifiers has to be provided
manually, for example, as some auto-integers (there are not properties
yet). *After that* we can add some properties. In this example we want
to add two fields the first pointing to a product and the second
pointing to a company. (Notice again that we keep identifiers and
properties separately.)
Alternatively, we might want to *optimize* this representation and
combine two roles: identification and characterization. In this case we
remove our identifier column and then say that the two properties
<Company, Product> will identify each element. (Notice that the question
is not if we need to *add* an integer identifier but rather if we can
*remove* the existing identifier and use our properties instead of it.)
We hope that such records will take less space. It is not always so. The
thing is that these records can be referenced by other records and then
we need to store two fields instead of only one. Thus records themselves
will really take less space but other records will need more space for
longer identifiers.
Now assume that somebody wants to change one of the two properties. Here
we need to answer the main question: are these columns identifiers or
characteristics of the entity. Since we combined them, the answer is
twofold. So every new developer of the system will give his own answer
with the corresponding consequences.
more entity/proposition confusion.
Another problem is that it not an exception when two records with the
same pair of <Company, Product> may exist. Indeed, why not? These are
only properties and they can take any values. Or assume that in future
this join table will also need to reference more records as its
characteristics. Should I add them to the existing two-field identifiers
or should I keep them separately. If I add them, then I will have to
reorganize the whole model. If not, then for some unknown reasons two of
three fields will be used for identification while the third will be
not. It does not make sense because all the fields have equal role. Here
again keeping separately properties and identifiers will prevent you
from their misinterpretation.
As I said already, these arguments do not mean that complex identifiers
are not needed. The problem is that using them may well result in an
inconsistent behaviour due to ambiguous interpretation of properties.
Using primitive identifiers is one possible safe way for solving this
problem. It guarantees that you will not have serious problems with your
system especially in a year or two when some other developers start
extending it.
--
http://conceptoriented.com
<weeps/>
.
- Follow-Ups:
- Re: integer id columns for all tables
- From: Bob Badour
- Re: integer id columns for all tables
- References:
- integer id columns for all tables
- From: garhone
- Re: integer id columns for all tables
- From: Alexandr Savinov
- integer id columns for all tables
- Prev by Date: Re: Functional Dependencies > Uniqueness Constraints
- Next by Date: Re: integer id columns for all tables
- Previous by thread: Re: integer id columns for all tables
- Next by thread: Re: integer id columns for all tables
- Index(es):
Relevant Pages
|