Re: NULLs
- From: Hugo Kornelis <hugo@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
- Date: Tue, 08 Jan 2008 00:56:43 +0100
On Fri, 4 Jan 2008 08:07:50 -0000, Roy Hann wrote:
(snip)
This is the most sensible explanation I've seen for why people do this,
Hi Roy,
Thanks! :)
but
it still doesn't work for me. In the first place, you--like most of the
other posters on this thread--seem to assume that a null is mostly used to
indicate when a desired fact certainly exists but is unknown.
No. I assume that a NULL is always (not mostly) used to indicate that a
fact that could have been stored is missing. It does not imply that the
fact is desired, nor that it certainly exists, nor that it is unknown.
Or rather, I am trying to put across that a NULL *should* always be used
like that - I am fullly aware that this is not how NULLs are mostly used
in the real world :)
Experience
suggests this almost never happens in the real world. I work with dozens of
databases at different sites every year, and the vast majority (all, for
practical purposes) use nullable attributes mainly (almost exclusively) to
permit multiple distinct fact types to be confused in one table.
That is indeed a common use of NULL. And though I agree that it is far
too common, I'd wager that I'm not as vehemently opposed to this as you
are.
When I teach modeling, I try to give the students a feeling for where to
draw the line between "required" normalization and "overdone"
normalization. Using Dutch addresses as an example, one might argue that
city is functionally dependant on postal code. As is, in fact, street
name (Dutch postal codes form a very fine grid - most streets span two
postal codes [one for odd, one for even house numbers], longer streets
have even more). And one might then go on and argue that the area code
of the phone number is in turn functionally dependant on the city. But,
in the vast majority of applications, this is not useful at all. Most
applications just want to print an address on a form, stick a stamp on
it and be done. And most applications don't even care about the area
code and local number as seperate attributes - they just need to be able
to punch a number to contact a customer. So I consider a table like
Persons (SSN (PK), Street, HouseNo, PostalCode, City, PhoneNo)
to be adequately normalized - even when purists would prefer a design
such as
Persons (SSN (PK), PostalCode (FK), HouseNo, LocalPhoneNo)
PostalCodes (PostalCode (PK), Street, City (FK))
Cities (City (PK), PhoneAreaCode)
Back from a digression - I've also seen situations where there were
mutually exclusive attributes, but I saw absolutely no harm in keeping
them in the same table, with a proper constraint to prevent them both
being populated. This should not be a common practice, but there are
cases where it's good enough.
It appears
to me that most practicing database designers/business analysts prefer to
avoid even attempting to collect information that is not guartanteed to be
known.
Definitely true. Though I'm not always sure if the designer is to blame,
or the end user (or decision maker) refusing to commit to ensure that
all data is collected at data entry time.
Nor do I buy the maintainability argument.
I *think* that this is partly historic (some decades ago, the maximum
number of tables in a database was much lower than at present), partly
perceived (imagine opening up some management tool and being presented
with a list of thousands of tables...), and partly what people have
become accustomed to.
Best, Hugo
.
- Prev by Date: Re: NULLs
- Next by Date: Re: NULLs
- Previous by thread: Re: NULLs
- Next by thread: Re: NULLs
- Index(es):
Relevant Pages
|