Re: 3vl 2vl and NULL



On 15 Dec 2005 14:21:18 -0800, dawn wrote:

(snip)
>> This definitely improves the quality of how your 2VL logic system
>> handles missing data. But I still don't like it. :-)
>
>If you worked with it for a bit or if you had not worked with a 3VL at
>all, you would likely be happier with it.

Hi Dawn,

Yes, of course. If I had never known about 3VL, I wouldn't miss the
extra control it gives. You can't miss what you're not aware of.

>
>> And you have already given the reason.
>>
>> > Now, perhaps Uncle
>> >Vernon is 110, clearly older than Aunt Marge, but that fact has not yet
>> >been recorded since people were pouring over medical and legal records
>> >to verify this first. He will then incorrectly be identified as
>> >someone whose age (in this database) is less than or equal to Marge's
>> >age. When the data are not correct in the system, the results of the
>> >queries are not correct either.
>>
>> But in this case, the data was not incorrect.
>
>Yes, it was incorrect. In order to simply, we could say that we
>modeled the proposition:
>
>Uncle Henry's age is unknown
>
>with the database data of
>
>Uncle Henry has no age

No, Dawn, we didn't model either of those propositions. We only modeled
a proposition that looks like <person> is <age> years old, and omitted
to include a fact for Uncle Henry in the data collection.

Think of the DB components as persons performing a task. The task is
(for example) to calculate family member's ages in decades.
For Aunt Marge, the person asks the data keeper for her age. He gets a
reply, divides by ten, then informs his manager about the age of Aunt
Marge.
For Uncle Henry, the person again asks the data keeper for his age. He
doesn't get any reply (signaled, in the DB, but NULL. Of course, that
only means that he can't give a reply to his manager either (again,
signaled by NULL in the DB).
Real people would still communicate the fact that they can't give an
answer, and start blaming others. But this fictional DB component is
constrained to only return an integer to his manager, and this integer
should be an age in decades. He can't return this - ergo silence in the
people analogy, or NULL in the DB.

Now, when the question changes from "age in decades" to "is (s)he older
than 57?", the handling of missing data changes as well. Instead of not
replying to the question, the person or DB component now replies with
UNKNOWN. This does not signal that Uncle Henry's age is unknown; it only
signals that it is unknown if the question should be answered with
"true" or "false".

One might argue that it would be more consistant to return NULL on a
logical test that involves missing data as well. And though there's
certainly a lot to be said for that POV (for a theorist), the use of
UNKNOWN is much better for practionists like me. After all: NULLS should
always propagate; UNKNOWN don't propagate. FALSE AND UNKNOWN can be
FALSE, but FALSE AND NULL should be NULL.

Returning NULL instead of UNKNOWN if a logical test involves missing
data would probably yield a very unworkable system. The inclusion if
UNKNOWN makes for a great system that usually returns what everybody
expects, sometimes surprises newbies, and give the experienced coder a
great amount of control.

>
>One could say this flies in the face of a database being a set of
>facts. However, simplifications are part of modeling and this little
>white lie is a way to simplify the data processing functions and human
>understanding thereof. We are still modeling an accurate statement.
>
>> Since there's no way for a
>> database to check if data is correct or not, I can live with the
>> "garbage in garbage out" principle.
>>
>> But this case is different. The data was missing. And instead of
>> treating this data as being missing, the DB simply assumed some value
>> (apparently a zero in this case) and reported on it as if that was the
>> value I entered on the form yesterday. This would be the "nothing in
>> garbage out" principle.
>
>I understand your argument. Were I not a practioner, I would likely
>agree with your reasoning and dismiss a 2VL as over-simplified for
>reality.

This surprises me. Surely, as a practioner you would be interested in
the difference between Aunt Marge being 27 years older than Billy, or
Aunt Marge being 27 years old and Uncle Henry's age not beinig supplied?

(snip)
>> If the answer to both these questions is yes, you can now say "Hi" to
>> someone who has worked extensively with both and who would definitely
>> say otherwise. ;-)
>
>OK, how about if I insert words like "database access language" in
>there?

According to what definition of "database"? I've worked with IMS/DB,
VSAM and flat files. I consider the first two to be databases.

> I'll also ask you if you are sure you prefer the 3VL of SQL to
>the 2VL of your programming language(s). Which language(s) are you
>using? Do you think they would be better if "enhanced" with a 3VL?

To start with the second question: I am now using Transact-SQL. Before I
moved to relational databases, I used mainly PL/1, but I've also had my
share of Cobol.

The other questions are harder to answer. Using a third generation
language means that you have to iterate over a collection of data and
process it one record at a time. If code is structured that way, it's
very easy to use IF THEN ELSE constrctions to add special-casing for
handling missing data. SQL changes the way code is written - instead of
procedural record-oriented code, I am now writing declarative
set-oriented code. The structure of that code precludes the use of
complicated IF THEN ELSE constructions for special-casing the missing
values.

I think that many of my queries would become longer and more complicated
if SQL had no 3VL. But that was not what you're asking. And I honestly
don't know if my PL/1 programs would have changed if PL/1 had offered
3VL support. At that time, I didn't know zilch about 3VL, so I never
missed it. And now, it's quite hard to remember exactly what programs I
wrote 10 years ago and how I would have changed them if I could have
used 3VL.

Best, Hugo
--

(Remove _NO_ and _SPAM_ to get my e-mail address)
.



Relevant Pages

  • Re: Why does 1:3 relationsihp require another table?
    ... English word for missing is vague. ... When someone leaves off their age on an ... and when the insurance application asks me when ... However, one of these is missing because of pride, so it is unknown. ...
    (microsoft.public.sqlserver.programming)
  • Re: So whats null then if its not nothing?
    ... The IBM UniVerse programmers manual has this to say about the NULL ... Unknown Data: The Null Value ... Do not confuse the null value with the empty string. ... SORT EMPLOYEE FIRSTNAME SEX AGE ...
    (comp.databases.theory)
  • Re: 3vl 2vl and NULL
    ... >>Dot as everyone else, including Uncle Vernon in this 2VL scenario, has ... >>If the data are changed so that Aunt Marge also has a null set age, ... >>someone whose age (in this database) is less than or equal to Marge's ...
    (comp.databases.theory)
  • Re: So whats null then if its not nothing?
    ... But not different to something with the value '' (aka empty string). ... Both Null and Unknown are important issues to understand when using SQL. ... What can you tell me about Joyce's age? ... such as Age - since a blank cell clearly ...
    (comp.databases.theory)
  • How to move data in Column A into seperate columns / rows
    ... I have just started a book website, part of the site is a database. ... BI paperback ... SR FIELD GUIDES ... DE It was an age of Empire, an age of contrast, and an age of dramatic ...
    (microsoft.public.excel.programming)