Re: Storing codes vs. human-intelligible values - best practice?



On Jun 28, 9:36 am, dana...@xxxxxxxxx wrote:
In a pre-existing project, many tables reference domain (lookup)
tables of the general structure:
#   COL_NM (column name) -- a primary key containing a human-
intelligible value; max varchar length varies by table, but typically
no longer than 30 characters
*    COL_CD (column code)  -- a not null field with a unique
constraint, typically an integer with a length of 1-2 digits.
o   COL_DES (column description) -- an optional description field;
could be up to 255 chars

The child tables referencing these domain tables use the COL_NM human
intelligible values rather than the less storage intensive COL_CD
(since it is unique and not null, it could also be used by a
referencing child table).

In a new project, it's been suggested to me that I continue to store
COL_NM values rather than COL_CD values in the child tables with these
rationales:
-- It's better to use COL_NM because, for anyone directly querying the
child tables, COL_NM means something to a human. COL_CD doesn't
(unless someone has a really good memory).
-- "Oracle can handle it"

However:
-- The new child tables will have orders of magnitude more rows than
the child tables in the old database; so, although "Oracle can handle
it", storage may be an issue, etc.
-- I've always thought it was a best practice to store short codes
rather than longer human-intelligible values.
-- As far as I can tell, few if any users will be querying the child
tables directly; and if they were, they could join to the domain table
to get the human intelligible values

Wondering if anyone with more experience than me (most people reading
this group, probably) have any opinions on the best way to proceed.
What other factors do I need to consider in making a choice?

Thanks.

Dana

(A few project constraints that may or may not be relevant to my
question (dictated to me, so they're non-negotiable) are:
-- Can't use check constraints (and wouldn't want to; easier to update
values in a domain table than change a constraint)
-- Can't use Oracle product-specific domains; we're typically asked
not to use vendor-specific features))

If the column in question will appear in one and only one table then
storing human readable values is probably safe as well as practical as
long as the values are relatively short and the total space
requirements on the object really does not affect performance against
the object. You can use an IOT to hold the valid list of values and
use a FK to ensure data integrity. Everybody should be happy.

On the other hand if three or more table will reference this column
you definitely want to use a lookup table. You also probably want to
use a lookup table whenever the length of the lookup values is
lengthy. I will admit lengthy is a relative term. I think it depends
on how many columns that fall into this problem classification exist
in the design, how many rows each table has, and the percentage of the
table data the column(s) will make up verse how many joins you will
need if you do not store the data.

Customers will not be too happy if to join Table-A to Table-B in their
adhoc query he or she also has to join 5 additional tables to Table-B
for lookup data. A view on Table-B that includes the lookup might be
the solution here but you will also have to consider the cost of the 5
lookup joins.

Where only a few legal values exist column constraints can be used to
enforce data integrity and views with case statements can be used to
display the human meaningful value.

I am not sure of the correct name or exact usage of the feature yet
but with 11g the calculated column feature might be of use here where
only a few values exit.

In conclusion you best decision is going to depend on the data and how
it is referenced and updated. In many cases carrying the actual data
values in your tables rather than a reference value is probaly the
most practical design otherwise to join 4 application base tables
could easily end up requiring writing 12 table join due to the need to
lookup reference data.

HTH -- Mark D Powell --
.



Relevant Pages

  • Re: temporal data constraints
    ... That seems like a great solution, but the problem is constraints. ... A child record can be created only if its parent record already ... A child lifespan must be contained within the parent lifetime ...
    (comp.databases.theory)
  • Re: A real world example
    ... If a constraint is defined in terms of successive states of a database, ... Yes, that's all facts are, though constraints mean that your facts are ... value of a candidate key determines the values of all other attributes, ... The frame of reference for a candidate key is a ...
    (comp.databases.theory)
  • Re: circular relationships ok?
    ... So you DO allow constraints that are cyclical. ... Concept graph visualizes what the database will manage and nothing else. ... if A reference B then we draw an upward arrow from A to B. ... may have any structure including cycles. ...
    (comp.databases.theory)
  • Re: How do you set a value to Null in a DataSet when its normally set to another table through a rel
    ... If it doesn't reference the other table at all it's an Orphan which is what ... You can try disabling the constraints but I ...
    (microsoft.public.dotnet.languages.vb)
  • Re: circular relationships ok?
    ... refer to constraints and tokens, a net refers to constraints ... If so then Net is positioned above Constant and any constant object can reference one object. ...
    (comp.databases.theory)