Re: SQL, related records (quotes)



Dan Guntermann wrote:
<snip>

I'll have to get Celko's book myself, I guess.

Ideally and abstractly, I think something like this would work:

hierarchies(child_vertex  *vertex*, parent_vertex *vertex*,
                  CONSTRAINT primary_key (child_vertex),
                  CONSTRAINT CHECK (child_vertex <> parent_vertex),
                  CONSTRAINT CHECK ****ANTISYMMETRIC****
                );

A few questions: 1. Did you just forget the foreign key constraint (child_vertex referencing parent_vertex) or did you leave it out for a purpose?

No nulls in my model.

Ok with me. As a consequence we need a model change, maybe more tables. Anyway we are changing the model. (In the OP's question there where quotes _without_ previous quotes)

So which FK are we talking about now?

I don't believe an FK is necessary,

Maybe, maybe not. It depends on the - yet to be defined - new model.

and with some implementations could not be achievable. I could be convinced otherwise, but consider the following:

child    parent
  B         A

how would one insert this tuple with the FK established? What would B reference?

Moreover, the model I was conceptualizing would allow for multiple hierarchies.

Does that matter? Multiple trees are easily mapped to branches of one root node, so whichever is easiest (one or more trees) is easily used for the other - or are you talking about multiple hierarchies over the same nodes? I don't think so.

2. Both the '<>' and the ****ANTISYMMETRIC**** constraint serve to
  establish that 'hierarchies' can just contain hierarchies (trees)
  indeed.
  What is special about the '<>' constraint? Shouldn't it be included
  in the (as yet magical) ****ANTISYMMETRIC**** constraint?

I think Mr. Rybacki begs to differ, but I had the following rationale:

Starting with an empty relation with just a key constraint on 'child', the following would be allowable, but undersireable:

child     parent
  A          A

Thus, to guarantee the nonreflexive property of a hierarchy, the constraint is need to guarantee all cases. Note that this is only possible with a vertex involving itself as a root node of a hierarchy. It is not possible for child vertexes by virtue of the key constraint since they already exist in conjunction with a single parent vertex.

It never stops bugging me how many arbitrary choices one has to make to put trees - or even just order - into tables). It seems you define a root as a node which has itself as a parent, right? That is one, not uncommon, way of dealing with this, but not the only way.


To address the issue of antisymmetry, I should identify the conditions where I think it such a constraint would be pertinent. Direct cycles are not an issue anywhere but for a certain case, and that is in the case where a root can be associated with a child and a child can be the parent of the root.

Is this just a way of restating the root definition?

All other vertices of the hierarchy will be immune to such a condition because each child is defined in terms of its relationship with a parent and, by definition, the number of arrows representing a directed parent-child relationship and thus pointing into child is limited to one (as enforced by the key).

example1: child parent
B A
C A
A B <-- cyclic (symmetric) relationship


example2 B A
C A
D B
B D XXXX <-- can't happen,
because of the key constraint.


Nor can a leaf node or an intermediate branch node reference higher level vertices as children.

As Mr. Rybacki points out, a relation is antisymmetric <=> for all x, y in some domain, if x R y and y R x, then x = y. The antisymmetric constraint will disallow the extensional relation given in previous posts:

child     parent
   3          6
   6          3

Why? Because the DBMS will see that <child: 3, parent 6> exists and that if <child: 6, parent: 3> were allowed to be inserted, the assertion of antisymmetry would not hold because 6 <> 3, and thus the DBMS would reject the insertion on the basis of the ANTISYMMETRIC constraint.


3. Would ****NOLOOP*** (some other as yet magical constraint over
  child_vertex and parent_vertex) suggest the same mechanics
  to you as ****ANTISYMMETRIC**** ?


??? I'm not sure what NOLOOP would do.

Something like this:
It would not allow loops. A loop being a set of vertices, where there exists a vertex A being both child and ancestor of a vertex B (which may, of course be A).


It does not play well with having root as parent of itself, though.

Properties of relations such as symmetry, reflexivity, and transitivity are states of relations that are very well defined.


4. (should have been the first Q to the OP - to late for that I guess:)
  What does one tuple mean / what is the predicate for 'hierarchies'?


Well, in my little encapsulated world, it can represent several types of conceptual objects, or specific aspects of hierarchy, really.
Possibility:
Child Vertex: X of type *vertex* has one and only one Parent Vertex: Y of type *vertex*
AND (informally) Child Vertex: X belongs to only one edge as the child and therefore has only one parent
AND (informally) if the Child Vertex = X, then Parent Vertex <> X
AND (informally) there should never be the case where EXISTS (child_vertex: X, parent_vertex Y) AND (child_vertex: Y, parent_vertex: X)
AND (child vertex: X <> parent_vertex: Y).

This is structure, not meaning.

The table constraints are really meta constraints across multiple tuples to enforce structure over the hierarchy.

Yep.

5. (Proposal, not really a question) Let's isolate the roots
  (and get rid of the NULLs).

  (PS: later I saw that you already did just that)


Yes!!

This ideal but unachievable representation would solve the problem and it enforces:
* a single root vertex for any hierarchy.
* a single parent for any child vertex and allows for multiple children for any parent.
* ensures an nonreflexive property and thus eliminates the self-referential cyclic issue.
* eliminates the possibility of a cycle between a root and single child with the antisymmetric constraint.
* eliminates the possibility for indirect cycles by virtue of the constraint that any given vertex can only have one parent.


But alas, not many implementations allow for assertions of antisymmetry, though it could be done with a trigger.

Let's forget implementation for now and focus on minimal needs.


Sure, why not.


Hmm... the formatting makes this a tough read.


Sorry.


NOT EXISTS ( SELECT 'x'
 FROM roots WHERE roots.root_vertex = current.child_vertex )
no children are root. Ok.


EXISTS ( SELECT 'x' FROM roots WHERE roots.root_vertex = current.parent_vertex)

all parents should be root? To restrictive.


The formatting was tough, I know. I'll rewrite the constraint in its entirety here:


CONSTRAINT EXISTS
(SELECT 'x'
     FROM roots
     WHERE roots.root_vertex = current.parent_vertex
     AND NOT EXISTS
(SELECT 'x'
      FROM roots r2
      WHERE r2.child_index = current.parent_index) );

Basically, the intent is to only allow a tuple to be inserted into the hierarchies relation if the proposed parent vertex value does not already exist as a child and its value has been previously inserted as a value in the roots relation.

The intent is clear and common - I think. That makes it more annoying that it is so difficult to come up with a good set of constraints. I'm puzzled wether the above does what it is supposed to do. Did you test it?

To check we should have something like this:
A length 1 loop:
Values:
   parent_vertex    child_vertex
         3           3
can't be in 'hierarchies', it will be rejected by
'CONSTRAINT CHECK (child_vertex <> parent_vertex)'. Good.

A length 2 loop:
Values:
   parent_vertex    child_vertex
         3           6
         6           3
can't be in 'hierarchies', it will be rejected by
???.

It will be rejected by the candidate key for almost all cases except for the possibility of such a case happening between a root and one of its children.

This complication sounds like a consequence of the (arbitrarily chosen) root definition.

In that case, the ANTISYMMETRIC constraint should ensure that that special case does no occur.


A length 3 loop:
Values:
   parent_vertex    child_vertex
         3           6
         6           7
         7           3
can't be in 'hierarchies', it will be rejected by
???.


Well, I think the special roots relation and the added constraints I mentioned would be helpful in such cases. Admittedly, I overlooked this in my first proposal.


The last constraints, which are DB constraints, of the tables would most likely have to be implemented as a database constraint either with trigger logic or by modeling manipulation within the scope of a transaction boundary from the application side. This overcomes the obstacle that a hierarchy could conceivably be composed of a single root node.

The non-hierarchy hierarchy. Do I care wether it is in or out? Let's take the easy road. Out.

Makes sense. However, one might be neglecting some useful applications (e.g. modeling a B*Tree or something similar).

Ever heard of YAGNI? KISS, so neglect it unless you specifically aim for "modeling a B*Tree or something similar" - are you?
.




Relevant Pages

  • Re: SQL, related records (quotes)
    ... Did you just forget the foreign key constraint ... > establish that 'hierarchies' can just contain hierarchies ... Starting with an empty relation with just a key constraint on 'child', ... vertex involving itself as a root node of a hierarchy. ...
    (comp.databases.theory)
  • Re: SQL, related records (quotes)
    ... Did you just forget the foreign key constraint ... child B to reference for referential integrity and what meaning the ... > for the other - or are you talking about multiple hierarchies over ... >> possible with a vertex involving itself as a root node of a hierarchy. ...
    (comp.databases.theory)
  • Re: SQL, related records (quotes)
    ... Now in this specific situation a check constraint, ... * a single root vertex for any hierarchy. ... any parent. ... of disallowing single node hierarchies. ...
    (comp.databases.theory)
  • Re: Trigger/Foreign Key limitation for real?
    ... You can create an INSTEAD OF trigger on the referenced table, ... The foreign key constraint is declared on the Child ... parent & child. ...
    (microsoft.public.sqlserver.programming)
  • Re: SQL, related records (quotes)
    ... Let's say for the hierarchy itself we decide on ... child parent ... child B to reference for referential integrity and what meaning the ... possible with a vertex involving itself as a root node of a hierarchy. ...
    (comp.databases.theory)