Re: TRUE and FALSE values in the relational lattice



On 21 jun, 16:12, Marshall <marshall.spi...@xxxxxxxxx> wrote:
On Jun 21, 3:14 am, Jan Hidders <hidd...@xxxxxxxxx> wrote:



On 19 jun, 22:32, Marshall <marshall.spi...@xxxxxxxxx> wrote:

I am thinking about formalizing the lattice logic. That would
involve a formal statement of the axioms and the rules of
inference. We would then want a soundness proof, and
a proof of consistency. We would also want to understand
how complete such a logic would be, and how strong it
was relative to FOL. And how decidable. My hope is
that it is decidable even if that means it is weaker than
FOL. We also have to think about what else besides
& and | we're putting in to it. Clearly 00 is a strong
candidate. I'm currently thinking that the negation
operator I defined a week or two ago is *not* a good idea.
The scary one is division/group by.

I would assume the following operations: (in yet another notation)
- E * E : natural join
- E + E : generalized union
- {()} : the relation with empty header and a single empty tuple
- [A,B,..D] : the empty relation with header {A, B, ..., D} (possibly
the empty set)
- A=B : the equality relation with header {A, B}
- A=c : the relation with header {A} and tuple (A:c)
- R : a relation with name R (and known header)

That would be roughly equivalent with unions of conjunctive queries,
for which equivalence is known to be decidable.

Hmm. But without generalized distributivity, we can't always
reduce an expression to conjunctive normal form nor
disjunctive normal form.

Correct, but that is a matter of finding enough algebraic identities.
I'm fairly certain I can come up with a set of identities that allows
you to rewrite to a normal form r_1 + ... + r_n with the r_i's in the
following syntax:

E ::= (E + [H]) | (E * E) | [H] | (A=B) | (A=c) | R

where H is a set of attribute names, A and B are attribute names, c a
constant and R a relation name. Note that you need the E+[H] to do the
projection, so you cannot remove all +'s.

Adding the set difference or division
would make it relationally complete (so, FOL)
and make this undecidable.

I would be interested to have the axioms for division
especially.

Me too. :-) Finding a simple finite algebraic complete axiomatization
for FOL is an open problem.

Anyone interested in thinking about a paper on that?

Yes.

Ok. But if I'm going to invest time in this I want this to be a
focused effort. The goal has to be to find a complete axiomatization,
and nothing else. Otherwise I'm afraid it will cost too much time and
not lead to any usable (= publishable in real database journal)
results. I hope you will forgive me for being a bit bossy.

Right, so let's starts with a small hors d'ouvre. Let's take a very
very small and trival part of the algebra and see what we can achieve.
I suggest:

E ::= (E * E) | R

So we can only do natural joins between given relations. What do we
need? The obvious ones:

(1) r * r = r
(2) r * s = s * r
(3) r * (s * t) = (r * s) * t

This is complete since two expressions are equivalent if the join the
same set of relations, and that is exactly what these rules allow you
to show.

Ok. In the next step we add the empty relations [H] with H a possibly
empty set of attributes. So the syntax becomes:

E ::= (E * E) | R | [H]

The extra rules:

(4) [H] * [S] = [H + S] (I'm overloading * and + as intersection
and union of sets of attributes)
(5) [H] * R = [H + S] where S is the header of R

Is this sufficient? For the the case [H] did not appear we were
already complete. If it does appear then the result will be empty, so
equivalence only depends upon the header of the result. In this case
you can use (2) and (3) to rewrite it to a form [H_1] * .. * [H_n] *
R_1 * .. * R_m. With (4) and (5) you can then rewrite it to the form
[H] which indeed gives you the header of the result.

We proceed with the {()} so the syntax becomes:

E ::= (E * E) | R | [H] | {()}

The extra rules needed:

(6) r * {()} = r

This seems sufficient. Proof proceeds similar as in the previous
addition.

What next? Let's take (A=B). So the syntax becomes:

E ::= (E * E) | R | [H] | {()} | (A=B)

What this adds is that we can we identify certain sets of attributes
in the join result that all have to be the same. With the rules we
already have we can put all (A=B)'s next to eachother. So we need
rules to make sure we can write the same sets to the same form:

(7) (A=B) = (B=A)
(8) (A=B)*(B=C) = (A=C)*(C=B)
(9) (A=B)*(B=B) = (A=B)

Is this complete? You can think of the pairs as edges in an indirected
graph, and the rules (7) and (8) are sufficient to allow you to order
the nodes in the graph alphabetically if the graph is connected.
Moreover, (9) allows you to remove duplicates, so sets of pairs that
are connected and describe the same set of attributes can be rewritten
to exactly the same form.

Ok. Next would be (A=c) but you can simulate that with a special
relation C with header {A}. So all that remains is the + operator, but
this is of course the most difficult one, not least because it
combines the set union and the projection. A careful approach might be
to start with allowing only projection, i.e., expressions of the form E
+[H].

That's it sofar. Let me know if you can follow this.

-- Jan Hidders

.



Relevant Pages

  • Re: completeness of the relational lattice
    ... It is in union normal form, ... may be always empty, the result is wide and the ... a relation with name R (and known header) ... E * E: natural join ...
    (comp.databases.theory)
  • HELP! how do i read in multiline data blocks?
    ... fgetl, etc., and trying out various things, but I am not succeeding. ... ONE HEADER LINE ... ONE EMPTY LINE ... I know I want a loop that goes through one datablock at a time, ...
    (comp.soft-sys.matlab)
  • Re: Reading compressed files
    ... (especially unix compress). ... HEADER BODY and FOOTER by open function. ... What you need to do is research each syntax and ...
    (comp.lang.python)
  • Re: No exceptions?
    ... I had in mind that an "empty" header would be assumed. ... I'm still not sure whether relations called TRUE and FALSE would cause confusion with the REAL TRUE and FALSE.) ... with the confidence that your expressions are still correct. ...
    (comp.databases.theory)
  • Re: attribute name conflicts
    ... Regarding "capacity", I think I'd prefer for an app to use the three ... would be that I find it helpful to have as much transparency as possible but a technical reason might be that then I wouldn't need to deal with "exceptions" (I presume that an expression like "A JOIN B" where A and B use the "capacity" attribute name for different types would, in the purest sense, be considered not well-formed, ie., it would not be strictly logical to give a result such as "empty" or "false" for such an expresssion.) ... Whether such a join would cause an exception is a matter of applied psychology and not theory. ... First, assume the header of relation A is and that of relation B is the same. ...
    (comp.databases.theory)