Re: untypical problem



"dgront" <dgront@xxxxxxxxxxxxxx> wrote in message
news:1121536482.214930.39800@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
> That problem comes from biosciences, here I present a simpler version.
>
> The story is about beads.
>
> * A single necklace is composed of a couple of strings with beads,
> appx. 3-10 strings in one necklace
> * A single string of beads is composed of several hundreds of beads of
> different shapes and colours
> * Beads in each string are ordered, strings in a given necklace are
> ordered
> * Beads can have many properties, eg. colour spots (number of the spots
> and their colours can vary)
>
> I have a SQL database for all my 20 000 neckaces, containing about 100
> 000 strings and 30 000 000 beads. My common queries are:
> - which necklace have at least 1 string with no more than 200 yellow
> beads
> - which beads are green and cubic-like and have at least 5 brown spots
>
> And here is the QUESTION:
> Is it possible to ask my database a question concerning the order of
> beads in a string? For example I'd like to ask about any combination
> like regular expression:
> RBY..R*RRR
> which means redbead, then blue bead, then yellow, then any two beads
> (meta sign >.<), then red again, then any number of any beads followed
> by three red beads. Of course I need to use anny properties stored in
> the database, not only the colours of beads (shape, number of spots,
> colours of spots, etc.).
>
> I can rebulit my database from scratch and adjust its structure, I am
> just asking if it is possible with SQL
>
> Thanks in advance
> Dominik

This is completely doable in SQL, however, efficiency, both of man and
machine, could be an issue. Remember that a table represents a relation
which is a set and, hence, an unordered collection, that is, there's no meaning
to asking for the first row or next row. The data you're representing is ordered
and the more interesting queries you wish to pose, ones that can be conceived
as regular expressions, very much use that order. That being said, you might find
this interesting if not helpful (offered untested so caveat lector).

-- Colors beads can have
CREATE TABLE BeadColors
(
bead_color VARCHAR(20) NOT NULL PRIMARY KEY
)

-- Shapes beads can have
CREATE TABLE BeadShapes
(
bead_shape VARCHAR(20) NOT NULL PRIMARY KEY
)

CREATE TABLE Beads
(
bead INT NOT NULL PRIMARY KEY,
bead_shape VARCHAR(20) NOT NULL REFERENCES BeadShapes (bead_shape),
bead_color VARCHAR(20) NOT NULL REFERENCES BeadColors (bead_color)
)

CREATE TABLE BeadSpotColors
(
bead_spot_color VARCHAR(20) NOT NULL PRIMARY KEY
)

-- Assumption: a single bead can have spots of an arbitrary number of
-- colors, including 0
CREATE TABLE BeadSpots
(
bead INT NOT NULL REFERENCES Beads (bead),
bead_spot_tally INT NOT NULL CHECK (bead_spot_tally >= 1),
bead_spot_color VARCHAR(20) NOT NULL
REFERENCES BeadSpotColors (bead_spot_color),
PRIMARY KEY (bead, bead_spot_color)
)

CREATE TABLE BeadStrings
(
bead_string INT NOT NULL PRIMARY KEY
)

-- The ordered beads of each string of beads
-- Assumption: The beads in a string are numbered consecutively from 1
CREATE TABLE BeadStringBeads
(
bead_string INT NOT NULL REFERENCES BeadStrings (bead_string),
bead INT NOT NULL REFERENCES Beads (bead),
bead_nbr INT NOT NULL CHECK (bead_nbr >= 1),
PRIMARY KEY (bead_string, bead_nbr)
)

CREATE TABLE Necklaces
(
necklace INT NOT NULL PRIMARY KEY
)

-- The ordered strings of beads of each necklace
-- Assumption: The strings of beads in a necklace are numbered
-- consecutively from 1
CREATE TABLE NecklaceBeadStrings
(
necklace INT NOT NULL REFERENCES Necklaces (necklace),
bead_string INT NOT NULL REFERENCES BeadStrings (bead_string),
bead_string_nbr INT NOT NULL CHECK (bead_string_nbr >= 1),
PRIMARY KEY (necklace, bead_string_nbr)
)

-- Necklaces that have at least 1 string with <= 200 yellow beads
SELECT necklace
FROM Necklaces AS N
WHERE EXISTS (SELECT NS.bead_string
FROM NecklaceBeadStrings AS NS
INNER JOIN
BeadStringBeads AS SB
ON NS.necklace = N.necklace AND
NS.bead_string = SB.bead_string
INNER JOIN
Beads AS B
ON SB.bead = B.bead AND
B.bead_color = 'yellow'
GROUP BY NS.bead_string
HAVING COUNT(*) <= 200)

-- Beads that are green, cubic and have >= 5 brown spots
SELECT B.bead
FROM Beads AS B
INNER JOIN
BeadSpots AS BS
ON B.bead_color = 'green' AND
B.bead_shape = 'cubic' AND
BS.bead = B.bead AND
BS.bead_spot_tally >= 5 AND
BS.bead_spot_color = 'brown'

-- Bead strings that contain the bead color substring RBY..R*RRR
-- One could certainly write code to automatically generate the
-- equivalent SQL from a regular expression style of specification
-- as doing it manually is lengthy and error prone
SELECT bead_string
FROM BeadStrings AS S
WHERE EXISTS
(SELECT *
FROM BeadStringBeads AS SB1
INNER JOIN
Beads AS B1
ON SB1.bead_string = S.bead_string AND
B1.bead = SB1.bead AND
B1.bead_color = 'red' -- a red bead
INNER JOIN
BeadStringBeads AS SB2
ON SB2.bead_string = SB1.bead_string AND
SB2.bead_nbr = SB1.bead_nbr + 1 -- next bead on string
INNER JOIN
Beads AS B2
ON B2.bead = SB2.bead AND
B2.bead_color = 'blue' -- a blue bead
INNER JOIN
BeadStringBeads AS SB3
ON SB3.bead_string = SB2.bead_string AND
SB3.bead_nbr = SB2.bead_nbr + 1 -- next bead
INNER JOIN
Beads AS B3
ON B3.bead = SB3.bead AND
B3.bead_color = 'yellow' -- a yellow bead
INNER JOIN
BeadStringBeads AS SB4
ON SB4.bead_string = SB3.bead_string AND
-- skip next 2 beads whose color are irrelevant
SB4.bead_nbr = SB3.bead_nbr + 3
INNER JOIN
Beads AS B4
ON B4.bead = SB4.bead AND
-- next bead is red
B4.bead_color = 'red' AND
-- need to find 3 consecutive red beads at some
-- subsequent point on string
EXISTS
(SELECT SB5.bead_nbr
FROM BeadStringBeads AS SB5
INNER JOIN
BeadStringBeads AS SB6
ON SB5.bead_string = SB4.bead_string AND
SB5.bead_nbr > SB4.bead_nbr AND
SB6.bead_string = SB5.bead_string AND
SB6.bead_nbr BETWEEN
SB5.bead_nbr AND
SB5.bead_nbr + 2
INNER JOIN
Beads AS B6
ON B6.bead = SB6.bead AND
B6.bead_color = 'red'
GROUP BY SB5.bead_nbr
HAVING COUNT(*) = 3))

As you can appreciate, for complicated "regular expressions", this could be
tiresome to write manually and performance may not be blazing though I'd be
interested in hearing your experiences should you experiment with this approach
on your actual data.

--
JAG


.



Relevant Pages

  • Re: OT/jewelry
    ... really enjoy doing earrings, I've found). ... the bead & make it into a link, so that the necklace looks a bit like ... The only thing I don't care for is the clasp. ... The puffy diamond beads are pretty. ...
    (alt.true-crime)
  • Re: problem with a necklace sequence
    ... I believe it means that if you take any necklace, reverse its order, ... So the 6 necklaces with 6 beads are ... "Aperiodic" apparently means that the n-bead sequence may not consist ...
    (sci.math)
  • Re: AD: Added some amethyst and green seed beads...
    ... Hee hee, that's true! ... they'd make a nice necklace or bracelet. ... These are a nice Christmas green, my other green seed beads ... I'm on a total amethyst kick now - back in summer I was stuck in the ...
    (rec.crafts.beads)
  • Re: Pricing question and critique needed...
    ... to break cable last night (I might start smashing beads this weekend to ... I made a necklace of Tigereye cubes and bali for my daughter and it came back broken. ... When I first started making necklaces I obsessed about them breaking and used clamshell bead tips and knotted the ends before and after, on the assumption that the link part of the clamshell would be the weakest link and the breaking point. ...
    (rec.crafts.beads)
  • Re: Beading help, please (not OT)
    ... Most of what they discuss is attaching strings of beads -different kinds ... enough for the needle. ...
    (rec.crafts.textiles.quilting)