Re: Self Joins and optimization
- From: "David Cressey" <cressey73@xxxxxxxxxxx>
- Date: Mon, 14 May 2007 20:31:39 GMT
"Brian Selzer" <brian@xxxxxxxxxxxxxxxxxxx> wrote in message
news:JBe%h.16839$YL5.304@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
"David Cressey" <cressey73@xxxxxxxxxxx> wrote in message
news:Vg5%h.166$83.81@xxxxxxxxxxx
"Brian Selzer" <brian@xxxxxxxxxxxxxxxxxxx> wrote in message
news:MQ2%h.4456$H_.2052@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
"David Cressey" <cressey73@xxxxxxxxxxx> wrote in message
news:ZE%_h.163$D52.68@xxxxxxxxxxx
The table definition:Here's a maybe not-so-simple example:heading,
Given a table with the columns,
EMPLOYEE#, JOB#, STEP#, MACHINE#, TX_TIME, TX_TYPE
where TX_TYPE can be either 'ON' or 'OFF' and the key is the entire
CREATE TABLE TX
(
EMPLOYEE# INT NOT NULL,
JOB# INT NOT NULL,
STEP# INT NOT NULL,
MACHINE# INT NOT NULL,
TX_TYPE CHAR(3) NOT NULL
CHECK (TX_TYPE = 'ON' OR TX_TYPE = 'OFF'),
TX_TIME DATETIME NOT NULL,
PRIMARY KEY (EMPLOYEE#, JOB#, STEP#, MACHINE#, TX_TYPE, TX_TIME)
)
A little bit of test data:
INSERT TX VALUES (1,1,1,1,'ON', '2007-05-03 08:00:00')
INSERT TX VALUES (1,2,1,1,'ON', '2007-05-03 08:00:00')
INSERT TX VALUES (1,1,1,1,'OFF', '2007-05-03 09:00:00')
INSERT TX VALUES (1,2,1,1,'OFF', '2007-05-03 09:30:00')
INSERT TX VALUES (1,1,1,1,'ON', '2007-05-03 10:00:00')
INSERT TX VALUES (1,1,1,1,'OFF', '2007-05-03 11:00:00')
INSERT TX VALUES (1,1,2,1,'ON', '2007-05-03 11:30:00')
INSERT TX VALUES (1,1,2,1,'OFF', '2007-05-03 12:00:00')
INSERT TX VALUES (1,2,2,1,'ON', '2007-05-03 13:00:00')
INSERT TX VALUES (1,2,2,1,'OFF', '2007-05-03 14:00:00')
Except for droping the pound signs, and not putting in the NOT NULL
constyraints, my TX Table is the same as the above.
Ok, I have another step in producing a set oriented solution to the not so
simple example given.
Ok, I have another step in a comparison between set oriented as procedural
solutions for the labor cost example problem. This time, I created
"segments" as a view off of TX, the table that you (Brian) supplied as the
sample inputs.
Here's what I built:
I added a tiny table to store the translation between 'ON' and 1 and 'OFF'
and -1.
CREATE TABLE TXREF
(TX_TYPE CHAR(3),
TX_OPEN INT);
INSERT TXREF VALUES ('ON', 1);
INSERT TXREF VALUES ('OFF', -1);
and a view that delivers TX_OPEN instead of TX_TYPE.
CREATE VIEW TXR AS
SELECT TX.EMPLOYEE, TX.JOB, TX.STEP, TX.MACHINE, TXREF.TX_OPEN, TX.TX_TIME
FROM TXREF INNER JOIN TX ON TXREF.TX_TYPE = TX.TX_TYPE;
Next, I created the Segments view, that presents the same data as in my
last example, but this time derived from TX (and TXREF).
CREATE VIEW SEGMENTS AS
SELECT
A.EMPLOYEE,
A.JOB,
A.STEP,
A.MACHINE,
B.TX_TIME AS START_TIME,
MIN (C.TX_TIME) AS STOP_TIME,
(SELECT SUM(D.TX_OPEN) FROM TXR AS D WHERE
D.EMPLOYEE = A.EMPLOYEE AND D.TX_TIME <= B.TX_TIME) AS PAIRING
FROM (TXR AS A
INNER JOIN TX AS B
ON (B.EMPLOYEE = A.EMPLOYEE) AND (B.TX_TIME >=
A.TX_TIME))
INNER JOIN TX AS C
ON (C.TX_TIME > B.TX_TIME) AND (C.EMPLOYEE = A.EMPLOYEE)
GROUP BY
A.EMPLOYEE,
A.JOB,
A.STEP,
A.MACHINE,
B.TX_TIME
HAVING SUM (A.TX_OPEN) > 0;
While this passes the test data, it still needs a little work. One of the
INNER JOINs should be an outer join, in order to deal with the case where a
job step is still running when the data is pulled. Then the current time
should be coalesced into the stop time, to replace resulting NULLS that
come out the outer join. The test data has no incomplete job steps, so this
error didn't affect my test run. There could be other bugs.
Then, finally, there's the query that summarizes the labor costs, as
before:
CREATE VIEW LABOR_COSTS AS
SELECT
SEGMENTS.JOB,
SEGMENTS.STEP,
SUM(DATEDIFF("S",START_TIME,STOP_TIME)/3600/PAIRING) AS LABOR
FROM SEGMENTS
GROUP BY SEGMENTS.JOB, SEGMENTS.STEP;
The next step might be to collect enough data to analyze performance, and
run it side by side against a procedural solution. I don't know how the
race is going to go, but with the right indexes, a six way join in Oracle
doesn't waste enough time to bother with further greasing of the code, IMO.
I suppose, if I get down to analyzing performance, it would behoove me to
download a copy of SQL Server Express, so I could have something better than
MS Access to run on. I may just leave this effort where it is. Brian isn't
going to be persuaded in any event, and no one else in the NG has expressed
interest in it.
.
- References:
- Self Joins and optimization
- From: David Cressey
- Re: Self Joins and optimization
- From: Brian Selzer
- Re: Self Joins and optimization
- From: David Cressey
- Re: Self Joins and optimization
- From: Brian Selzer
- Self Joins and optimization
- Prev by Date: Re: set builder notation for relational division
- Next by Date: Re: Self Joins and optimization
- Previous by thread: Re: Self Joins and optimization
- Next by thread: Re: Self Joins and optimization
- Index(es):
Relevant Pages
|