Re: Self Joins and optimization




"Brian Selzer" <brian@xxxxxxxxxxxxxxxxxxx> wrote in message
news:JBe%h.16839$YL5.304@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx

"David Cressey" <cressey73@xxxxxxxxxxx> wrote in message
news:Vg5%h.166$83.81@xxxxxxxxxxx

"Brian Selzer" <brian@xxxxxxxxxxxxxxxxxxx> wrote in message
news:MQ2%h.4456$H_.2052@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx

"David Cressey" <cressey73@xxxxxxxxxxx> wrote in message
news:ZE%_h.163$D52.68@xxxxxxxxxxx

Here's a maybe not-so-simple example:

Given a table with the columns,

EMPLOYEE#, JOB#, STEP#, MACHINE#, TX_TIME, TX_TYPE

where TX_TYPE can be either 'ON' or 'OFF' and the key is the entire
heading,
The table definition:

CREATE TABLE TX
(
EMPLOYEE# INT NOT NULL,
JOB# INT NOT NULL,
STEP# INT NOT NULL,
MACHINE# INT NOT NULL,
TX_TYPE CHAR(3) NOT NULL
CHECK (TX_TYPE = 'ON' OR TX_TYPE = 'OFF'),
TX_TIME DATETIME NOT NULL,
PRIMARY KEY (EMPLOYEE#, JOB#, STEP#, MACHINE#, TX_TYPE, TX_TIME)
)

A little bit of test data:

INSERT TX VALUES (1,1,1,1,'ON', '2007-05-03 08:00:00')
INSERT TX VALUES (1,2,1,1,'ON', '2007-05-03 08:00:00')
INSERT TX VALUES (1,1,1,1,'OFF', '2007-05-03 09:00:00')
INSERT TX VALUES (1,2,1,1,'OFF', '2007-05-03 09:30:00')
INSERT TX VALUES (1,1,1,1,'ON', '2007-05-03 10:00:00')
INSERT TX VALUES (1,1,1,1,'OFF', '2007-05-03 11:00:00')
INSERT TX VALUES (1,1,2,1,'ON', '2007-05-03 11:30:00')
INSERT TX VALUES (1,1,2,1,'OFF', '2007-05-03 12:00:00')
INSERT TX VALUES (1,2,2,1,'ON', '2007-05-03 13:00:00')
INSERT TX VALUES (1,2,2,1,'OFF', '2007-05-03 14:00:00')


Except for droping the pound signs, and not putting in the NOT NULL
constyraints, my TX Table is the same as the above.

Ok, I have another step in producing a set oriented solution to the not so
simple example given.

Ok, I have another step in a comparison between set oriented as procedural
solutions for the labor cost example problem. This time, I created
"segments" as a view off of TX, the table that you (Brian) supplied as the
sample inputs.

Here's what I built:

I added a tiny table to store the translation between 'ON' and 1 and 'OFF'
and -1.

CREATE TABLE TXREF
(TX_TYPE CHAR(3),
TX_OPEN INT);

INSERT TXREF VALUES ('ON', 1);
INSERT TXREF VALUES ('OFF', -1);

and a view that delivers TX_OPEN instead of TX_TYPE.

CREATE VIEW TXR AS
SELECT TX.EMPLOYEE, TX.JOB, TX.STEP, TX.MACHINE, TXREF.TX_OPEN, TX.TX_TIME
FROM TXREF INNER JOIN TX ON TXREF.TX_TYPE = TX.TX_TYPE;

Next, I created the Segments view, that presents the same data as in my
last example, but this time derived from TX (and TXREF).

CREATE VIEW SEGMENTS AS

SELECT
A.EMPLOYEE,
A.JOB,
A.STEP,
A.MACHINE,
B.TX_TIME AS START_TIME,
MIN (C.TX_TIME) AS STOP_TIME,
(SELECT SUM(D.TX_OPEN) FROM TXR AS D WHERE
D.EMPLOYEE = A.EMPLOYEE AND D.TX_TIME <= B.TX_TIME) AS PAIRING

FROM (TXR AS A
INNER JOIN TX AS B
ON (B.EMPLOYEE = A.EMPLOYEE) AND (B.TX_TIME >=
A.TX_TIME))
INNER JOIN TX AS C
ON (C.TX_TIME > B.TX_TIME) AND (C.EMPLOYEE = A.EMPLOYEE)

GROUP BY
A.EMPLOYEE,
A.JOB,
A.STEP,
A.MACHINE,
B.TX_TIME

HAVING SUM (A.TX_OPEN) > 0;


While this passes the test data, it still needs a little work. One of the
INNER JOINs should be an outer join, in order to deal with the case where a
job step is still running when the data is pulled. Then the current time
should be coalesced into the stop time, to replace resulting NULLS that
come out the outer join. The test data has no incomplete job steps, so this
error didn't affect my test run. There could be other bugs.



Then, finally, there's the query that summarizes the labor costs, as
before:

CREATE VIEW LABOR_COSTS AS
SELECT
SEGMENTS.JOB,
SEGMENTS.STEP,
SUM(DATEDIFF("S",START_TIME,STOP_TIME)/3600/PAIRING) AS LABOR
FROM SEGMENTS

GROUP BY SEGMENTS.JOB, SEGMENTS.STEP;

The next step might be to collect enough data to analyze performance, and
run it side by side against a procedural solution. I don't know how the
race is going to go, but with the right indexes, a six way join in Oracle
doesn't waste enough time to bother with further greasing of the code, IMO.

I suppose, if I get down to analyzing performance, it would behoove me to
download a copy of SQL Server Express, so I could have something better than
MS Access to run on. I may just leave this effort where it is. Brian isn't
going to be persuaded in any event, and no one else in the NG has expressed
interest in it.






.



Relevant Pages

  • Re: WHERE clause applies to right-hand table of LEFT JOIN
    ... it's always best to specify INNER JOIN ... T1_PK int NOT NULL ... CONSTRAINT FK_T2_T1 FOREIGN KEY ... The SQL Server cost-based optimizer is very good at generating efficient ...
    (microsoft.public.sqlserver.server)
  • Re: WHERE clause applies to right-hand table of LEFT JOIN
    ... it's always best to specify INNER JOIN ... T1_PK int NOT NULL ... CONSTRAINT FK_T2_T1 FOREIGN KEY ... The SQL Server cost-based optimizer is very good at generating efficient ...
    (microsoft.public.access.queries)
  • Re: adp report does not recognize fields from stored procedure
    ... CREATE PROCEDURE dbo.rp_rptLckrAssgn(@plngAcadYrID int) ... declare @lngClssOfHold int,@lngClssOfGrad int, @lngClssOfID1st int, ... select DgObj.lngDgObjID, @strMsgHumGen as strMsg, ... from tblDgObjStts DgObj inner join ...
    (microsoft.public.access.adp.sqlserver)
  • Need help on a SProc
    ... > @ID_QUESTIONNAIRE INT, ... >IDCYCLE int, ... >inner join t_familles as famille3 on ...
    (microsoft.public.sqlserver.server)
  • RE: How to efficiently create test data?
    ... use bcp,but you need generate your test data before. ... > I want to learn about performance tuning in SQL and therefore need some test ... > declare @i int ...
    (microsoft.public.sqlserver.programming)