Re: SQL: Working with huge tables of chronological data
- From: "charely" <nospam@xxxxxxxxx>
- Date: Sun, 8 Apr 2007 10:08:35 +0200
"John" <acide.ascorbique@xxxxxxxxx> schreef in bericht
news:1175788930.059725.19220@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Thanks for your answer,
Here are the technical details and the query I've been using so far.
TableA is ~100 millions row and contains (timestamp, evtA)
TableB is ~30 millions row and contains (timestamp, evtB)
The following query took ~60h (on a private but quite slow server) to
compute. ~1h is what I'm aiming to.
select TA1_B.evtA, TA2.evtA
from
(
select TA1.evtA, TA1.timestamp timeA1, TB.evtB, min(TB.timestamp)
min_timeB
from tableA TA1 left outer join tableB TB on (TA1.timestamp <
TB.timestamp)
group by TA1.evtA, TA1.timestamp, TB.evtB
) TA1_B,
tableA TA2
where
TA1_B.timeA1 < TA2.timestamp
and (TA2.timestamp < TA1_B.min_timeB or TA1_B.min_timeB is null)
and TA1_B.evtA <> TA2.evtA;
Thanks!
John
On Apr 5, 10:51 am, "Ana C. Dent" <anaced...@xxxxxxxxxxx> wrote:
"John" <acide.ascorbi...@xxxxxxxxx> wrote in
news:1175783243.167337.195580
@n59g2000hsh.googlegroups.com:
Hi,
I'm trying to figure out an efficient way to search for the non
existence of events in chronological data with SQL. The goal (detailed
below) seems pretty simple but so far it looks like it's tricky to do
with Oracle. Here's my problem:
I'm working with 2 simple but huge tables each recording a different
kind of event associated with a timestamp. For instance:
Table A
(03:50pm, A1)
(03:55pm, A2)
(03:58pm, A3)
Table B
(03:51pm, B1)
(04:00pm, B2)
I'm looking for all the chronological sequences (Ax, Ay) where no B
event is present between Ax and Ay. In this example, the result would
I do not understand the logic that would lead the answer above;
since none of the timestamps in Table A match any timestamp in Table B.
be (A2, A3).
I've been searching actively for an efficient solution for this
problem and I couldn't find any fast enough. Do you have any idea?
How fast is fast enough?
How do we know what you tried & deemed unacceptable?- Hide quoted text -
- Show quoted text -
What about
select ta1.timestamp , ta2.timestamp from tablea ta1 , tablea ta2
where ta2.timestamp > ta1.timestamp and ta2.timestamp <=
nvl((select min(timestamp) from tableb b where b.timestamp >
ta1.timestamp) ,
(select max(timestamp) from tablea))
The nvl function is only needed to also catch events where no later event
in b exist.
I have not tested this for performance , but assuming you have indexes on
the timestamp
columns ( or using IOTs for the tables) , the optimizer will probably use
range scans on
those indexes
.
- Follow-Ups:
- Re: SQL: Working with huge tables of chronological data
- From: Charles Hooper
- Re: SQL: Working with huge tables of chronological data
- References:
- SQL: Working with huge tables of chronological data
- From: John
- Re: SQL: Working with huge tables of chronological data
- From: Ana C. Dent
- Re: SQL: Working with huge tables of chronological data
- From: John
- SQL: Working with huge tables of chronological data
- Prev by Date: Re: Configuration question
- Next by Date: Re: SELECT statement efficiency question
- Previous by thread: Re: SQL: Working with huge tables of chronological data
- Next by thread: Re: SQL: Working with huge tables of chronological data
- Index(es):
Relevant Pages
|