Re: Suggestions on how to efficiently search a large table
- From: bdbafh <bdbafh@xxxxxxxxx>
- Date: Thu, 15 Nov 2007 09:38:15 -0800 (PST)
On Nov 15, 9:35 am, chrism...@xxxxxxxxx wrote:
On Nov 15, 7:42 am, bdbafh <bdb...@xxxxxxxxx> wrote:
On Nov 14, 8:55 pm, chrism...@xxxxxxxxx wrote:
I have a billion+ row table that I need to do queries on.
The table looks something like this:
item
id number
process1_id number
process2_id number
process3_id char(35)
I need to perform multiple processes on each item. When I process an
item, I then set the corresponding column with the result of the
process.
I need to do queries like the following to determine which processes I
haven't completed yet:
select * from item where process1_id is null or process2_id is null
or process3_id is null
Any recommendations on how to index or query such a table to get good
performance? The cardinality of all the process*_id columns is very
high.
Thanks.
You've provided no information regarding:10g
- version of Oracle software
- edition of Oracle software10.2.0.3
- how many client processes will be hammering the table1
- how the table is being loaded
loaded from a comma delimited file exported from a SQL Server
database> - hardware, OS, clustering, RAC, server color
LS41, SE Linux, OCSF, RAC> - desired response time
< 2 seconds
Also, I should have added that I will be limiting the query with a
rownum < 1000 in the WHERE clause.
Ok.
The subject of this post is overly constrained.
Your issue is not just in searching the table for rows of data to be
processed.
It also includes processing rows and updating rows that have been
processed.
You mention that only one instance of a program will be processing
rows that need to be processed ... but how much do you want to bet
that somewhere down the road, additional resources will be devoted to
processing the rows of interest in order to reduce the overall
processing time?
A design that works just fine for a single process will likely hit
concurrency issues once it begins to be multiuser. Partitioning the
data is probably the way to go.
Back to your question -"how to efficiently search a large table?"
To borrow from others:
The fastest way to search a table is to not search it.
In other words you want to perform the search against a smallish index
to find the candidate rows.
Your problem definition states that the data is loaded with columns of
interest as null.
Oracle b*tree indexes do not index null values.
Oracle has support for function-based indexes that could be used to
provide an access path to rows of interest.
Rather than repeat content that is posted elsewhere, search for
articles on the use of function based indexes in Oracle 10g. you'll
need to use the same expression in the function-based index as in the
select statement.
Such indexes will add overhead to the loading operation.
Depending upon the load type, rate and frequency, you might exhibit
large amounts of undo being read in order to provide a consistent read
of the index(es).
Is the same session going to be performing both the inserts and the
selects/updates?
If so then concurrency won't be an issue.
If not, you might want to time things such that loads and processing
aren't running at the same time in order to minimize response time.
Follow-up questions:
is the Oracle Partitioning option available to you? (if so, use it)
how many rows per day are inserted?
what size batch are they inserted as?
how frequently are the batches inserted?
will Oracle sql loader be used to perform the inserts in bulk?
is bulk (array) processing being used if sql loader is not being used?
how many blocks can be read by the existing database server per
second?
This won't be just a matter of keeping a static index in the keep pool
for fast access, as the (function-based) index will be undergoing
maintenance with each row inserted, updated or deleted ... which will
generated undo and redo ... and changed blocks which will be written
out at checkpoints (and at other times, also).
The part that will be funny is that after you have all of this put
together and meeting your service level agreement for response
time ... reporting against the data will change and blow your resource
usage sky high.
good luck. have fun.
-bdbafh
.
- Follow-Ups:
- Re: Suggestions on how to efficiently search a large table
- From: chrism778
- Re: Suggestions on how to efficiently search a large table
- References:
- Suggestions on how to efficiently search a large table
- From: chrism778
- Re: Suggestions on how to efficiently search a large table
- From: bdbafh
- Re: Suggestions on how to efficiently search a large table
- From: chrism778
- Suggestions on how to efficiently search a large table
- Prev by Date: Re: Suggestions on how to efficiently search a large table
- Next by Date: Re: TimeZone in Oracle 8i...
- Previous by thread: Re: Suggestions on how to efficiently search a large table
- Next by thread: Re: Suggestions on how to efficiently search a large table
- Index(es):
Relevant Pages
|
Loading