Re: D3 Indices
- From: "Mark Brown" <mbrown@xxxxxxxxxxxxx>
- Date: Wed, 02 Aug 2006 07:09:18 GMT
"Peter McMurray" <excalibur21@xxxxxxxxxxx> wrote in message
news:bMWzg.5086$rP1.2589@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Hi
Recent discussions about indexing set me to thinking that I should once
again look at D3 Indices. So I did a simple test. The Invoices file
contains multi-valued attributes for the invoice lines. One line is the
key to the product file. We frequently wish to sort invoice sales by
Product Group and Product name, both of which are in the Product file. I
prefer F correlatives but popped in an "A" type for the exercise. The
result is fine for LIST and SORT BY-EXP. However any attempt to create an
index results in a crash. I originally posed this question when Pick
first did b-trees and got a silly answer that did not address the
question.
I don't know when that was, but it wasn't when I was running that part of
the engineering department.
Now that people have had years of practise can someone explain exactly
what the problem is here ( no theories please just an answer to the exact
question) In my opinion any b-tree that cannot deal with builds like this
is simply a waste of space.
You're a smart guy. You could have figured this one out:
<snip>
:CT DICT INVOICES GRPDESC
GRPDESC
001 S
002 0
003 Product Description
004
005
006
007
008 A13(TPRODUCTS;X;3;3]MR%3):13(TPRODUCTS;X;2;2)
009 L
010 32
LIST INVOICES GRPDESC
INVOICES Product Description.............
1*514336 050Longhaul Premium
052Brake & Clutch Fluid
1*514337 052Revtex Super 2T
SORT INVOICES BY-EXP GRPDESC GRPDESC<snip>
INVOICES Product Description.............
1*516787 002Unleaded Unmarked
1*516788 009Gas
1*514336 050Longhaul Premium
1*514338 050Longhaul Premium
:CREATE-INDEX INVOICES GRPDESC
Creating index on A13(TPRODUCTS;X;3;3]MR%3):13(TPRODUCTS;X;2;2)
0002457
[3] The verb 'MR%3):13(TPRODUCTS;X;2;2)' is not defined.
:
There are 2457 invoices in the test file with in excess of 4,500 lines so
it appears to have indexed on attribute 13 in full which would be pretty
silly as it is multi-valued.
Peter McMurray
1) It's not the indexing that is failing, it's the a-correlative in the
index. Try this:
A13(TPRODUCTS;X;3;2)(MR%3):13(TPRODUCTS;X;2;2)
2) Regardless how many indices there are, there are still only 2457
records.
Most problems with indexing in D3 comes from not understanding what indexes
are and what they are inteded to do. First and foremost, they are a list of
the item ids of the file in some particular sequence. They look a little
like this:
indexKey <am> id <vm> id2 <vm>...<idN> <sm> indexKey2 <etc>
If the index key is really long, or if there are thousands of Item IDs for a
single key (as with null keys) there can be problems when the leaves in the
b-tree split into branches. Index correlatives that call subroutines or do
multiple translates can cause problems if any of the pieces aren't there
when you need them. Regardless of how many branches and leaves there are in
the index, it is in fact one gigantic linked list.
That being said, the English pre-compiler looks at the WITH selection list
and if the first ADI has an index and the selection mode is EQUALity, AQL is
smart enough to use the index to speed up selection, particularly in the
sort pass, as the index is already in sequence and doesn't need to be
sorted.
A typical file select process starts with a frame of overflow, marches thru
the file sequentially group by group, selecting items that match and placing
ids into a dynamic array. Once the selection has been done, if required,
the sort pass takes place by using that list to re-read the items, gather
the sort criteria and then do a merging multi-list sort to arrive at a
sorted id list.
Using indexes, AQL uses a system level function to find the first index that
matches the criteria (eg. SELECT file WITH A1 = "CA") and then cruises the
links until the condition is no longer true, placing the item id's into the
list and using high-speed search technique simalar to how items are read and
GFE's detected. When the list is built, the sort phase is skipped and we're
ready to go. Typically the number of frames in the index is significantly
less than actual file size, so even selecting the entire file would be
faster just from simply working less.
Since the index is maintained automatically at the system level, for large
files, there's nothing better. Sometimes you can create "pseudo-"indexes to
facilitate reporting. For example and index like
An(state)(ml#3):n(district) would pre-sort the sales file by state by
district even before reporting is required.
Mark Brown
.
- Follow-Ups:
- Re: D3 Indices
- From: Peter McMurray
- Re: D3 Indices
- References:
- D3 Indices
- From: Peter McMurray
- D3 Indices
- Prev by Date: D3 Indices
- Next by Date: Re: Satan and the evil
- Previous by thread: D3 Indices
- Next by thread: Re: D3 Indices
- Index(es):
Relevant Pages
|