Re: Oracle Text and searches across columns
- From: "Vladimir M. Zakharychev" <vladimir.zakharychev@xxxxxxxxx>
- Date: Mon, 4 Feb 2008 02:10:56 -0800 (PST)
On Feb 3, 11:34 pm, Jake <jgarfi...@xxxxxxxxxxxxx> wrote:
Say if I am making a database of college classes across many colleges,
where people can search and find classes.
So, for example, each class has a:
level - bachelors, graduate (for example)
college name- State U, or SU as short name, along with other colleges
semester - Fall 2008...
Category - Biology, Math, Music etc.
course number - BIO 101
description - a full text description of the course.
etc.
as you can imagine, all of these things are in many tables in many
columns. The "full description" that the user sees actually contains
information from a join of many columns.
What I would like is for the user to be able to search given a text
phrase, and given a list of hits from all classes where the "full
description" contains every one of the words they search for. The
words do not have to be in any particular order. Example searches:
"Biology 101" "BIO 101" "SU Math Fall 2008"
I haven't seen an Oracle Text example that handles this type of case.
What is the best way to do this? Can any one point me in the right
direction?
If "full description" is a "computed" column (that is, you dynamically
join relevant columns from different sources and present that to the
user,) the most straightforward solution is to create a USER_DATASTORE
with a PL/SQL procedure that will aggregate all this data into a CLOB
and present that to Oracle Text for indexing. For example:
CREATE TABLE CLASSMATES
( ID NUMBER(10)
,FIRST_NAME VARCHAR2(100)
,LAST_NAME VARCHAR2(100)
,MIDDLE_NAME VARCHAR2(100)
,....
,FULL_DESC CHAR(1) -- this is the placeholder column for
indexing
,CONSTRAINT PK$CLASSMATES#ID PRIMARY KEY (ID)
)
/
-- some related tables are built here...
....
-- create the USER_DATASTORE preference.
BEGIN
CTX_DDL.CREATE_PREFERENCE('MY_FULL_DESC_DATASTORE','USER_DATASTORE');
-- this is the procedure that will synthesize the document for
indexing
CTX_DDL.SET_ATTRIBUTE('MY_FULL_DESC_DATASTORE','PROCEDURE','SYNTHESIZE_FULL_DESC');
-- 'CLOB' is default if not specified, included here for
completeness
CTX_DDL.SET_ATTRIBUTE('MY_FULL_DESC_DATASTORE','OUTPUT_TYPE','CLOB');
END;
/
-- this is our document synthesizer
-- it MUST have two arguments: first is IN ROWID and
-- second is IN OUT NOCOPY <output_type>
-- names of the arguments are not important, only their types and
order
CREATE OR REPLACE PROCEDURE SYNTHESIZE_FULL_DESC(RID IN ROWID, C IN
OUT NOCOPY CLOB)
IS
-- DBMS_LOB.WRITEAPPEND is pretty expensive, so we will buffer
-- our output in this local variable.
sBuf VARCHAR2(32767 BYTE) := NULL;
-- a buffering string appender procedure, also automatically
-- adds space character to every appended token to delimit them.
PROCEDURE APPND( S IN VARCHAR2, bFlush IN BOOLEAN := FALSE )
IS
BEGIN
IF S IS NOT NULL AND LENGTHB(sBuf) + LENGTHB(S || ' ') > 32000
THEN
DBMS_LOB.WRITEAPPEND(C, LENGTH(sBuf), sBuf);
sBuf := S || ' ';
ELSE
sBuf := sBuf || S || ' ';
END IF;
IF bFlush THEN
DBMS_LOB.WRITEAPPEND(C, LENGTH(sBuf), sBuf);
sBuf := NULL;
END IF;
END APPND;
BEGIN
FOR X IN (SELECT * FROM CLASSMATES WHERE ROWID=RID) LOOP
APPND(X.FIRST_NAME);
APPND(X.LAST_NAME);
APPND(X.MIDDLE_NAME);
-- can be repeated several times for different detail tables
-- or you can join all details here, or you can join them in the
-- outer cursor.
FOR Y IN (SELECT * FROM SOME_DETAIL_TABLE WHERE CLASSMATE_ID=X.ID)
LOOP
APPND(Y.WHATEVER);
....
END LOOP;
END LOOP;
-- flush whatever is in the buffer to the CLOB
APPND(NULL, TRUE);
END SYNTHESIZE_FULL_DESC;
/
-- now we create the index on our "full description" placeholder
-- and specify our custom data store the data will come from.
CREATE INDEX CTX$CLASSMATES#FULL_DESC ON CLASSMATES.FULL_DESC
INDEXTYPE IS CTXSYS.CONTEXT
PARAMETERS ('DATASTORE MY_FULL_DESC_DATASTORE SYNC(ON COMMIT)');
-- now we can query the base table and all details will be searched
-- as well thanks to our document synthesizer.
SELECT ID, SCORE(0) AS SCORE
FROM CLASSMATES
WHERE CONTAINS(FULL_DESC, 'my query terms', 0) > 0;
One important note is due here: if any data in any detail table is
changed, the base table should be updated as well, otherwise that
changed detail data will not be re-indexed because the index only
depends on the base table and will not "see" any changes of data it
actually indexes. A simple no-work update is sufficient:
UPDATE CLASSMATES SET FULL_DESC=FULL_DESC WHERE ID=:CHANGED_DETAIL_ID;
Despite the fact that this update doesn't change anything, Oracle Text
will still add the ROWID of "updated" column to the list of rows
pending indexing.
For more information: Oracle Text Reference for your Oracle release,
Google (I believe I posted similar solution to a similar problem a few
years back in this very group.)
Hth,
Vladimir M. Zakharychev
N-Networks, makers of Dynamic PSP(tm)
http://www.dynamicpsp.com
.
- References:
- Oracle Text and searches across columns
- From: Jake
- Oracle Text and searches across columns
- Prev by Date: Re: ora-600 with table...cast... on 10.2.0.3 / al32utf8 / linux
- Next by Date: Re: Oracle characterset confusion when storing encoded passwords
- Previous by thread: Oracle Text and searches across columns
- Next by thread: SQL query for this purpose
- Index(es):
Relevant Pages
|