Re: Search efficiency - RDMS search versus free text search
- From: patrick61z@xxxxxxxxx
- Date: Fri, 23 Jan 2009 13:06:15 -0800 (PST)
On Jan 13, 6:10 am, Taras_96 <taras...@xxxxxxxxx> wrote:
Hi everyone,
I just had an interesting discussion with a colleague regarding how we
are implementing search. We allow users to browse resources under a
list of topics, so for example a list of all resources relating to
'cars'. One way of doing this is with a SQL query against the database
backend:
SELECT * FROM resources WHERE topic_area = 'cars'
The actual query is quite a bit more complicated than this (involves
table joins and the like).
However, he said that due to the table joins (coming from the fact
that the database is normalised), that this technique is quite
resource expensive. Instead, we are using the Google Search Appliance
to index our resources (delivered via the web), and creating the topic
pages by using Google to search on meta-data, which apparently results
in better performance.
This came to me somewhat as a suprise, as I was under the impression
that returning such result sets were one of the key uses of a RDMS -
the efficient storage and retrieval of records. I can understand the
performance issues if the search was a free text search, but the above
search is really searching against an indexed column value.
Is this often the case with enterprise level database searches - that
the database itself isn't used for searching because of performance
issues?
Cheers
Taras
Search like this is probably one of the areas that RDBMS's struggle
with, simply because the problem of keyword or text search is not what
relational database tech is all about.
Yes it can do it, but full text search is best designed outside of
relational algebra and like repeating groups in some of todays
database engines, the engine itself will do full text indexing for
you. And you can bet that underneath the hood, theres real programmers
implementing indexes that maximize performance.
An example I found is http://www.developer.com/db/article.php/3446891
that discusses how SQL Server 2008 bails on this task to an instance
of "Microsoft Search Service" and to me this makes perfect sense.
It wouldn't hurt to forget relational algebra long enough to read a
few articles about document indexing and the design decisions that
come into play. Along with that, be familiar with the types of indexes
in use on todays databases (that while are parts of relational
databases themselves, are definitely built using non relational
algorithms). I've found a basic understanding of indexing techniques
to be pretty helpful.
.
- References:
- Search efficiency - RDMS search versus free text search
- From: Taras_96
- Search efficiency - RDMS search versus free text search
- Prev by Date: Re: Best way to design table to store attributes?
- Next by Date: Re: Best way to design table to store attributes?
- Previous by thread: Re: Search efficiency - RDMS search versus free text search
- Next by thread: How would I do this sub-select?
- Index(es):
Relevant Pages
|
Loading