Re: general question about how best to cache expensive query results
- From: Robert Klemme <shortcutter@xxxxxxxxxxxxxx>
- Date: Thu, 03 May 2007 10:27:17 +0200
On 03.05.2007 02:12, j.k. wrote:
I have a general question about how best to cache query results that
are too expensive to calculate each time they're needed.
The question is that when I have a query like:
SELECT
e.id, e.title
FROM
entries e
JOIN feeds f ON e.feed_id = f.id
WHERE
f.url = 'http://example.com'
AND e.id > (select id from entries where pub_date <
current_timestamp - interval '5 days' order by pub_date desc limit 1);
The last restriction in the query is the one that is too expensive to
calculate every time its needed. It's purpose is to restrict the
entries to those that are newer than a certain date, except that some
entries don't specify a date, so I use the fact that ids are in
ascending order in order of creation and find the highest id that's at
least 5 days old.
How do you do that if not all records have a pub_date? Consider
id pub_date
1 2007-01-01
2 2007-01-09
3 (NULL)
4 2007-02-01
Now, if your limit is 2007-01-17 you cannot know whether your max id is 3 or 2 - whatever you do, with this data you will always get 2 because the RDBMS cannot decide whether 3 was before or after 2007-01-17.
First of all, IMHO the sub query can be improved. Because what you really want is the MAX. So, I'd rather do
-- slightly different semantics, but should work according
-- to your description
select max(id)
from entries
where pub_date < current_timestamp - interval '5 days'
-- or, more like your original, in case ids might not be ordered
select max(id)
from entries, (
select max(pub_date) max_date
from entries
where pub_date < current_timestamp - interval '5 days'
) e
where entries.pub_date = e.max_date
Note, there might be other approaches - probably even more efficient ones.
Then, I'd do proper indexing. (Btw, you do not mention DDL - that would certainly help.)
Only if it is still slow then I'd consider "caching".
Regards
robert
.
- Follow-Ups:
- References:
- Prev by Date: Re: XML abuse?
- Next by Date: Re: escaping table/field names and values
- Previous by thread: general question about how best to cache expensive query results
- Next by thread: Re: general question about how best to cache expensive query results
- Index(es):
Relevant Pages
|
Loading