Re: Keywords in bibliography database
- From: "Allen Browne" <AllenBrowne@xxxxxxxxxxxxxx>
- Date: Mon, 20 Mar 2006 08:29:57 +0800
The concept of the UNION query is to build a SELECT for each keyword
being used, and combine them in one monster query in which an article
appears once for each keyword it matches, e.g.:
SELECT ArticleID, Keyword FROM tblArticleKeyword WHERE Keyword = 'cat'
UNION ALL
SELECT ArticleID, Keyword FROM tblArticleKeyword WHERE Keyword = 'fish'
UNION ALL
SELECT ArticleID, Keyword FROM tblArticleKeyword WHERE Keyword = 'mule';
You can then create another query that counts how often an article occurs in
the results:
SELECT ArticleID, Count(Keyword) AS Ranking
FROM Query1
GROUP BY ArticleID
ORDER BY Count(Keyword) DESC;
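If you want to try the two queries above outside Access, here is a minimal sketch in Python using SQLite as a stand-in. The sample rows and the rank_articles helper are made up for illustration, the saved query Query1 is inlined as a subquery, and the ArticleID tie-breaker in the ORDER BY is my addition so the result order is predictable.

```python
import sqlite3

# SQLite stands in for Access here; table and column names follow the post.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblArticleKeyword (ArticleID INTEGER, Keyword TEXT)")
conn.executemany(
    "INSERT INTO tblArticleKeyword VALUES (?, ?)",
    [  # made-up sample data
        (1, "cat"), (1, "fish"), (1, "mule"),
        (2, "cat"), (2, "dog"),
        (3, "fish"), (3, "mule"),
        (4, "dog"),
    ],
)


def rank_articles(conn, keywords):
    """Hypothetical helper: return (ArticleID, Ranking) rows, best match first."""
    if not keywords:
        return []
    # One SELECT per keyword, glued together with UNION ALL.
    one_select = (
        "SELECT ArticleID, Keyword FROM tblArticleKeyword WHERE Keyword = ?"
    )
    union_sql = " UNION ALL ".join([one_select] * len(keywords))
    # The Totals query counts how often each article occurs in the union.
    sql = (
        "SELECT ArticleID, COUNT(Keyword) AS Ranking "
        f"FROM ({union_sql}) AS Query1 "
        "GROUP BY ArticleID "
        "ORDER BY COUNT(Keyword) DESC, ArticleID"
    )
    return conn.execute(sql, keywords).fetchall()


ranking = rank_articles(conn, ["cat", "fish", "mule"])
print(ranking)
```

Articles that match none of the selected keywords never appear in the union, so they drop out of the ranking altogether.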
For a more elaborate ranking, you could assign a weighting to each keyword.
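One way the weighting might look, again as a Python/SQLite sketch rather than Access. The tblKeywordWeight table, its Weight column, and the weight values are assumptions of mine, not something from the thread; the ranking becomes the sum of the weights of the matched keywords instead of a plain count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblArticleKeyword (ArticleID INTEGER, Keyword TEXT)")
# Hypothetical lookup table holding one weight per keyword.
conn.execute("CREATE TABLE tblKeywordWeight (Keyword TEXT PRIMARY KEY, Weight INTEGER)")
conn.executemany(
    "INSERT INTO tblArticleKeyword VALUES (?, ?)",
    [  # made-up sample data
        (1, "cat"), (1, "fish"), (1, "mule"),
        (2, "cat"), (2, "dog"),
        (3, "fish"), (3, "mule"),
    ],
)
conn.executemany(
    "INSERT INTO tblKeywordWeight VALUES (?, ?)",
    [("cat", 5), ("fish", 2), ("mule", 1)],  # made-up weights
)

selected = ["cat", "fish", "mule"]
placeholders = ", ".join("?" for _ in selected)

# Sum the weights of the matched keywords for each article.
weighted = conn.execute(
    "SELECT ak.ArticleID, SUM(w.Weight) AS Ranking "
    "FROM tblArticleKeyword AS ak "
    "INNER JOIN tblKeywordWeight AS w ON ak.Keyword = w.Keyword "
    f"WHERE ak.Keyword IN ({placeholders}) "
    "GROUP BY ak.ArticleID "
    "ORDER BY SUM(w.Weight) DESC, ak.ArticleID",
    selected,
).fetchall()
print(weighted)
```

With these weights, article 2 (one heavily weighted match) ranks above article 3 (two lightly weighted matches), which a plain count would order the other way round.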
While I have not used Lyle's suggestion, I quite like the concept of putting
to work the algorithm built into the operating system instead of having to
code your own. It might not be possible for you if you cannot get the
documents into the computer, or there might be other problems with the
approach (such as the way the operating system approaches known file types
and fails to handle zip files as expected), but it could be worth
investigating.
--
Allen Browne - Microsoft MVP. Perth, Western Australia.
Tips for Access users - http://allenbrowne.com/tips.html
Reply to group, rather than allenbrowne at mvps dot org.
"Nenad Loncarevic" <nash@xxxxxxxxxxxxxxxxxxxxxx> wrote in message
news:v2jr12dmu4ljhpn7a1i79gbsscojninj77@xxxxxxxxxx
On Sat, 18 Mar 2006 21:05:50 +0800, "Allen Browne"
<AllenBrowne@xxxxxxxxxxxxxx> wrote:
10 different combos where the user can select a keyword is workable if you
want to ensure the user is limited to only selecting known keywords.
That is exactly what I intend.
The IN operator is also useful to avoid heaps of OR operators, e.g.:
Category IN ('cat', 'fish', 'mule')
rather than:
Category = 'cat' OR Category = 'fish' OR Category = 'mule'
Thanks, I'll surely use it.
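A small sketch of building the IN list from the user's selections, in Python with SQLite standing in for Access. The tblArticle table and its sample rows are hypothetical; only the Category field name comes from the example above. It also checks that the IN form and the OR form return the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; only the Category field name comes from the example.
conn.execute("CREATE TABLE tblArticle (ArticleID INTEGER PRIMARY KEY, Category TEXT)")
conn.executemany(
    "INSERT INTO tblArticle VALUES (?, ?)",
    [(1, "cat"), (2, "dog"), (3, "fish"), (4, "mule"), (5, "bird")],
)

# Values the user picked in the combos (made up here).
selected = ["cat", "fish", "mule"]

# IN form: one placeholder per selected value.
placeholders = ", ".join("?" for _ in selected)
in_sql = (
    f"SELECT ArticleID FROM tblArticle WHERE Category IN ({placeholders}) "
    "ORDER BY ArticleID"
)

# OR form: one comparison per selected value.
or_clause = " OR ".join("Category = ?" for _ in selected)
or_sql = f"SELECT ArticleID FROM tblArticle WHERE {or_clause} ORDER BY ArticleID"

in_rows = conn.execute(in_sql, selected).fetchall()
or_rows = conn.execute(or_sql, selected).fetchall()
print(in_rows, in_rows == or_rows)
```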
Ranking is a big area on its own. If you want to know which keywords
match and therefore how many matches there are, you might need to return
matches for each keyword, combine them, and then deduplicate with a count.
That could be a monster UNION ALL query that becomes the source for a
Totals query to give the count. If you actually want more complex ranking,
you might end up writing each keyword's result to a temp table, and running
the Totals query to count and sum the ranking values.
This is where you lost me a bit. Could you elaborate on this if it's
not too much trouble? Or (just a suggestion) could you maybe write an
article on this (when you find the time) and put it on your Web site?
I'm sure I'm one of many people who would find it extremely useful, along
with your posts in this newsgroup. Your explanations are always
comprehensive, thorough and to the point (this is not sucking up).
Your structure for author is correct. Unfortunately the SQL language is
not good at combining, so I suspect most of us use the approach suggested
in this article:
http://www.mvps.org/access/modules/mdl0004.htm
You can call the function in the query (though for some reason the last
couple of versions of Access run user-defined functions more slowly in the
context of queries.)
I'll read it for sure. If I have any questions about it I'll ask.
With authors, there can be a priority of ordering them, i.e. the lead
author is listed first. Your junction table will need to specify the order
of the authors for the publication, and your ranking might want to take
that into account also.
The authors are not going to be part of criteria, so I don't expect
that to affect ranking.
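For the author list itself, here is a rough sketch of a junction table that stores the author order, plus one way to combine the names into a single string per article. This is not the VBA function from the linked article, just the same idea in Python with SQLite; the table names, the AuthorOrder field, and the sample data are all assumptions for illustration.

```python
import sqlite3
from itertools import groupby

conn = sqlite3.connect(":memory:")
# Hypothetical tables: AuthorOrder records each author's position (1 = lead).
conn.executescript("""
    CREATE TABLE tblAuthor (AuthorID INTEGER PRIMARY KEY, AuthorName TEXT);
    CREATE TABLE tblArticleAuthor (
        ArticleID INTEGER, AuthorID INTEGER, AuthorOrder INTEGER
    );
    INSERT INTO tblAuthor VALUES (1, 'Smith'), (2, 'Jones'), (3, 'Lee');
    INSERT INTO tblArticleAuthor VALUES
        (10, 2, 1), (10, 1, 2),
        (11, 3, 1), (11, 2, 2), (11, 1, 3);
""")

# Fetch the names already sorted by article and by position within it.
rows = conn.execute("""
    SELECT aa.ArticleID, a.AuthorName
    FROM tblArticleAuthor AS aa
    INNER JOIN tblAuthor AS a ON aa.AuthorID = a.AuthorID
    ORDER BY aa.ArticleID, aa.AuthorOrder
""").fetchall()

# Walk the sorted rows and join each article's names, lead author first.
authors_by_article = {
    article_id: ", ".join(name for _, name in group)
    for article_id, group in groupby(rows, key=lambda row: row[0])
}
print(authors_by_article)
```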
Hope that is of some use to you.
It most certainly is, thanks very much.
Nenad