Re: count occurance of a word/string in the body of an HTML page
- From: Dr J R Stockton <reply0935@xxxxxxxxxxxxxxxxxx>
- Date: Fri, 28 Aug 2009 20:09:21 +0100
In comp.lang.javascript message <aec1b339-3206-4aa8-b374-7943f02aee3f@c2
9g2000yqd.googlegroups.com>, Thu, 27 Aug 2009 11:16:27, Question Boy
<question.boy@xxxxxxxxxxx> posted:
> I'm trying to find an easy way to count how many time a given word
> appear on a webpage. For instance, I would like to be able to count
> the number of occurance of the word 'Accepted', how would I go about
> this?
No, occurrences.
If the Web page is not yours, you can take a copy of the source and work
on that, so one can assume source to be available. However,
straightforwardly counting words in the source is not going to give,
reliably, the right answer. The word may appear in a comment, or within
HTML tags, or in JavaScript or VBScript; and code may write it
conditionally or repeatedly. The word may be in an undisplayed or
hidden part of the page. The word may be generated by included script,
and not be in the source at all. The word may be computed - consider
what document.write( ['mk'+'op', '\x44um'].reverse().join("")+"f" )
might give.
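
To make the point concrete, here is a minimal sketch. The helper name countLiteral and the sample markup are invented for illustration; they are not from any real page. It counts a literal string in raw source and shows how that can disagree with what a reader would see.

```javascript
// Hypothetical helper: count non-overlapping literal matches in raw text.
function countLiteral(text, needle) {
  if (needle === "") return 0;
  let count = 0;
  let pos = text.indexOf(needle);
  while (pos !== -1) {
    count++;
    pos = text.indexOf(needle, pos + needle.length);
  }
  return count;
}

// Made-up source: the word sits in a comment, in an attribute, in visible
// text, and is also assembled by script without appearing literally.
const source = [
  "<!-- Accepted papers go here -->",
  "<p class=\"Accepted\">Accepted</p>",
  "<script>document.write('Acc' + 'epted');</script>"
].join("\n");

// 3 literal hits: comment, attribute value, paragraph text.
// A displayed page would show the word twice: the paragraph and the
// script output. The source count is wrong in both directions at once.
const sourceCount = countLiteral(source, "Accepted");
```

Here the comment and the attribute inflate the count, while the script-written copy is missed entirely.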
You wrote "appear on a webpage". Display the web page, use Select All
and Copy; then paste it into something which can count words. I think
MS Word can do it; alternatively, you can paste it into a textarea and
match its value property against a well-chosen RegExp. See my page
<URL:http://www.merlyn.demon.co.uk/js-valid.htm>.
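
As a rough sketch of the textarea approach, something like the following might do. The function names and the element id "pasted" are assumptions of mine, not anything from the page above; and \b uses the ASCII notion of a word character, which may not suit every text.

```javascript
// Escape characters that are special in a RegExp, so the word is matched
// literally (hypothetical helper name).
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Count whole-word matches of `word` in `text`; \b marks a boundary
// between a word character [A-Za-z0-9_] and anything else.
function countWord(text, word, ignoreCase) {
  const flags = ignoreCase ? "gi" : "g";
  const re = new RegExp("\\b" + escapeRegExp(word) + "\\b", flags);
  const matches = text.match(re);
  return matches ? matches.length : 0;
}

// In a browser, the text would come from the textarea, for example
// (assuming a textarea with the made-up id "pasted"):
//   const text = document.getElementById("pasted").value;
const text = "Accepted. Not accepted; Accepted, Unaccepted, AcceptedX";
const exact = countWord(text, "Accepted", false);
const anyCase = countWord(text, "Accepted", true);
```

With the sample string, the case-sensitive count ignores "accepted", and neither count includes "Unaccepted" or "AcceptedX", since the word there is not bounded on both sides.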
You will need to be very careful to see that you implement an
appropriate definition of a word. Will, for example, the word "Accep-
ted" be found? If looking for "paw", should it be found in "cat's-paw"?
Given what you wrote above, should you also be looking for alternative
spellings?
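
Those choices can be made explicit in code. The following is only a sketch under my own assumptions: the option names joinHyphens and compoundIsOneWord are invented, and rejoining at line ends will also fuse genuinely hyphenated words that happen to wrap at the hyphen. It needs a JavaScript engine with RegExp lookbehind.

```javascript
// Escape RegExp-special characters so the word is matched literally.
function escapeForRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Rejoin a word split by a hyphen at a line end: "Accep-\nted" -> "Accepted".
function joinLineBreakHyphens(text) {
  return text.replace(/(\w)-[ \t]*\r?\n[ \t]*(\w)/g, "$1$2");
}

// Count `word` under a chosen definition of a word (hypothetical options):
//   joinHyphens       - undo end-of-line hyphenation before counting
//   compoundIsOneWord - treat "cat's-paw" as one word, so "paw" is not
//                       found inside it
function countStrictWord(text, word, opts) {
  const options = opts || {};
  const src = options.joinHyphens ? joinLineBreakHyphens(text) : text;
  // Characters that may not sit directly beside a match.
  const edge = options.compoundIsOneWord ? "[\\w'-]" : "\\w";
  const re = new RegExp(
    "(?<!" + edge + ")" + escapeForRegExp(word) + "(?!" + edge + ")", "g");
  const matches = src.match(re);
  return matches ? matches.length : 0;
}

const wrapped = "It was Accep-\nted today";
const compound = "The cat's-paw and the paw";
```

For instance, countStrictWord(wrapped, "Accepted", {joinHyphens: true}) finds the split word, whereas without the option it does not; and countStrictWord(compound, "paw", {compoundIsOneWord: true}) skips the "paw" in "cat's-paw". Neither setting is right in general; it depends on what is meant by a word.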
--
(c) John Stockton, Surrey, UK. ?@merlyn.demon.co.uk Turnpike v6.05 MIME.
Web <URL:http://www.merlyn.demon.co.uk/> - FAQish topics, acronyms, & links.
Proper <= 4-line sig. separator as above, a line exactly "-- " (SonOfRFC1036)
Do not Mail News to me. Before a reply, quote with ">" or "> " (SonOfRFC1036)