Re: Dutchland



[dk.kultur.sprog dropped from crosspost because my server doesn't carry
it]

Donna Richoux wrote:

...In fact the explanation is even simpler than that. I skipped to the
end of the Google listings to see what was showing up, and there were
tons of that "word salad" sort of post. Somehow, "Dutchland of
quaffing compare" is repeated on hundreds of thousands of those pages:

[... which quote, it turns out, comes from a 1579 article by Stephen
Gosson, according to Father Ignatius elsewhere in this thread]

I've seen quite a few cases like that now. Do you recall the one about
"clothes make(th) the man", where about 30,000 out of 110,000 hits were
from a full passage of Mark Twain? And where the .uk search gave results
that conflicted with the unrestricted search, as noted by:

[quote from older thread, Sept 2005]
Mark Brader:
"clothes make the man" 111,000
"clothes maketh the man" 834

"Ray":
Site: co.uk
"clothes maketh the man" 1620
"clothes make the man" 840
[end quote]

Repeating now the search on "clothes make the man naked people" gives
about 55,500.

It is almost impossible to compensate for this sort of thing because
Google counts are not additive. If you search on "dutchland
+quaffing" and "dutchland -quaffing" and add them together, you usually
won't get the same as searching on "dutchland" alone. And on top of
that, the country domain searches can be wildly different again, as
above.
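
To make the inconsistency concrete, here is a rough sketch using only the
figures quoted above from the Sept 2005 thread. The function and variable
names are made up for illustration. If the counts were real page counts,
a search restricted to co.uk could never return more hits than the
unrestricted search for the same phrase:

```python
# Hit counts as quoted above (Sept 2005 thread); nothing here queries Google.
counts_all = {
    "clothes make the man": 111_000,
    "clothes maketh the man": 834,
}
counts_co_uk = {
    "clothes make the man": 840,
    "clothes maketh the man": 1620,
}


def subset_exceeds_whole(whole, subset):
    """Return the phrases whose restricted count is larger than the
    unrestricted count, which a true page count would never allow."""
    return [phrase for phrase in whole if subset[phrase] > whole[phrase]]


inconsistent = subset_exceeds_whole(counts_all, counts_co_uk)
print(inconsistent)
```

With those figures the "maketh" phrase is flagged: 1620 reported for
co.uk alone against 834 for the whole web.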

Why do 30,000 people put up web sites with the full text of a Twain
piece? Or hundreds of thousands an article by Gosson? I guess it's part
of a strategy to get the sites multiply-indexed by the search engines.
Not many people understand all the ins and outs of how that all works. I
certainly don't, but I do know that what Google does is far from a
simple count of all the content on web pages. The counts are weighted by
all sorts of factors, many of which are probably proprietary secrets.

So, in short, I don't see how anyone can put much trust in Google
numbers.

--
Regards
John
for mail: my initials plus a u e
at tpg dot com dot au
