Re: Is it possible to rank articles for quality as Google claims?
- From: curt@xxxxxxxx (Curt Welch)
- Date: 20 Feb 2009 06:51:11 GMT
spamtrap.red@xxxxxxxxx wrote:
Curt Welch wrote:
spamtrap.red@xxxxxxxxx wrote:
Google claims that their "ranking" system is capable of increasing
the quality of search results and putting higher quality articles at the
top of a multi-million search result list.
What do you think about that?
How is it possible to analyze quality of code examples?
How is it possible to analyze the quality of any writing?
How is it possible to analyze quality of architecture, art,
music, literature or just about anything you can imagine?
What is the AI department's response to the Google Garbage Generator
phenomenon?
Google doesn't rank the quality. Humans do.
Actually, they do make such a claim, as described in a
Stanford paper by Google's founders.
http://www.stanford.edu/class/cs240/readings/google.pdf
Google has just found clever
ways to measure (and predict) the ratings assigned by humans. How well
their techniques work is always up for debate, but that's what they do.
Not true. The most critical aspect of their search engine is
"backlinks". That parameter alone was a central piece of their
"research" at Stanford.
Their success says a lot about how well it works. They showed up late
to the game of internet search engines (at a time when the game seemed
like it was over), and in a matter of only a few short years, totally
dominated the market. Believe it or not, that market domination
happened because their search engine was far better at extracting and
using those human rankings than any other at the time.
It has nothing to do with "human rankings".
Right. Keep telling yourself that. Who do you think created those "back
links"?
People.
The prime idea behind their technique is simple and ingenious. People
create links on their web site to other sites they think are valuable.
That's where the value measure comes from. By using those links as a voting
system, we can create a measure of quality of any web site or web page.
Statistically, it's more complex than counting links, because you get
better results by ranking the power of the "vote" by the "value" of the
voter. The more valuable the web site is, the more its "vote" should
count in ranking other sites.
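
To make the voting idea concrete, here is a minimal sketch of the simplest
version described above, where every link counts as one unweighted vote. The
toy link graph and the name count_votes are made up for illustration; nothing
here is taken from Google's actual system.

```python
from collections import Counter

# Hypothetical toy link graph: each page maps to the pages it links to.
links = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}


def count_votes(links):
    """Count in-links per page, treating every link as one equal vote."""
    votes = Counter({page: 0 for page in links})
    for source, targets in links.items():
        for target in set(targets):   # repeated links count only once
            if target != source:      # a page cannot vote for itself
                votes[target] += 1
    return votes


votes = count_votes(links)
for page, n in votes.most_common():
    print(page, n)
```

In this toy graph, page "c" collects the most votes and page "d" collects
none. A plain count like this treats a link from an obscure page the same as
a link from a highly valued one, which is the weakness that weighting each
vote by the value of the voter is meant to address.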
Calculating rank in this way becomes a massive recursive matrix math
problem, since the value of everything depends on the value of everything
else and we are dealing with billions of variables. I believe it's too
large to solve directly, and that an iterative algorithm is used instead,
one which quickly converges on a good approximation.
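
A rough sketch of what such an iterative calculation can look like, on a
made-up four-page graph. This is the textbook power-iteration form of the
idea, not Google's production code. The damping factor, the handling of pages
with no outgoing links, the stopping tolerance, and the name page_rank are
all assumptions added for the example and are not mentioned in this post.

```python
def page_rank(links, damping=0.85, tol=1e-9, max_iter=100):
    """Iteratively estimate a rank for every page in a small link graph.

    links maps each page to the list of pages it links to. The damping
    value and the dangling-page rule are illustrative assumptions.
    """
    pages = sorted(set(links) | {t for ts in links.values() for t in ts})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with every page equal

    for _ in range(max_iter):
        # Rank held by pages with no out-links is spread over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {
            p: (1.0 - damping) / n + damping * dangling / n for p in pages
        }
        # Each page splits its current rank evenly among the pages it
        # links to, so a vote from a high-ranked page is worth more.
        for source, targets in links.items():
            if targets:
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share
        change = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if change < tol:  # stop once the values have settled
            break
    return rank


# Hypothetical toy link graph: each page maps to the pages it links to.
links = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}

ranks = page_rank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

Each pass recomputes every page's rank from the ranks of the pages linking to
it, so the circular dependency is resolved by repetition rather than by
solving the whole system at once. On this toy graph the ordering comes out as
c, a, b, d, and the ranks sum to 1.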
Though the math and implementation details are tricky, the concept behind
it is simple and elegant. Everyone that creates web pages gets to vote on
which web pages on the internet they think are valuable by the act of
adding links to their own web pages. Collect all that data and apply a bit
of statistics and you get a page ranking for every page on the internet -
where all "value" came from the human voters - aka the people who created
all the links.
My position on it could be reduced to a couple of words:
Google Garbage Generator.
The system is highly discriminatory. One of the heaviest weights
is placed on the number of links pointing to a page, which, by
implication, means the bigger you are, the higher your rank.
The fatter your wallet, the higher on the Grand Google Garbage list
you will end up, and the higher your chance of being visited.
And the question is: how is it possible to assign some "quality"
rating to ANYTHING? According to what criteria, and from what
point of view?
From the fact that humans are reinforcement learning machines that share,
as a whole, very similar reward-generating hardware (due to the similarity
in our DNA across the species) and live (very roughly) in environments
similar enough that the optimal solutions to the value problem converge on
similar values. That makes the concept of "average value across the
population" itself valuable to humans.
In simpler terms, humans are value machines, and the odds of something being
valuable to me are highly correlated with that thing's average value to the
population.
Or, to get it back closer to your words, the bigger the web site is, and the
more "bucks" they have, the higher the odds it has value to me, or any
random member of society.
There are, however, some people who tend to be "odd". This is just
expected to happen statistically. For these people, their desires (values)
simply do not correlate well with the values of society. It happens for
one of two reasons. Either their value hardware (genetics) is different
enough from the average human's to make their needs (values) greatly
skewed from the population, or they were raised in an environment so
different from the norm that their learned value system became highly
skewed from the general population. In either case, they are screwed,
because they are forced to live in a society whose values are a very poor
match for their own. Not only does the algorithm that Google uses fail to
be of much use to them, everything about society sucks for them. They
might want to sexually abuse children, or rape and murder, because it's
what they value the most in life. But they are stuck in a society that
rejects such behaviors.
Their not-so-famous ranking system has about 500k+ rules and
variables, which, by itself, is an indication that it would not be
possible to predict the outcome of such a horrendously complicated
system.
Well, my understanding is that the "special" rules are not there because of
a fault in the logic of the basic ranking system. They're there because the
ranking system is easy to abuse once it becomes valuable to do so. That
is, sites that people don't want to see, but asshole spammers want you
to see anyway, are fairly easily bumped up by "cheating" the ranking
system. The special rules are there mostly to stop the abuse, not to fix a
ranking system that failed in the first place.
And some of those parameters go well into the territory of the obscene.
They even assign weights to FONT sizes, supposedly indicating
that this information is more significant than the article text
itself.
According to what logic?
Probably based on some measure of abuse prediction. The same people that
are likely to use techniques to abuse their site ranking are the ones most
likely to use extra large font sizes for the same reason - to force more
people to read stuff they didn't want to read in the first place.
What is this all about anyway? Are you pissed your page isn't showing up
near the top or something?
--
Curt Welch http://CurtWelch.Com/
curt@xxxxxxxx http://NewsReader.Com/