Re: blocking robots.txt from non-robots
- From: Don <lostinspace@xxxxxxxxxxxxxxxx>
- Date: Fri, 22 Feb 2008 01:52:59 GMT
Joe Fox <ny152@xxxxxxxxxxxx> wrote in
news:Xns9A4B60538A9A891563@xxxxxxxxx:
John Bokma <john@xxxxxxxxxxxxxxx> wrote in
news:Xns9A4A915154EB7castleamber@xxxxxxxxxxx:
Joe Fox <ny152@xxxxxxxxxxxx> wrote:
John Bokma <john@xxxxxxxxxxxxxxx> wrote in
news:Xns9A4A4758D9F83castleamber@xxxxxxxxxxx:
Joe Fox <ny152@xxxxxxxxxxxx> wrote:
I'm using a robots.txt file to control what is and is not crawled
by search engine bots, but I'd like to make sure that anything that
isn't a known search engine bot doesn't get the file I'm feeding to
Google, Yahoo and the others.
Why?
I can imagine that you want to block your entire site for any bot
that's known to be abusive, but those probably don't check
your robots.txt anyway.
Perhaps I didn't say it right. I'm wanting to block the robots.txt
that I'm feeding search engines from being given to anybody else.
Why? If the reason is that you want to "protect" some folders: it's
not secure and bound to fail sooner or later. Remember that not all
bots honor the robots.txt, especially not the ones that you don't
want on your site in the first place.
I want to keep certain humans from reading the robots.txt that I give
to search engines, because it's none of their bloody business what
pages I tell SEs not to index, and there are a few that might have
mind enough to look at robots.txt. They will not, however, expect to be
handed a tailored version of it.
I realize that they *could* spoof the SE's user agent or something,
but the people I'm concerned about are bright enough to look for
robots.txt but not bright enough to expect to be handed a phony one.
You want to hide the key under the doormat which has in 5 languages
"The key is hidden nearby" written on top...
Not really, or is it possible that they could also get my .htaccess?
I didn't think that was possible. If they ask for a robots.txt and
get one that's got nothing more than a pointer to a sitemap, that will
satisfy 'em.
The most effective way to do this is to deny access to robots.txt
for the unwanted IP ranges in .htaccess.
As for denying robots.txt to the entire general public: it's bad
practice, since the majority of the general public has never even
heard of robots.txt.
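
For what it's worth, the two ideas in this thread (refuse robots.txt to denied IP ranges, and hand everybody who isn't a known crawler a version with nothing but a sitemap pointer) can be sketched as a small decision function. This is only an illustration, not anybody's actual setup: the crawler tokens, the IP range, the file contents and the name choose_robots are all made up for the example, and, as already noted above, a user agent can be spoofed, so this hides nothing from anyone determined.

```python
import ipaddress

# Everything below is a hypothetical placeholder, not taken from the thread.
KNOWN_CRAWLER_TOKENS = ("googlebot", "slurp", "msnbot")    # assumed UA substrings
DENIED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]  # assumed denied range

FULL_ROBOTS = "User-agent: *\nDisallow: /private/\n"
DECOY_ROBOTS = "Sitemap: http://www.example.com/sitemap.xml\n"


def choose_robots(remote_ip, user_agent):
    """Return (status, body) for a request for /robots.txt."""
    addr = ipaddress.ip_address(remote_ip)

    # Denied IP ranges get no robots.txt at all.
    if any(addr in net for net in DENIED_NETWORKS):
        return 403, ""

    # Clients that identify as a known crawler get the real file.
    ua = user_agent.lower()
    if any(token in ua for token in KNOWN_CRAWLER_TOKENS):
        return 200, FULL_ROBOTS

    # Everybody else gets the version that only points at a sitemap.
    return 200, DECOY_ROBOTS
```

The IP check comes first on purpose: an address can't be faked as easily as a User-Agent header, so a denied range stays denied even if the client claims to be a search engine.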