Re: blocking robots.txt from non-robots



Don <lostinspace@xxxxxxxxxxxxxxxx> wrote in
news:Xns9A4BE192F75F7lostinspace123univer@xxxxxxxxxxxxxx:

Joe Fox <ny152@xxxxxxxxxxxx> wrote in
news:Xns9A4BD1118343D891563@127.0.0.1:

Don <lostinspace@xxxxxxxxxxxxxxxx> wrote in
news:Xns9A4BD360F498Clostinspace123univer@xxxxxxxxxxxxxx:

Joe Fox <ny152@xxxxxxxxxxxx> wrote in
news:Xns9A49EC6462EC5891563@xxxxxxxxx:


I'm using a robots.txt file to control what is and is not crawled
by search engine bots, but I'd like to make sure that anything that
isn't a known search engine bot doesn't get the file I'm feeding to
Google, Yahoo and the others.

From what I've read this could be done with .htaccess, but I've not
been able to make heads or tails of that.

I'd really be grateful for some help here.

Thanks
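
For readers following along, here is a minimal sketch of the kind of
.htaccess rule the question is asking about. It assumes an Apache server
with mod_rewrite enabled and .htaccess overrides allowed, and the crawler
names in the pattern (Googlebot, Slurp) are only an illustrative
allow-list, not a complete or authoritative one. The User-Agent header is
supplied by the client and is easily faked, so this only stops casual
requests.

```apache
# Place in the .htaccess file at the site's document root.
# Assumption: mod_rewrite is available on this server.
RewriteEngine On

# Illustrative allow-list: the condition is true when the User-Agent
# does NOT contain any of these names (case-insensitive).
RewriteCond %{HTTP_USER_AGENT} !(Googlebot|Slurp) [NC]

# For those requests, answer robots.txt with 403 Forbidden.
RewriteRule ^robots\.txt$ - [F,L]
```

Any crawler you want to keep serving has to be added to the pattern, or it
will receive the 403 response instead of the file.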

Some tutorials:
http://baremetal.com/gadgets/htaccess/
http://evolt.org/node/226
http://www.edginet.org/techie/website/htaccess.html
http://www.dimi.uniud.it/labs/documentazione/roxen/parsed/Challenger1.2/User/htaccess/htaccess.html
http://www.webhelpinghand.com/htaccess_deny.htm
http://www.javascriptkit.com/howto/htaccess.shtml
http://www.serverwatch.com/tutorials/article.php/10825_1127711_1
http://www.verio.com/support/documents/view_article.cfm?doc_id=3624


Some of those are familiar but I'll take a look at 'em anyway.

My big problem is I'm not a coder. Simple stuff I can handle, but
figuring out docs and help files takes forever.


Joe,
There are more beneficial forums for htaccess and Apache.
The Apache Server forum at Webmaster World is excellent and the
moderator makes a superb effort to assist far too many people.

The Search Engine Spider ID forum was the predecessor to the Apache
forum as far as htaccess coding goes.

Registration is free at most forums.

I may be able to assist you, however my extensive use of htaccess has
been limited to the "KISS" thought.
When it comes to simulated-wildcards and complicated expressions, I'm
daft!


Thanks. I'll check out the forums and reread the links you provided,
along with John's suggested code.



