Robots Exclusion standard, a forum discussion on RagePank SEO.
admin
15 Feb 2006
Posts: 10
The robots.txt file is a simple text file placed in the root of your website, which search engine bots use to determine which parts of your website to visit, and which parts to ignore.
It can also exclude specific search engine bots from the site, although this only works if the bot chooses to respect the standard. Compliance is voluntary.
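As a rough illustration of per-bot rules, here is a small sketch using Python's standard urllib.robotparser module. The bot name BadBot and the example.com URLs are made up for the example, and the result only tells you what a well-behaved bot would do; a bot that ignores the standard is not stopped by any of this.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out one named bot, allow everyone else.
ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow:
"""

def is_allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if a compliant bot with this name may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

print(is_allowed("BadBot", "http://www.example.com/page.html"))     # False
print(is_allowed("Googlebot", "http://www.example.com/page.html"))  # True
```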
Most webmasters use robots.txt to keep bots from crawling the admin or private sections of their website. Care must be taken to get the format correct, or you can end up telling the bots to stay away from your entire website. This has surely been the cause of some poorly ranking websites in the past.
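To show how small that mistake can be, here is a sketch (again with Python's urllib.robotparser) that compares three near-identical files. The helper name blocks_whole_site and the three sample files are my own, not part of any standard tool.

```python
from urllib.robotparser import RobotFileParser

def blocks_whole_site(robots_txt):
    """Return True if the rules for all bots ("*") forbid the site root."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch("*", "/")

intended = "User-agent: *\nDisallow: /admin/\n"  # only the admin area
slip = "User-agent: *\nDisallow: /\n"            # path cut short: everything
empty = "User-agent: *\nDisallow:\n"             # empty value: nothing blocked

print(blocks_whole_site(intended))  # False
print(blocks_whole_site(slip))      # True
print(blocks_whole_site(empty))     # False
```

Running a check like this before uploading a new robots.txt is a cheap way to catch the "Disallow: /" slip.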
Typical usage:
User-agent: *
Disallow: /admin/
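The rule above is matched as a path prefix, which is easy to get slightly wrong. A quick sketch with Python's urllib.robotparser shows what it does and does not cover; the sample paths and the helper name check_paths are invented for the example.

```python
from urllib.robotparser import RobotFileParser

TYPICAL = """\
User-agent: *
Disallow: /admin/
"""

def check_paths(robots_txt, paths, user_agent="*"):
    """Map each path to True (allowed) or False (blocked) for this bot."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {path: parser.can_fetch(user_agent, path) for path in paths}

# Hypothetical paths on the site being checked.
result = check_paths(TYPICAL, ["/", "/admin/", "/admin/users.php", "/administrator.html"])
for path, ok in result.items():
    print(path, "allowed" if ok else "blocked")
```

Note that /administrator.html stays allowed, because it does not start with "/admin/" including the trailing slash.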
The White House
Surely the most interesting robots.txt file has to be that of The White House. It is very comprehensive, excluding spiders from a range of topics including 9/11, the First Lady and their news releases. Some speculate that the purpose is to prevent search engines from caching the content, so that it can be changed later without anyone noticing.
There are much better resources available on the topic. Wikipedia is a good place to start.
manju
5 Mar 2009
Posts: 1
A search engine spider simulator can be a great help when trying to figure out whether hyperlinks lead to the right place. For instance, link exchange websites often put fake links to your site using JavaScript (mouseover events and the like make the link look genuine), but these are not links that search engines will see and follow. Since the spider simulator does not display such links, you will know that something is wrong with the link.
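A very rough idea of what such a simulator does can be sketched with Python's standard html.parser module: collect only the plain href links that a simple crawler would follow, and skip anything that exists only in script. This is a simplified assumption about crawler behaviour, not how any particular simulator or search engine works, and the sample HTML is made up.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collects href values from plain <a> tags, as a simple crawler might."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Skip anchors with no href, page fragments, and script pseudo-URLs.
        if href and not href.strip().lower().startswith(("javascript:", "#")):
            self.links.append(href.strip())

def crawlable_links(html):
    collector = LinkCollector()
    collector.feed(html)
    return collector.links

sample = """
<a href="http://www.example.com/">Real link</a>
<a href="javascript:void(0)" onmouseover="window.status='http://www.example.com/'">Fake link</a>
<span onclick="location.href='http://www.example.com/'">Scripted link</span>
"""
print(crawlable_links(sample))  # ['http://www.example.com/']
```

Only the first link is reported. The other two look like links in a browser but carry no href a crawler could follow.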










