SEO Robots

What Is Robots.txt?

Robots.txt is a plain-text file that gives search engines (the bots that crawl your site) instructions about how to crawl it. It can be used to 1) tell bots which directories they “should”*** not crawl, and 2) tell bots where your sitemap is located.

A Robots.txt file is a good idea and considered an SEO best practice, but it is not required. A professional SEO will be sure to use one, both to cut unnecessary crawler traffic so the site doesn't get bogged down by bots, and to point to the sitemap, especially if the sitemap is not in the standard XML format.
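For illustration, a minimal Robots.txt covering both uses might look like this; the blocked directory and the sitemap URL here are placeholders, not part of any real site:

User-agent: *
Disallow: /cgi-bin/
Sitemap: https://www.example.com/sitemap.xml

For crawlers to find it, the file must be named robots.txt and sit at the root of the domain (e.g. https://www.example.com/robots.txt).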

An example of a rule you will see in a Robots.txt file:

User-agent: Alexibot
Disallow: /

These two lines tell Alexibot not to crawl the site at all: User-agent names the bot the rule applies to, and Disallow: / blocks everything from the root of the site down.
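By contrast, a rule like the following (the directory name here is just an illustration) applies to every bot but blocks only part of the site:

User-agent: *
Disallow: /images/

The * wildcard matches any user agent, and the path after Disallow: is treated as a URL prefix, so this blocks /images/ and everything beneath it.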

Sitemap: http://www.seleads.com/sitemap.xml.gz

This line tells the bots where the compressed sitemap file is located.

***A Disallow rule will not keep a page or directory private. If you want privacy, don't put it on the Internet. The instructions in a Robots.txt file are suggestions only, and some bots simply ignore them. Google does adhere to the rules, but it will still index a page it is not allowed to crawl if that page is linked to from somewhere else. So if you have a page in a disallowed /personal directory on your site, and you or someone else links to that page, Google can still find it and index it.
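To make that concrete, the rule below (the wildcard user agent and trailing slash are assumed for illustration) only asks well-behaved crawlers not to fetch the directory; it does nothing to stop those URLs from being indexed once someone links to them:

User-agent: *
Disallow: /personal/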