What is “robots.txt”?
A robots.txt file is a plain-text file that tells web crawlers and other web robots which parts of a public website they may access. The system depends on the crawler’s cooperation; not every crawler honors the requests. Webmasters use robots.txt for various reasons. For example, a webmaster might use it to keep crawlers away from a page whose content is irrelevant to the website’s search engine presence, so that the page does not produce inaccurate search engine listings. For the most part, however, you don’t want to deny web crawlers access to your site. It’s bad for SEO.
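As a rough illustration of a narrowly scoped rule, the sketch below uses Python’s standard urllib.robotparser module to read a small robots.txt and ask whether a crawler may fetch two pages. The rules, the /internal/ path, the example.com URLs, and the crawler name ExampleBot are all made up for the example; this models how a cooperating crawler might interpret the file, not how any particular search engine does.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: keep crawlers out of /internal/ only.
RULES = """\
User-agent: *
Disallow: /internal/
"""


def is_allowed(rules: str, url: str, agent: str = "ExampleBot") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


# The blocked section is off limits; the rest of the site stays crawlable.
print(is_allowed(RULES, "https://example.com/internal/notes.html"))  # False
print(is_allowed(RULES, "https://example.com/products.html"))        # True
```

Only the one directory is hidden here, so the rest of the site remains visible to search engines.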
What does this mean for SEO?
Search engines use web crawlers to build their search results. These crawlers review site content and determine a website’s relevance to specific keyword searches. In short, web crawlers are partially responsible for search engine results, which they base on their perusal of your HTML. Sometimes a robots.txt file keeps a website out of search engine results altogether. Be wary of “Disallow: /”.
The robots.txt directive “Disallow: /” instructs cooperating web crawlers not to visit any pages on a website. If search engine web crawlers don’t crawl through your website, then they do not collect any data. If they don’t collect any data, then they can’t index your website as related to any search engine keywords. If they don’t index your website, then your website is very unlikely to turn up in search engine results. The presence of “Disallow: /” in your robots.txt file is like placement on a search engine blacklist. Unless someone knows your precise URL, almost no one is going to find your website.
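To make the effect of the blanket rule concrete, here is a minimal sketch, again using Python’s standard urllib.robotparser, that asks which of a few pages a cooperating crawler could still visit. The sample paths and the crawler name ExampleBot are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# The blanket rule discussed above.
BLANKET_BLOCK = ["User-agent: *", "Disallow: /"]

# Hypothetical pages on a site.
SAMPLE_PATHS = ["/", "/about.html", "/blog/2020/post.html"]


def allowed_paths(rules, paths, agent="ExampleBot"):
    """Return the subset of `paths` that `agent` may fetch under `rules`."""
    parser = RobotFileParser()
    parser.parse(rules)
    return [path for path in paths if parser.can_fetch(agent, path)]


print(allowed_paths(BLANKET_BLOCK, SAMPLE_PATHS))  # []
```

Every path on a site begins with “/”, so the single rule matches all of them and nothing is left to crawl.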
How do I check for “Disallow: /”?
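Finding the dreaded “Disallow: /” is easy. A robots.txt file is a separate plain-text file served from the root of your site, not part of a page’s HTML, so “View page source” will not show it. Instead, type your domain followed by /robots.txt into your browser’s address bar (for example, http://www.example.com/robots.txt). Then, use the Find function (hit CTRL-F) to search for “Disallow.” All robots have been excluded from your website if you find: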
User-agent: *
Disallow: /
Luckily, all you need to do is remove that final forward slash (/), leaving “Disallow:” with nothing after it, and all robots will be allowed complete access. Visit http://www.robotstxt.org/robotstxt.html for more information.
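The same check can be scripted. The sketch below is one possible approach, not an official tool: it builds the robots.txt address for a site (the file is served from the site root) and tests whether a generic crawler would be refused even the home page, before and after the slash is removed. The function names and the example.com address are invented for the example, and testing only the home page is a heuristic rather than a full audit of the file.

```python
from urllib.parse import urljoin
from urllib.robotparser import RobotFileParser


def robots_url(site: str) -> str:
    """Return the robots.txt address for any page URL on a site."""
    return urljoin(site, "/robots.txt")


def blocks_all_crawlers(robots_text: str) -> bool:
    """Heuristic: True if a generic crawler may not fetch the home page."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return not parser.can_fetch("*", "/")


blocked = "User-agent: *\nDisallow: /\n"  # the problem case
fixed = "User-agent: *\nDisallow:\n"      # slash removed

print(robots_url("https://example.com/blog/post.html"))  # https://example.com/robots.txt
print(blocks_all_crawlers(blocked))  # True
print(blocks_all_crawlers(fixed))    # False
```

To run this against a live site, you would download the text at the address returned by robots_url and pass it to blocks_all_crawlers.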
Watch his instructional video about robots.txt for more information about the use and abuse of robots.txt.


