Robots.txt file basics
When a web crawler visits a website, it first reads a file named robots.txt, which is located at the root of the domain. In this file you can specify whether and how crawlers may visit the site, which makes it possible to deny specific search engines access to your website. Note that the robots.txt file cannot be used to keep files secret, because anyone can still open them in a browser.
Some search engines even show such "hidden" pages in their search results, just without descriptions.
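This mechanism can be sketched with Python's standard-library urllib.robotparser module. The domain, paths, and rules below are made-up placeholders; the rules are parsed from a string instead of being fetched over the network:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules -- parsed from a string here instead of being
# fetched from http://example.com/robots.txt over the network.
rules = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Public pages may be read ...
print(parser.can_fetch("*", "http://example.com/index.html"))      # True
# ... but crawlers are asked to stay out of /private/.
print(parser.can_fetch("*", "http://example.com/private/a.html"))  # False
```

Note that can_fetch() only reports what the rules say; nothing technically prevents a client from requesting the page anyway, which is exactly why robots.txt cannot keep files secret.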
Structure
The first line of a block names the crawler (User-agent) that the following rules are addressed to. There is no limit on the number of these "blocks". After reading a block that starts with User-agent: *, a crawler stops reading the file, so blocks for specific crawlers have to be placed at the beginning of the file.
You can add single-line comments to a robots.txt file; they simply start with #. Comments can be helpful to describe your settings in the robots.txt. They are ignored by crawlers.
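The block structure above can be sketched with Python's urllib.robotparser, which applies a crawler's own block when one exists and falls back to the * block otherwise (the crawler names and paths below are made-up examples):

```python
from urllib.robotparser import RobotFileParser

# Two blocks: one for Googlebot, one fallback for everybody else.
# Comment lines are ignored, just like in a real robots.txt.
rules = """\
# Block for Google's crawler only
User-agent: Googlebot
Disallow: /not-google/

# Block for all other crawlers
User-agent: *
Disallow: /images/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

print(parser.can_fetch("Googlebot", "/not-google/a.html"))  # False (its own block)
print(parser.can_fetch("Googlebot", "/images/a.png"))       # True  (the * block does not apply)
print(parser.can_fetch("SomeBot", "/images/a.png"))         # False (falls back to *)
```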
Possible Statements
Statement | Example | Description |
---|---|---|
User-agent: | User-agent: * <br> User-agent: Googlebot | Specifies the web crawler the following rules apply to. * selects all crawlers. |
Disallow: | Disallow: / <br> Disallow: /images/ <br> Disallow: /test.html | Forbids reading of the given files or directories. |
Allow: | Allow: / <br> Allow: /free/ <br> Allow: /public.html | Allows reading of the given files or directories. |
Crawl-delay: | Crawl-delay: 100 | Sets the readout speed. In this example, the crawler may open a new page only every 100 seconds. |
Sitemap: | Sitemap: http://for-example.com/sitemap-url.xml | Tells crawlers where the sitemap can be found. This only works for the following crawlers: Googlebot, Yahoo! Slurp, msnbot, Ask.com. |
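How the Crawl-delay and Sitemap statements are exposed can also be sketched with urllib.robotparser; crawl_delay() and site_maps() are standard-library methods (site_maps() requires Python 3.8+), and the sitemap URL is the made-up one from the table:

```python
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: *
Crawl-delay: 100
Sitemap: http://for-example.com/sitemap-url.xml
"""

parser = RobotFileParser()
parser.modified()  # mark the rules as loaded so crawl_delay() gives an answer
parser.parse(rules.splitlines())

print(parser.crawl_delay("*"))  # 100 -> wait 100 seconds between requests
print(parser.site_maps())       # ['http://for-example.com/sitemap-url.xml']
```

A polite crawler would sleep for the returned number of seconds between requests; crawlers that do not support these statements simply ignore them.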
Example
```
User-agent: Googlebot
Allow: /public/
Disallow: /not-google.html

User-agent: *
Disallow: /images/
Disallow: /privates/
Disallow: /intern_file.html
# I am a single-line comment
```
More Information
You'll find more information at http://www.robotstxt.org.