The robots.txt file is a very simple text file placed in your site's root directory, for example www.yourdomain.com/robots.txt. It tells search engines and other robots which areas of your site they are allowed to visit and index.
You can ONLY have one robots.txt on your site and ONLY in the root directory (where your home page is):
BAD - Won't work: www.yourdomain.com/subdirectory/robots.txt
All major search engine spiders respect the robots.txt file, but naturally most spambots (email collectors for spammers) do not. If you truly want security on your site, you will have to put the files in a protected directory rather than trusting the robots.txt file to do the job. It's guidance for robots, not security from prying eyes.
At its simplest, a robots.txt file looks like this:
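A minimal sketch of such a file, matching the description that follows (every robot addressed, nothing disallowed):

```text
# Applies to every robot (user agent)
User-agent: *
# Empty value: nothing is off limits
Disallow:
```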
This one tells all robots (user agents) to go anywhere they want (disallow nothing).
This one, on the other hand, keeps out all compliant robots:
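A minimal sketch of such a file, where a single slash disallows the entire site:

```text
# Applies to every robot (user agent)
User-agent: *
# "/" matches every path, so the whole site is off limits
Disallow: /
```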
As you can see, the only difference between them is a single slash ("/"). But if you accidentally use that slash when you didn't mean to, you could find that your search engine rankings disappear. Be very careful.
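One important thing to know if you are creating your own robots.txt file: although the wildcard (*) is used in the User-agent line, the original robots.txt standard does not allow it in the Disallow line. Some crawlers, Google's among them, accept wildcards there as an extension, but you cannot count on every robot understanding them. For example, under the original standard you can't have something like: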
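The pattern below is a hypothetical illustration of a wildcard in a Disallow line, not taken from any real site. The second group shows the portable alternative, which relies on plain path prefixes (the /images/ directory is likewise assumed for the example):

```text
# Hypothetical: a wildcard in the Disallow path, which the original
# standard does not define and which some robots will not understand
User-agent: *
Disallow: /*.gif

# Portable alternative: disallow by path prefix instead
User-agent: *
Disallow: /images/
```

The two groups are alternatives shown side by side for comparison, so a real file would contain only one of them.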
Here is the official information on the subject: RobotsTxt.org
UPDATE: If you use Google Sitemaps (and you should), it now includes a robots.txt validator, which will make certain that Google understands your robots.txt file properly.