Discussion in 'Search Engine Optimization (SEO)' started by georgie, Aug 18, 2009.
What's the importance of robots.txt?
How does it work?
The robots.txt file is used to tell search engines how to handle your site when their bots/spiders crawl your website. You can specify pages or directories on your site that should be ignored by the bots and not indexed. This comes in handy when you have subfolders for pictures or scripts that you don't want showing up on search results.
You can also specify the path to your sitemap file (if you have one) in the robots.txt file. This will help the bot find all the pages in your site.
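For example, the sitemap line is just the word Sitemap followed by the full URL of your sitemap file (example.com here is obviously a placeholder for your own domain):

```
Sitemap: https://www.example.com/sitemap.xml
```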
I'm pretty new at this too, so if my info is incorrect, hopefully one of the more experienced members *cough*Newbie Shield*cough* will come along and make corrections.
Good luck to you!
Right on Dr_Boo!
Here's the content of a sample robots.txt file:

User-agent: *
Disallow: /images/
Disallow: /cgi-bin/
Disallow: /privacy.html

"User-agent: *" means the rules apply to all search engine bots. The Disallow lines mean the bots should not index the sub-directories /images/ and /cgi-bin/ or the file privacy.html.
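If you want to sanity-check rules like these, Python's standard-library urllib.robotparser applies them the same way a well-behaved crawler would. A quick sketch, using the rules described above (all bots blocked from /images/, /cgi-bin/, and privacy.html):

```python
# Check how a compliant crawler interprets a robots.txt file,
# using Python's standard-library parser.
import urllib.robotparser

robots_txt = """\
User-agent: *
Disallow: /images/
Disallow: /cgi-bin/
Disallow: /privacy.html
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(robots_txt.splitlines())

# Paths matching a Disallow rule are off-limits to all bots ("*")
print(rp.can_fetch("*", "/privacy.html"))    # False
print(rp.can_fetch("*", "/images/logo.png")) # False
# Anything not listed is fair game
print(rp.can_fetch("*", "/index.html"))      # True
```

Note that this only tells you what polite bots will do; robots.txt is a convention, not access control.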
Thanks for the information. I'm working with DotNetNuke. I know nothing about robots. Could you please direct me to a good location to learn other than google?
Here's where you can learn more about robots.txt:
Use robots.txt to hide landing pages that you don't want competitors to see...
Robots.txt is just a regular text file saved on a website. On request, compliant robots will ignore the files or directories it specifies when they crawl. It's not an algorithm, just a set of rules; and if the file is not there on your site, Google will simply crawl everything by default.
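As a reference point, these two one-rule files sit at opposite extremes: an empty Disallow allows everything (the same effect as having no robots.txt at all), while "Disallow: /" blocks the entire site:

```
# File 1: allow all bots to crawl everything
User-agent: *
Disallow:
```

```
# File 2: block all bots from the entire site
User-agent: *
Disallow: /
```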
I use robots.txt to hide the plugins folder on some of my sites so
that google doesn't know that they are auto-blogs. lol
I mean the sites look great so there is no reason to suspect otherwise
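On a WordPress site that would just be a one-rule file (the /wp-content/plugins/ path is the standard WordPress layout; adjust it for whatever platform you're on):

```
User-agent: *
Disallow: /wp-content/plugins/
```

Keep in mind robots.txt itself is public, so anyone curious can still read it.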
Robots.txt is just a normal text file saved on a website. When asked, compliant robots will ignore the files or directories it defines when they crawl.
refer to http://www.robotstxt.org/robotstxt.html
Ron S: refer to http://www.robotstxt.org/robotstxt.html
Unfortunately the link doesn't work "Internal Server Error"...
The robots.txt file is a simple text file (no html) that is placed in your website’s root directory in order to tell the search engines which pages to index and which to skip.
"Robots.txt" is a regular text file that, through its name, has special meaning to the majority of "honorable" robots on the web. By defining a few rules in this text file, you can instruct robots not to crawl and index certain files or directories within your site, or your site at all. For example, you may not want Google to crawl the /images directory of your site, as it's both meaningless to you and a waste of your site's bandwidth. "Robots.txt" lets you tell Google just that.
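Spelled out for that /images example, you can even target Google's crawler specifically by its user-agent name (Googlebot is Google's documented crawler name; other bots would still be allowed in):

```
User-agent: Googlebot
Disallow: /images/
```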
Robots.txt is the standard name of a text file that is uploaded to a Web site's root directory. The robots.txt file is used to provide directions about the Web site to Web robots and spiders.