Google's official "Search Engine Optimization Guide": the web spider section

Make effective use of robots.txt

A robots.txt file tells search engines which parts of your website they may access and crawl. The file must not only be named robots.txt, it must also be placed in the root directory of your website.

An example of a website's robots.txt file.

All search engine crawlers that follow this convention (addressed by the wildcard * in the user-agent directive) are told not to enter or crawl anything under /images/, or any URL whose path begins with /search.
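The rules described above might look like this in a minimal robots.txt file (an illustrative sketch; the /images/ and /search paths are taken from the paragraph above):

```
User-agent: *
Disallow: /images/
Disallow: /search
```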

You may not want some pages of your website to be crawled, because they would not be very useful to users if they appeared in search results. If you want to prevent search engines from crawling such pages, the robots.txt generator in Google Webmaster Tools is a convenient way to create the file. Note that if you want to keep some content on a subdomain from being crawled, you need to create a separate robots.txt file under that subdomain's root directory. You can also find more information about robots.txt in our Webmaster Help Center.

There are other ways to keep your content out of search results, such as adding NOINDEX to the robots meta tag, using .htaccess to password-protect certain directories, or using Google Webmaster Tools to remove content that has already been indexed. Google engineer Matt Cutts walks through the caveats of these URL-blocking methods in a helpful video.
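For instance, the NOINDEX mechanism mentioned above is a robots meta tag placed in a page's HTML head; compliant search engines will then leave that page out of their index (an illustrative fragment):

```html
<meta name="robots" content="noindex">
```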

Good practices for robots.txt

Use more secure methods for sensitive or confidential content, rather than blocking it with robots.txt. One reason is that if links to the URLs you block appear somewhere on the Internet (in referrer logs, for example), search engines may still reference those URLs (showing just the URL, without a title or content snippet). Also, rogue search engines whose crawlers do not honor the Robots Exclusion Standard can simply disobey the instructions in your robots.txt. Finally, a curious user can inspect your robots.txt to see which directories or subdirectories you have blocked, and guess at the content you do not want seen. Encrypting the content, or password-protecting it with .htaccess, is the safer choice. In particular, you should avoid:

1. Allowing pages that look like search results to be crawled. (Users dislike leaving one search results page only to land on another page of results that adds little value for them.)

2. Allowing large numbers of automatically generated pages with mostly identical or barely different content to be crawled. Is there much point in having 100,000 near-duplicate pages indexed by a search engine?

3. Allowing URLs created by proxy services to be crawled. (Source: a blog translation of a Google Webmaster Central post.)
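The .htaccess password protection suggested above can be sketched roughly as follows. The realm name and file path here are placeholder assumptions, not from the guide, and the referenced password file would be created separately with the htpasswd utility:

```
# Placeholder sketch: HTTP Basic authentication for a directory via .htaccess
AuthType Basic
AuthName "Restricted area"            # realm label shown to the visitor (placeholder)
AuthUserFile /home/example/.htpasswd  # placeholder path to the password file
Require valid-user
```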

Additional background: robots.txt

robots.txt (uniformly lowercase) is an ASCII text file stored in a website's root directory. It tells search engine crawlers (also called web spiders or robots) which content on the site must not be fetched by crawlers and which content may be fetched. Because URLs are case-sensitive on some systems, the filename should be uniformly lowercase: robots.txt. The file must be placed in the site's root directory. If you want to define crawler behavior for a subdirectory separately, you can merge those settings into the robots.txt under the root directory, or use robots metadata (the robots meta tag) instead.

The robots.txt protocol is not an official standard, only a de facto convention, so it cannot guarantee the privacy of a website. Note that robots.txt decides whether a URL is blocked by string prefix matching, so a directory path with and without a trailing slash / are two different URLs, and wildcard patterns such as Disallow: *.gif cannot be used.
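The prefix-matching behavior described above can be observed with Python's standard-library robots.txt parser. This is a small illustrative sketch, not part of the original guide; the example.com URLs are placeholders:

```python
# Sketch: how robots.txt rules match URLs by simple string prefix,
# using Python's standard urllib.robotparser.
from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /images/",  # note the trailing slash
    "Disallow: /search",
])
rp.modified()  # record that rules are loaded; until then can_fetch() allows nothing

# A URL is blocked when its path begins with a disallowed prefix.
print(rp.can_fetch("*", "http://example.com/search/cats"))   # False
print(rp.can_fetch("*", "http://example.com/images/a.gif"))  # False

# "/images" (no trailing slash) is a different URL string, so the
# rule "Disallow: /images/" does not match it.
print(rp.can_fetch("*", "http://example.com/images"))        # True
```

This also illustrates why the trailing slash matters: blocking /images/ does not, by itself, block the slash-less URL /images.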

The robots meta tag, likewise, is not an official standard, only a convention, but search engines will normally recognize this metadata and refrain from indexing the page, as well as from following the page's links to other pages.
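Combining the two effects just described, a page can ask compliant search engines neither to index it nor to follow its outgoing links (an illustrative fragment):

```html
<meta name="robots" content="noindex,nofollow">
```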