Beginners who sign up for SEO promotion courses run into a lot of terms, some clear and some not. It is not easy to sort them all out, especially if a point was poorly explained or skipped at the start. Let us look at what Disallow means in the robots.txt file, why this file is needed, and how to create it and work with it.
In simple terms
Rather than feed the reader the complex explanations usually found on specialized sites, it is better to explain everything in plain language. A search engine visits your site and indexes its pages. Afterwards you look at reports that point out problems, errors, and so on.
, . , « » «». , , . , . robots.txt Disallow.
Robots.txt . , . . , «/robots.txt». , . – Sitemaps.
, , Sitemaps. XML. . -, .
robots.txt . 1994 . W3C. . «» . , .
, , , . Robots.txt , .
robots.txt Disallow , , . , .
, «». . «» «». .
Disallow robots.txt , . , . , User agent, . Allow - Disallow. . .
, robots.txt User agent Disallow , . . User agent. , , , .
300 . , , . «User-agent: *». «» , .
Google, . Googlebot. , robots.txt: Disallow, Allow . . , , .
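How a group addressed to one robot interacts with the catch-all «User-agent: *» group can be checked with a small Python sketch. It relies on the standard-library urllib.robotparser module; the /private/ path and the OtherBot name are invented for illustration, and real search engines may treat edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: one group for Googlebot, one for every other robot.
RULES = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow: /
"""


def can_fetch(agent: str, path: str, rules: str = RULES) -> bool:
    """Return True if `agent` may fetch `path` under the given rules."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    checks = [
        ("Googlebot", "/index.html"),
        ("Googlebot", "/private/a.html"),
        ("OtherBot", "/index.html"),
    ]
    for agent, path in checks:
        print(agent, path, can_fetch(agent, path))
```

In this sketch Googlebot follows only its own group, so it is kept out of /private/ but may read the rest of the site, while any other robot falls back to the «User-agent: *» group.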
. , , Google Yandex, .
The Disallow directive lists what robots must not index. An empty value, «Disallow:», blocks nothing.
Writing «Disallow: /» in robots.txt closes the entire site to indexing.
To close a single folder, write its path, for example «Disallow: /papka/».
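The three Disallow forms quoted above («Disallow:», «Disallow: /» and «Disallow: /papka/») can be compared with a short Python sketch. It assumes the standard-library urllib.robotparser as the interpreter of the rules; the page names are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# The three variants, each inside a group for all robots.
NOTHING_BLOCKED = "User-agent: *\nDisallow:\n"
WHOLE_SITE_BLOCKED = "User-agent: *\nDisallow: /\n"
ONE_FOLDER_BLOCKED = "User-agent: *\nDisallow: /papka/\n"


def is_allowed(rules: str, path: str, agent: str = "*") -> bool:
    """Parse robots.txt text and report whether `agent` may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for name, rules in [
        ("Disallow:", NOTHING_BLOCKED),
        ("Disallow: /", WHOLE_SITE_BLOCKED),
        ("Disallow: /papka/", ONE_FOLDER_BLOCKED),
    ]:
        for path in ("/index.html", "/papka/page.html"):
            print(f"{name!r:22} {path:18} allowed={is_allowed(rules, path)}")
```

An empty value leaves every page open, a single slash closes everything, and a folder path closes only the pages under that folder.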
, , Allow. , . , -, . . , , Allow .
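A minimal sketch of Allow opening one page inside a closed folder, again with Python's urllib.robotparser. The file names are hypothetical. Note one assumption: this parser applies the first rule that matches, while some search engines pick the most specific rule instead, so the Allow line is placed before the Disallow line to get the same answer under both readings.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: the folder is closed, one page inside it stays open.
RULES = """\
User-agent: *
Allow: /papka/public.html
Disallow: /papka/
"""


def is_allowed(path: str, rules: str = RULES, agent: str = "*") -> bool:
    """Return True if `agent` may fetch `path` under the given rules."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for path in ("/papka/public.html", "/papka/secret.html", "/index.html"):
        print(path, is_allowed(path))
```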
Host. -. , . , "" , , .
The «http://» prefix is left out when the site runs over HTTP; for HTTPS the protocol is written out. For example, «Host: site.com» for HTTP and «Host: https://site.com» for HTTPS.
The Sitemap directive tells robots where the sitemap is located. In robots.txt it is written as a full URL, for example «Sitemap: http://site.com/sitemap.xml».
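To check that a Sitemap line is actually picked up, the file can be parsed with Python's urllib.robotparser. This is a sketch that assumes Python 3.8 or newer, where the site_maps() method is available; the rules text reuses the example URL quoted above.

```python
from urllib.robotparser import RobotFileParser

RULES = """\
User-agent: *
Disallow:
Sitemap: http://site.com/sitemap.xml
"""


def sitemap_urls(rules: str) -> list:
    """Return the sitemap URLs declared in robots.txt text (empty list if none)."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


if __name__ == "__main__":
    print(sitemap_urls(RULES))
```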
, . , . , Crawl-delay , . , «» . .
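A small sketch of reading a Crawl-delay value with Python's urllib.robotparser. The delay of 2 seconds is a made-up value, and the sketch assumes Python 3.8 or newer. Not every search engine honors this directive, so treat it as a request rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule: ask every robot to wait 2 seconds between requests.
RULES = "User-agent: *\nCrawl-delay: 2\n"


def crawl_delay(rules: str, agent: str = "*"):
    """Return the Crawl-delay in seconds for `agent`, or None if not set."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.crawl_delay(agent)


if __name__ == "__main__":
    print(crawl_delay(RULES))
```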
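The Clean-param directive names URL parameters that do not change the content of a page, so that the robot does not index the same page several times. For example: «Clean-param: ref /catalog/get_product.com».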
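The effect of «Clean-param: ref /catalog/get_product.com» can be illustrated with a rough Python sketch: URLs under the given path that differ only in the ref parameter collapse to one address. This is an illustration of the idea under simplifying assumptions, not the algorithm any search engine actually runs; the function name and the sample URLs are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_clean_params(url: str, params: set, path_prefix: str) -> str:
    """Drop the listed query parameters when the URL path starts with path_prefix."""
    parts = urlsplit(url)
    if not parts.path.startswith(path_prefix):
        return url  # the rule does not apply to this path
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in params
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


if __name__ == "__main__":
    prefix = "/catalog/get_product.com"
    for url in (
        "http://site.com/catalog/get_product.com?ref=site_1&id=5",
        "http://site.com/catalog/get_product.com?ref=site_2&id=5",
    ):
        print(strip_clean_params(url, {"ref"}, prefix))
```

Both sample URLs reduce to the same address, which is the duplication the directive is meant to remove.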
, robots.txt, – . , . . . .
, , . , CMS. , , Google "".
, Disallow robots.txt, , . , .
. , . , . , User-agent «/», Disallow .
The file must be named exactly «robots.txt», in lowercase, not «Robots.txt» or «ROBOTS.txt».
User-agent . . , , HTTP, . HTTPS. Disallow . , .
, , robots.txt – , . , . -, . . - - - .