How to Set Up Robots.txt (HSR)

What is a robots.txt file?

Robots.txt is a text file that webmasters create to instruct web robots (typically search engine crawlers) how to crawl pages on their websites. The robots.txt file is part of the robots exclusion protocol (REP), a group of web standards that regulate how robots crawl the web, access and index content, and serve that content to users. The REP also includes directives such as meta robots, as well as page-, subdirectory-, or site-wide instructions for how search engines should treat links (such as "follow" or "nofollow"). In practice, a robots.txt file indicates whether certain user agents (web-crawling software) may crawl parts of a website. These crawl instructions are specified by "disallowing" or "allowing" the behavior of certain (or all) user agents. In the listings that follow, an empty Disallow: value places no restriction on the named user agent, while Disallow: / blocks the entire site for it.
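To see how a crawler might read such allow and disallow rules, here is a minimal Python sketch using the standard library's urllib.robotparser. The two-group rule set is a shortened version of the listings in this post, and the helper name can_crawl, the bot name SomeOtherBot, and the example.com URL are made up for illustration. Python's parser is only one interpretation of the protocol, so real search engines may resolve some edge cases differently.

```python
from urllib.robotparser import RobotFileParser

# Shortened rule set in the style of the listings in this post:
# Googlebot is unrestricted, every other robot is blocked site-wide.
RULES = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
"""


def can_crawl(user_agent, url, rules=RULES):
    """Return True if `user_agent` may fetch `url` under `rules` (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())  # parse from text, no network request
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://www.example.com/page.html"  # placeholder URL
    print(can_crawl("Googlebot", page))     # True: empty Disallow allows everything
    print(can_crawl("SomeOtherBot", page))  # False: falls under "User-agent: *"
```

A robot that has no group of its own falls back to the User-agent: * group, which is why the unnamed bot is refused here.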

User-agent: Googlebot
Disallow:
User-agent: msnbot
Disallow:
User-agent: Yahoo-slurp
Disallow:
User-agent: Slurp
Disallow:
User-agent: Mediapartners-Google
Disallow:
User-agent: AdsBot-Google
Disallow:
User-agent: Googlebot-Mobile
Disallow:
User-agent: Googlebot-Image
Disallow:
User-agent: Yahoo-MMCrawler
Disallow:

User-agent: *
Disallow: /
Disallow: /2014_06_01_archive.html?m=1
Disallow: /2014_07_01_archive.html?m=1
Disallow: /2015_02_01_archive.html?m=1
Disallow: /2014_05_01_archive.html?m=1
Disallow: /2014_11_01_archive.html?m=1
Disallow: /2014_08_01_archive.html?m=1
Disallow: /2015_03_01_archive.html?m=1
Disallow: /2014_04_01_archive.html?m=1
Disallow: /2014/09/mengenal-leica-hds-untuk-forensik-dan-investigasi.html
Disallow: /2014/10/bagaimanakah-proses-fabrikasi.html
Disallow: /2014/11/how-the-work-flow-3d-laser-scanning-to-be-applied-at-pertamina.html

Sitemap: http://www.gatewan.com/feeds/posts/default?orderby=UPDATED

The Gatewan blog's robots.txt at that time:

User-agent: Googlebot
Disallow:
Disallow: /2014_06_01_archive.html?m=1
Disallow: /2014_07_01_archive.html?m=1
Disallow: /2015_02_01_archive.html?m=1
Disallow: /2014_05_01_archive.html?m=1
Disallow: /2014_11_01_archive.html?m=1
Disallow: /2014_08_01_archive.html?m=1
Disallow: /2015_03_01_archive.html?m=1
Disallow: /2014_04_01_archive.html?m=1
Disallow: /2014/09/mengenal-leica-hds-untuk-forensik-dan-investigasi.html
Disallow: /2014/10/bagaimanakah-proses-fabrikasi.html
Disallow: /2014/11/how-the-work-flow-3d-laser-scanning-to-be-applied-at-pertamina.html

User-agent: msnbot
Disallow:
User-agent: Bingbot
Disallow:
User-agent: Yahoo-slurp
Disallow:
User-agent: Slurp
Disallow:
User-agent: Mediapartners-Google
Disallow:
User-agent: Googlebot-Mobile
Disallow:
User-agent: AdsBot-Google
Disallow:
User-agent: Yahoo-MMCrawler
Disallow:

User-agent: *
Disallow: /

Sitemap: http://www.gatewan.com/feeds/posts/default?orderby=UPDATED

The Santaiarea blog's robots.txt at that time:

User-agent: Googlebot
Disallow:
Disallow: /s72-c/
Disallow: /delete-comment.g?blogID&m=1
Disallow: /delete-comment.g?blogID=
Disallow: /s
Disallow: /p/about-me.html?m=1
Disallow: /s?m=1
Disallow: /p/memuat.html?m=1
Disallow: /2015_02_01_archive.html
Disallow: /2014/06?m=1

User-agent: msnbot
Disallow:
User-agent: Bingbot
Disallow:
User-agent: Yahoo-slurp
Disallow:
User-agent: Slurp
Disallow:
User-agent: Mediapartners-Google
Disallow:
User-agent: Googlebot-Mobile
Disallow:
User-agent: AdsBot-Google
Disallow:
User-agent: Yahoo-MMCrawler
Disallow:

User-agent: *
Disallow: /

Sitemap: http://www.santaiarea.com/feeds/posts/default?orderby=UPDATED

The Totaltren blog's robots.txt at that time:

User-agent: Googlebot
Disallow:
Disallow: /s
Disallow: /s72-c/

User-agent: msnbot
Disallow:
User-agent: Bingbot
Disallow:
User-agent: Yahoo-slurp
Disallow:
User-agent: Slurp
Disallow:
User-agent: Mediapartners-Google
Disallow:
User-agent: Googlebot-Mobile
Disallow:
User-agent: AdsBot-Google
Disallow:
User-agent: Yahoo-MMCrawler
Disallow:

User-agent: *
Disallow: /

Sitemap: http://www.totaltren.com/feeds/posts/default?orderby=UPDATED
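
Disallow values are matched against the start of the URL path, so a short value such as /s also covers longer paths such as /s72-c/. The Python sketch below illustrates only that prefix behavior; the function name is_blocked and the sample paths are made up for illustration, and the sketch deliberately ignores Allow lines and the * and $ wildcards that some crawlers support.

```python
def is_blocked(path, disallow_values):
    """Return True if any non-empty Disallow value is a prefix of `path`.

    Simplified illustration: empty values restrict nothing, and
    Allow lines and wildcards are not handled.
    """
    return any(value and path.startswith(value) for value in disallow_values)


# Disallow values from the Googlebot group in the last listing.
googlebot_values = ["", "/s", "/s72-c/"]

if __name__ == "__main__":
    # "/s" alone already covers the thumbnail path.
    print(is_blocked("/s72-c/photo.jpg", ["/s"]))              # True
    print(is_blocked("/search?q=robots", googlebot_values))    # True
    print(is_blocked("/2015/03/post.html", googlebot_values))  # False
```

Under this reading, the separate /s72-c/ line adds nothing once /s is present, and /s also blocks every other path that begins with the letter s, such as search pages.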
