Question 1

Why must each path start with a slash?

Accepted Answer

Robots.txt rules are matched against the path portion of a URL, and that path always begins with a slash. A rule without a leading slash is invalid, and crawlers may skip it. The validator catches the mistake before you publish.
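The kind of check described here can be sketched in a few lines of Python. This is only an illustration, not the generator's actual validator; the function name find_bad_paths and the sample text are made up for the example.

```python
def find_bad_paths(robots_txt):
    """Return (line_number, value) for Allow/Disallow values lacking a leading slash."""
    problems = []
    for number, raw in enumerate(robots_txt.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        field, value = field.strip().lower(), value.strip()
        if field not in ("allow", "disallow"):
            continue
        # An empty value is legal ("Disallow:" blocks nothing), so only flag
        # non-empty values that do not start with "/".
        if value and not value.startswith("/"):
            problems.append((number, value))
    return problems


# Hypothetical input: line 2 is missing its leading slash.
sample = "User-agent: *\nDisallow: admin/\nDisallow: /tmp/\nAllow:\n"
bad = find_bad_paths(sample)
```

Here bad holds one entry, for line 2 and the value admin/. This sketch is deliberately strict and would also flag a pattern that starts with a wildcard.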

Question 2

Should I use Allow rules?

Accepted Answer

Use Allow when you Disallow a parent directory but want a specific sub-path to stay crawlable, e.g. Disallow: /admin/ plus Allow: /admin/public/. Without a Disallow that covers the same path, an Allow line is redundant, because everything is crawlable by default.
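RFC 9309 resolves a conflict between Allow and Disallow by taking the most specific (longest) matching path, with Allow winning a tie. A minimal Python sketch of that rule, assuming plain prefix matching with no wildcards; is_allowed and the rules list are illustrative names, not part of any library.

```python
def is_allowed(rules, url_path):
    """rules is a list of ("allow" | "disallow", path_prefix) pairs."""
    best_length = -1
    allowed = True  # no matching rule means the path is crawlable by default
    for kind, prefix in rules:
        if not prefix or not url_path.startswith(prefix):
            continue
        longer = len(prefix) > best_length
        tie_for_allow = len(prefix) == best_length and kind == "allow"
        if longer or tie_for_allow:
            best_length = len(prefix)
            allowed = kind == "allow"
    return allowed


rules = [("disallow", "/admin/"), ("allow", "/admin/public/")]
public_ok = is_allowed(rules, "/admin/public/help.html")  # longest match is the Allow rule
secret_ok = is_allowed(rules, "/admin/settings")          # only the Disallow rule matches
```

Not every parser works this way. Some older ones apply the first matching rule in file order, so the result for a given file can differ between crawlers.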

Question 3

Do all crawlers respect Crawl-delay?

Accepted Answer

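No. Google ignores Crawl-delay, while Bing respects it. Yandex used to honor it but now expects you to set the crawl rate in Yandex Webmaster instead. Search Console's crawl-rate setting for Googlebot was retired in early 2024; to slow Googlebot temporarily, Google's documentation recommends returning 500, 503, or 429 responses.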
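If you write your own crawler, Python's standard urllib.robotparser can read the directive for you. The sketch below shows only how that one parser behaves, not how any search engine does; the helper crawl_delay_for and the sample file are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser


def crawl_delay_for(robots_txt, user_agent):
    """Return the Crawl-delay (in seconds) that applies to user_agent, or None."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


# Hypothetical file: only the bingbot group sets a delay.
SAMPLE = """\
User-agent: bingbot
Crawl-delay: 10

User-agent: *
Disallow: /private/
"""

bing_delay = crawl_delay_for(SAMPLE, "bingbot")      # 10
other_delay = crawl_delay_for(SAMPLE, "ExampleBot")  # None, the "*" group sets no delay
```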

Question 4

What does the Host directive do?

Accepted Answer

Host is non-standard. Yandex historically used it to identify the preferred hostname when a site is reachable on multiple domains, but has since deprecated it in favor of 301 redirects. Most crawlers ignore it.

Question 5

Can I block a single crawler entirely?

Accepted Answer

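Yes. Add a group with User-agent set to that bot's name (e.g. AhrefsBot) and Disallow set to "/". Group order does not matter to crawlers that follow RFC 9309: each one uses the group that most specifically matches its name and falls back to the wildcard "*" group only when no named group matches.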
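One way to sanity-check such a group before publishing is Python's standard urllib.robotparser. This sketch reflects how that parser reads the file and does not guarantee how a particular bot behaves; can_crawl, ExampleBot, and the sample rules are illustrative assumptions.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical file: AhrefsBot is blocked everywhere, everyone else only from /private/.
ROBOTS_TXT = """\
User-agent: AhrefsBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def can_crawl(robots_txt, user_agent, url):
    """Report whether user_agent may fetch url according to robots_txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


blocked = can_crawl(ROBOTS_TXT, "AhrefsBot", "https://yourdomain.com/page")    # False
others = can_crawl(ROBOTS_TXT, "ExampleBot", "https://yourdomain.com/page")    # True
```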

Question 6

Where do I put the generated file?

Accepted Answer

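At the root of your site, reachable at https://yourdomain.com/robots.txt. Crawlers only look there, so a copy in a subdirectory is ignored. Serve it as text/plain with HTTP 200; a 4xx response is generally treated as if no robots.txt exists, so none of your rules apply.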
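A small Python sketch of both requirements: deriving the robots.txt location from any page URL, and checking a response's status and content type. The function names are made up for illustration, and the check covers only the two conditions mentioned in the answer.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url):
    """Build the robots.txt URL for the scheme and host of page_url."""
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def serving_problems(status, content_type):
    """List reasons a robots.txt response may not be read as intended."""
    problems = []
    if status != 200:
        problems.append(f"expected HTTP 200, got {status}")
    # Ignore parameters such as "; charset=utf-8" when comparing the media type.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type != "text/plain":
        problems.append(f"expected text/plain, got {media_type or 'nothing'}")
    return problems


location = robots_url("https://yourdomain.com/blog/post?page=2")
issues = serving_problems(200, "text/plain; charset=utf-8")  # empty list
```

Here location is https://yourdomain.com/robots.txt. You would feed serving_problems the status code and Content-Type header from whatever HTTP client you already use.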

Robots.txt Generator
