Robots.txt Generator

Generate a valid robots.txt with per-user-agent Allow, Disallow, Crawl-delay, and Sitemap.

Written by Golam Rabbani, Founder & Lead Engineer

How to use this robots.txt generator

  1. Set User-agent to * (all crawlers) or to a named bot like Googlebot.
  2. List Disallow paths one per line (each must start with /).
  3. Optionally list Allow paths and a Crawl-delay in seconds.
  4. Add additional user-agent groups for bot-specific rules.
  5. Add Sitemap URLs and (optionally) a Host hostname.
  6. Press Generate and copy the resulting robots.txt to the root of your site.

About this robots.txt generator

The robots.txt generator builds a valid Robots Exclusion Protocol file from form input. Each user-agent group emits the User-agent line followed by Allow and Disallow rules; if both fields are empty the group falls back to the canonical empty Disallow (which means "everything allowed"). Sitemap URLs and an optional Host directive are appended at the end. Paths are validated to ensure they start with /, sitemap URLs are validated as absolute http/https URLs, and Crawl-delay must be a non-negative number.
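The behaviour described above can be approximated in a few lines of Python. This is a minimal sketch, not the tool's actual source: the function name `build_robots_txt`, the dictionary keys, and the choice to emit Allow before Disallow and Crawl-delay last in each group are all assumptions made for illustration.

```python
from urllib.parse import urlparse


def build_robots_txt(groups, sitemaps=(), host=None):
    """Build robots.txt text from a list of group dicts (hypothetical input shape)."""
    blocks = []
    for group in groups:
        lines = [f"User-agent: {group['user_agent']}"]
        allow = list(group.get("allow", []))
        disallow = list(group.get("disallow", []))

        # Every path must start with "/".
        for path in allow + disallow:
            if not path.startswith("/"):
                raise ValueError(f"path must start with '/': {path!r}")

        lines += [f"Allow: {p}" for p in allow]
        lines += [f"Disallow: {p}" for p in disallow]
        if not allow and not disallow:
            # Canonical empty Disallow: everything is allowed.
            lines.append("Disallow:")

        delay = group.get("crawl_delay")
        if delay is not None:
            if delay < 0:
                raise ValueError("Crawl-delay must be non-negative")
            lines.append(f"Crawl-delay: {delay}")

        blocks.append("\n".join(lines))

    # Sitemap and Host lines are appended after all groups.
    tail = []
    for url in sitemaps:
        parts = urlparse(url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            raise ValueError(f"sitemap must be an absolute http(s) URL: {url!r}")
        tail.append(f"Sitemap: {url}")
    if host:
        tail.append(f"Host: {host}")
    if tail:
        blocks.append("\n".join(tail))

    # Groups are separated by one blank line.
    return "\n\n".join(blocks) + "\n"
```

Calling `build_robots_txt([{"user_agent": "*"}])` exercises the fallback: the group has no rules, so the sketch emits a bare `Disallow:` line.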

Worked example: Group 1 with User-agent "*", Disallow "/admin/" and "/private/", plus a sitemap URL "https://example.com/sitemap.xml". Press Generate and you get:

    User-agent: *
    Disallow: /admin/
    Disallow: /private/

    Sitemap: https://example.com/sitemap.xml

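Add a second group for "Googlebot" with Allow "/admin/public/" and Disallow "/admin/" and the tool emits both groups separated by a blank line, the conventional way to separate groups. Note that a crawler with its own named group follows only that group, so Googlebot is then no longer bound by the "/private/" rule in the "*" group. Generation is entirely client-side.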
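One way to sanity-check a two-group file like this before publishing is to feed it to Python's standard-library `urllib.robotparser`. The sketch below assumes the generated output looks like the `ROBOTS_TXT` string shown; `ExampleBot` and the helper `allowed` are made up for the example. Keep in mind that `urllib.robotparser` applies rules in file order rather than by longest match, so it only approximates how a given search engine evaluates the file.

```python
from urllib.robotparser import RobotFileParser

# Assumed output for the two-group example; adjust to what the tool emits.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /private/

User-agent: Googlebot
Allow: /admin/public/
Disallow: /admin/

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent, path):
    """Return True if `agent` may fetch `path` on the example site."""
    return parser.can_fetch(agent, "https://example.com" + path)


# Googlebot matches its own group, so only that group's rules apply to it.
print(allowed("Googlebot", "/admin/public/page.html"))
print(allowed("Googlebot", "/admin/settings"))
print(allowed("Googlebot", "/private/notes"))

# Any other crawler falls back to the "*" group.
print(allowed("ExampleBot", "/private/notes"))
print(allowed("ExampleBot", "/blog/"))
```

If Googlebot should also stay out of "/private/", that Disallow line has to be repeated inside the Googlebot group.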

FAQ

Why must each path start with a slash?
Rules are matched against the path portion of a URL, and that path always begins with a slash. A rule without a leading slash is invalid, and crawlers may skip it. The validator catches the mistake before you publish.
Should I use Allow rules?
Use Allow when you Disallow a parent directory but want a specific sub-path crawlable, e.g. Disallow /admin/ plus Allow /admin/public/. Without a Disallow rule in the same group that covers the path, an Allow line is redundant; everything is crawlable by default.
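How an Allow rule carves an exception out of a Disallow can be sketched with the longest-match rule from RFC 9309: the matching rule with the longest path wins, and Allow wins a tie. The function below is a simplified illustration only. The name `is_allowed` and the rule-tuple format are assumptions, and it handles plain path prefixes, not the `*` and `$` wildcards.

```python
def is_allowed(path, rules):
    """Decide whether `path` is crawlable under one group's rules.

    `rules` is a list of ("allow" | "disallow", prefix) pairs.
    Simplified: prefix matching only, no wildcard support.
    """
    best_len = -1
    verdict = True  # no matching rule means crawlable by default
    for kind, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue  # an empty rule value matches nothing
        is_allow = kind == "allow"
        # The longest matching prefix wins; Allow wins a tie.
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            verdict = is_allow
    return verdict


rules = [("disallow", "/admin/"), ("allow", "/admin/public/")]
print(is_allowed("/admin/public/help.html", rules))  # Allow is the longer match
print(is_allowed("/admin/users", rules))             # only Disallow matches
print(is_allowed("/blog/", rules))                   # nothing matches
```

With the Disallow removed from `rules`, every path comes back crawlable whether or not the Allow line is present, which is why a lone Allow is redundant.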
Do all crawlers respect Crawl-delay?
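No. Google ignores Crawl-delay; Bing honors it. Yandex announced in 2018 that it no longer honors the directive and directs site owners to the crawl-rate setting in Yandex Webmaster instead. Google retired the Search Console crawl-rate limiter in early 2024; to slow Googlebot temporarily, Google's guidance is to return 500, 503, or 429 responses.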
What does the Host directive do?
Host is non-standard. Yandex historically used it to identify the preferred hostname when a site is reachable on multiple domains, but announced in 2018 that it was dropping the directive in favor of 301 redirects. Other major crawlers ignore it.
Can I block a single crawler entirely?
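Yes. Add a group with User-agent set to that bot's name (e.g. AhrefsBot) and Disallow set to "/". Group order does not matter to standards-compliant crawlers: a crawler follows the group whose User-agent matches its own name and falls back to the wildcard "*" group only when no named group matches.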
Where do I put the generated file?
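At the root of your site, accessible at https://yourdomain.com/robots.txt. Serve it as text/plain with HTTP 200. If the URL returns a 4xx status, crawlers generally treat the site as having no robots.txt and assume everything is crawlable, so your rules are never read. A 5xx status can have the opposite effect: compliant crawlers may treat the whole site as disallowed until the file is reachable again.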
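Why the status code matters can be summarized as a small lookup. This is a rough sketch of how RFC 9309 describes crawler behaviour for each response class, not a guarantee for any particular crawler; the function name `robots_fetch_outcome` and the returned labels are invented for illustration.

```python
def robots_fetch_outcome(status):
    """Map the HTTP status of /robots.txt to what a compliant crawler does.

    Based on RFC 9309's response classes; individual crawlers may differ.
    """
    if 200 <= status < 300:
        return "parse"            # file is read and its rules applied
    if 300 <= status < 400:
        return "follow-redirect"  # crawler follows a limited number of hops
    if 400 <= status < 500:
        return "allow-all"        # treated as if no robots.txt exists
    if 500 <= status < 600:
        return "disallow-all"     # crawler assumes the site is off limits
    return "unknown"


for code in (200, 404, 503):
    print(code, robots_fetch_outcome(code))
```

After deploying, comparing the real status of your robots.txt URL against this table is a quick way to confirm the file is actually being read.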