robots.txt Generator | Any-Tools.net

About robots.txt Generation

robots.txt is a plain text file at the root of a website that tells crawlers which paths they may visit and which they should avoid. The format dates to 1994, was standardized as RFC 9309 in 2022, and is supported by virtually every search engine and well-behaved crawler. The file is advisory: malicious crawlers ignore it, but legitimate crawlers (Google, Bing, etc.) honor its directives reliably.

Common uses: keeping crawlers out of admin areas (/wp-admin/, /admin/), excluding faceted search URL parameters that produce duplicate content, keeping crawlers away from staging or development paths, declaring the sitemap location, and allowing specific user agents while blocking others. Crawlers fetch the file before crawling a site and usually cache it (Google generally for up to 24 hours), so a change may take a while to have any effect.

This generator builds a syntactically correct robots.txt from form inputs. Common patterns (allow all, block all, block specific paths) are available as templates; custom rules can be added per user agent. Place the output at /robots.txt in your site root.
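As a rough illustration of the templates, the sketch below spells out what "allow all" and "block all" files commonly look like and checks them with Python's standard urllib.robotparser module. The TEMPLATES dictionary, the allows helper, and the ExampleBot name are assumptions made for this example; the text the generator actually emits may differ.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical template texts; the generator's real output may differ.
TEMPLATES = {
    # An empty Disallow value blocks nothing, so every path may be crawled.
    "allow_all": "User-agent: *\nDisallow:\n",
    # "/" is a prefix of every path, so nothing may be crawled.
    "block_all": "User-agent: *\nDisallow: /\n",
}


def allows(template_name: str, url: str, agent: str = "ExampleBot") -> bool:
    """Parse a template and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(TEMPLATES[template_name].splitlines())
    return parser.can_fetch(agent, url)


print(allows("allow_all", "https://example.com/any/page"))  # True
print(allows("block_all", "https://example.com/any/page"))  # False
```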

Why Use a robots.txt Generator

Hand-writing robots.txt is error-prone. Mistakes in letter case, path prefixes, or rule precedence silently produce wrong behavior: paths you meant to block are still crawled, or pages you wanted in search results are no longer crawled. A generator that produces correct syntax avoids these pitfalls.

robots.txt also has subtle interactions with other SEO tools. Disallowing a path in robots.txt does not prevent it from appearing in search results (Google may index the URL without crawling it); meta noindex requires the page to be crawled first. Knowing which tool to use for which intent matters; the generator can guide you.

How to Generate robots.txt

Pick a template, customize, deploy.

Choose a starting template: Allow all (the default crawl-everything posture), Block all (disallow crawling of the entire site), or Custom (start from rules you specify).
Add user agent rules: Disallow specific paths for all crawlers, or for specific named bots (Googlebot, Bingbot, GPTBot). Each user agent block has its own set of Allow and Disallow directives.
Add sitemap URL: Include the absolute URL of your XML sitemap. Crawlers use this to discover URLs they might miss otherwise.
Save and deploy: Download the generated file. Upload to the root of your domain (so it appears at https://example.com/robots.txt). Verify by visiting that URL in a browser.
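The steps above can be sketched as a small function that turns per-agent rules and sitemap URLs into file text. This is a minimal sketch, not the generator's actual code: the Group class, the build_robots_txt function, and the example paths are all hypothetical names chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Group:
    """One User-agent group with its own Allow and Disallow paths."""
    user_agent: str
    disallow: list[str] = field(default_factory=list)
    allow: list[str] = field(default_factory=list)


def build_robots_txt(groups: list[Group], sitemaps: list[str] | None = None) -> str:
    """Render groups and sitemap URLs as robots.txt text."""
    lines: list[str] = []
    for group in groups:
        lines.append(f"User-agent: {group.user_agent}")
        lines += [f"Allow: {path}" for path in group.allow]
        lines += [f"Disallow: {path}" for path in group.disallow]
        lines.append("")  # a blank line separates groups
    for url in sitemaps or []:
        lines.append(f"Sitemap: {url}")
    # Normalize to exactly one trailing newline.
    return "\n".join(lines).rstrip("\n") + "\n"


text = build_robots_txt(
    [Group("*", disallow=["/admin/"]), Group("GPTBot", disallow=["/"])],
    sitemaps=["https://example.com/sitemap.xml"],
)
print(text)
```

Writing the returned string to a file named robots.txt and uploading it to the site root corresponds to the final step.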

Common Use Cases

Keeping crawlers out of admin pages: /admin/, /wp-admin/, and /login pages have no value to search engines. Disallow them to stop crawling, and protect them with authentication; a Disallow rule alone does not guarantee they stay out of search results.
Excluding staging environments: Staging or development sites should not be crawled. Serve a blanket Disallow: / from the staging domain's own robots.txt, and add authentication or noindex if the site must never appear in search results.
Managing AI crawler access: GPTBot, ClaudeBot, and other AI crawlers can be blocked by name while standard search engines remain allowed.
Declaring sitemap locations: The Sitemap directive points crawlers to your XML sitemap, which helps them discover URLs faster.
Blocking duplicate-content URL parameters: Faceted search URLs (?color=red&size=L&...) produce many near-duplicate pages. Disallowing the parameter patterns avoids wasting crawl budget on them.
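To make the AI crawler and admin cases concrete, here is a hedged sketch that parses one possible file with Python's standard urllib.robotparser and asks what each crawler may fetch. The file contents, the example.com host, and the may_fetch helper are assumptions for illustration only.

```python
from urllib.robotparser import RobotFileParser

# Example file: two AI crawlers are blocked entirely,
# everyone else is only kept out of /admin/.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: *
Disallow: /admin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_fetch(agent: str, path: str) -> bool:
    """Return True if `agent` is allowed to crawl `path` on the example host."""
    return parser.can_fetch(agent, "https://example.com" + path)


for agent in ("GPTBot", "ClaudeBot", "Googlebot"):
    print(agent, may_fetch(agent, "/blog/post"), may_fetch(agent, "/admin/users"))
```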

Technical Details

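Format: User-agent: <name> followed by Allow/Disallow directives. User-agent: * applies to every bot that has no group of its own. Specific names (Googlebot, Bingbot) target specific crawlers. Groups do not stack: a crawler follows only the group that most specifically matches its name and ignores the others, so repeat any shared rules inside each named group.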
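A short sketch of how group selection plays out, using Python's standard urllib.robotparser. The file contents and bot names are made up for the example. A crawler with its own named group follows that group only, so a rule placed under User-agent: * does not carry over to it.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /tmp/

User-agent: Googlebot
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Googlebot has its own group, so the /tmp/ rule under "*" does not apply to it.
googlebot_tmp = parser.can_fetch("Googlebot", "/tmp/cache.html")
googlebot_private = parser.can_fetch("Googlebot", "/private/report.html")

# A crawler without a named group falls back to the "*" group.
other_tmp = parser.can_fetch("OtherBot", "/tmp/cache.html")
other_private = parser.can_fetch("OtherBot", "/private/report.html")

print(googlebot_tmp, googlebot_private)  # True False
print(other_tmp, other_private)          # False True
```

If /tmp/ should be off limits to Googlebot as well, the Disallow: /tmp/ line has to be repeated inside the Googlebot group.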

Disallow: <path> blocks paths starting with the given prefix. Disallow: / blocks the entire site. Disallow: /admin/ blocks anything under /admin/. Trailing slash matters; Disallow: /admin (no slash) also matches /administrator.
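The prefix behavior is easy to check with Python's standard urllib.robotparser. In this sketch the blocked helper and the ExampleBot name are hypothetical; the function builds a one-rule file and reports whether a given path is blocked by it.

```python
from urllib.robotparser import RobotFileParser


def blocked(rule_path: str, url_path: str) -> bool:
    """Return True if `Disallow: rule_path` blocks `url_path` for every bot."""
    parser = RobotFileParser()
    parser.parse(["User-agent: *", f"Disallow: {rule_path}"])
    return not parser.can_fetch("ExampleBot", url_path)


print(blocked("/admin", "/administrator"))   # True: "/admin" is a prefix
print(blocked("/admin/", "/administrator"))  # False: "/admin/" is not a prefix
print(blocked("/admin/", "/admin/users"))    # True
print(blocked("/", "/anything/at/all"))      # True
```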

Crawl-delay (in seconds) requests slower crawling. Sitemap (absolute URL) declares your sitemap location. # starts a comment line.
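For illustration, the following sketch reads these directives back with Python's standard urllib.robotparser (crawl_delay needs Python 3.6 or later, site_maps needs 3.8 or later). The file contents and the ExampleBot name are assumptions for the example.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
# Ask crawlers that support it to wait 10 seconds between requests.
User-agent: *
Crawl-delay: 10
Disallow: /tmp/

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

delay = parser.crawl_delay("ExampleBot")  # seconds, or None if not declared
sitemaps = parser.site_maps()             # list of URLs, or None if not declared

print(delay)     # 10
print(sitemaps)  # ['https://example.com/sitemap.xml']
```

The comment line is skipped during parsing and does not end the group that follows it.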

Best Practices

Don't rely on robots.txt for security: robots.txt is advisory and publicly readable, so listing a sensitive path also advertises it. Sensitive content needs authentication, not just Disallow. Determined attackers ignore the file.
Use noindex for indexing control: Disallow prevents crawling, but Google may still index URLs. To prevent indexing, allow crawling and use a noindex meta tag instead.
Declare your sitemap: The Sitemap directive helps crawlers find URLs they might miss, and it is a one-line addition.
Test before deploying: Check your rules against sample URLs before you upload the file. After deployment, the robots.txt report in Google Search Console shows which file Google fetched and flags any parse errors; it replaced the older robots.txt Tester.
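One way to test a draft before deploying is to keep a small table of paths with the outcome you expect and compare it against a parser. The sketch below uses Python's standard urllib.robotparser; check_rules, the draft text, and the expectations are hypothetical. Note that urllib.robotparser applies the first matching rule and does not understand * or $ wildcards, so its verdicts can differ from Google's for files that depend on those features.

```python
from urllib.robotparser import RobotFileParser


def check_rules(robots_txt: str, expectations: dict, agent: str = "Googlebot") -> list:
    """Return (path, expected, actual) for every expectation that fails."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    failures = []
    for path, expected in expectations.items():
        actual = parser.can_fetch(agent, path)
        if actual != expected:
            failures.append((path, expected, actual))
    return failures


# Draft with a missing trailing slash: "/admin" also matches "/administrator-guide".
draft = "User-agent: *\nDisallow: /admin\n"
expectations = {
    "/admin/": False,              # should be blocked
    "/administrator-guide": True,  # should stay crawlable
    "/": True,
}

failures = check_rules(draft, expectations)
print(failures)  # [('/administrator-guide', True, False)]
```

Changing the rule to Disallow: /admin/ makes the list of failures empty.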

Frequently Asked Questions

Where do I put robots.txt?: At the root of your domain: https://example.com/robots.txt. Subdirectories don't work; only the file at the root is checked. Each subdomain needs its own file.
Does Disallow prevent indexing?: No. Disallow prevents crawling, but Google can still index URLs based on other signals (links from other sites). To prevent indexing, allow crawling and use a noindex meta tag.
Should I block AI crawlers?: A policy decision. Some site owners want their content excluded from AI training (block GPTBot, ClaudeBot, etc.); others welcome inclusion. The generator supports either approach.
Are wildcards supported?: Yes, by major crawlers such as Googlebot and Bingbot. * matches any sequence of characters; $ anchors the end of the URL. Disallow: /*.pdf$ blocks every URL that ends in .pdf.
How do I block one bot but allow others?: Separate User-agent blocks. User-agent: BadBot followed by Disallow: / blocks BadBot. User-agent: * with Allow: / allows others.
Is robots.txt case-sensitive?: Path matching is case-sensitive. Disallow: /Admin does not match /admin. Match the actual case of your URLs.
Does Crawl-delay actually slow Google?: No. Google ignores Crawl-delay and sets its crawl rate automatically from how your server responds; the manual crawl rate setting in Search Console was retired in 2024. Some other crawlers, such as Bingbot, do respect Crawl-delay.
Is my data uploaded?: No. Generation happens in your browser.
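The wildcard and case-sensitivity answers above can be illustrated with a small matcher. This is a simplified sketch of the matching rules, not any crawler's real implementation: pattern_to_regex and matches are hypothetical helpers, and precedence between competing Allow and Disallow rules is left out.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regular expression."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    # Escape everything, then turn each escaped "*" back into "match anything".
    regex = re.escape(body).replace(r"\*", ".*")
    return re.compile(regex + ("$" if anchored else ""))


def matches(pattern: str, path: str) -> bool:
    """Return True if `path` (including any query string) matches `pattern`."""
    # re.match anchors at the start, which gives the usual prefix behavior.
    return pattern_to_regex(pattern).match(path) is not None


print(matches("/*.pdf$", "/files/report.pdf"))             # True
print(matches("/*.pdf$", "/files/report.pdf?download=1"))  # False: does not end in .pdf
print(matches("/*?color=", "/shop/shirts?color=red"))      # True
print(matches("/Admin", "/admin"))                         # False: case differs
```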
