Robots.txt Generator
Generate a properly configured robots.txt file directly in your AI coding assistant. Includes AI crawler directives for GPTBot (OpenAI), Google-Extended, CCBot, and anthropic-ai. Automatic framework detection for Next.js, Astro, React, and WordPress. Add sitemap references, block admin paths, and optimize crawl budget.
Install and use:
curl -fsSL https://suparank.io/install | bash
Then run /suparank-robots in your AI coding assistant.
Features
Framework Detection
Auto-detects Next.js, Astro, WordPress, and other frameworks
AI Crawler Rules
Includes rules for GPTBot, ClaudeBot, PerplexityBot, and Google-Extended
Path Analysis
Identifies admin, API, and build paths to block automatically
Sitemap Reference
Adds proper sitemap directive for search engines
llms.txt Reference
Includes llms.txt comment for AI content discovery
Best Practices
Follows Yoast, Google, and industry recommendations
Example robots.txt Output
# Suparank - https://suparank.io
# Generated by Suparank robots.txt Generator

User-agent: *
Allow: /

# Block non-content paths
Disallow: /api/
Disallow: /_astro/

# AI Crawlers - Allow for AI search visibility
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: PerplexityBot
Allow: /

# Sitemap
Sitemap: https://suparank.io/sitemap.xml

# LLMs.txt for AI content discovery
# https://suparank.io/llms.txt
# Full version: https://suparank.io/llms-full.txt
Frequently Asked Questions
What is a robots.txt file?
A robots.txt file is a text file placed in your website's root directory that tells search engine crawlers which pages or sections of your site to crawl or not crawl. It's part of the Robots Exclusion Protocol (REP) and is the first file crawlers check when visiting your site. Modern robots.txt files also include sitemap references and AI crawler directives.
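A minimal illustrative example (the /private/ path is just a placeholder) shows the basic REP structure of user-agent groups and rules:

# Allow all crawlers everywhere except one directory
User-agent: *
Disallow: /private/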
How do I block AI crawlers like GPTBot?
To block GPTBot (OpenAI's crawler), add "User-agent: GPTBot" followed by "Disallow: /" in your robots.txt. You can also block Google-Extended (Google's AI training), CCBot (Common Crawl), and anthropic-ai (Claude). Each AI company uses different user agents, so you need separate rules for each one you want to block.
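As a sketch, a robots.txt that opts out of these AI crawlers while leaving regular search bots unaffected could look like this (include only the user agents you actually want to exclude):

# Block AI training crawlers
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: anthropic-ai
Disallow: /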
Should I add sitemap to robots.txt?
Yes, adding your sitemap URL to robots.txt helps search engines discover all your pages faster. Use the format "Sitemap: https://yourdomain.com/sitemap.xml". You can list multiple sitemaps if needed. This is especially important for large sites or when you have dynamic content that changes frequently.
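For example, a site with more than one sitemap (URLs below are placeholders) can simply list each on its own line:

Sitemap: https://yourdomain.com/sitemap.xml
Sitemap: https://yourdomain.com/blog/sitemap.xml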
What paths should I block in robots.txt?
Commonly blocked paths include admin areas (/admin/, /wp-admin/), API endpoints (/api/), build directories (/_next/, /_astro/), temporary files, search results (?s=), and user-specific pages. Blocking these paths helps optimize crawl budget by focusing crawlers on your important content pages. Never block CSS, JavaScript, or images as this can hurt SEO.
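As an illustration, a typical set of rules combining these blocks (the paths shown are common defaults, not requirements for every site) might be:

User-agent: *
Allow: /
Disallow: /admin/
Disallow: /wp-admin/
Disallow: /api/
Disallow: /_next/
Disallow: /?s=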
Generate Your robots.txt
Install and create an optimized robots.txt in seconds.
curl -fsSL https://suparank.io/install | bash