Robots.txt Generator

Generate a properly configured robots.txt file directly in your AI coding assistant. Includes AI crawler directives for GPTBot (OpenAI), Google-Extended, CCBot, and anthropic-ai. Automatic framework detection for Next.js, Astro, React, and WordPress. Add sitemap references, block admin paths, and optimize crawl budget.

Install and use:

curl -fsSL https://suparank.io/install | bash
/suparank-robots

Features

Framework Detection

Auto-detects Next.js, Astro, WordPress, and other frameworks

AI Crawler Rules

Includes rules for GPTBot, ClaudeBot, PerplexityBot, and Google-Extended

Path Analysis

Identifies admin, API, and build paths to block automatically

Sitemap Reference

Adds a proper Sitemap directive for search engines

llms.txt Reference

Includes an llms.txt comment for AI content discovery

Best Practices

Follows Yoast, Google, and industry recommendations

Example robots.txt Output

# Suparank - https://suparank.io
# Generated by Suparank robots.txt Generator

User-agent: *
Allow: /

# Block non-content paths
Disallow: /api/
Disallow: /_astro/

# AI Crawlers - Allow for AI search visibility
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: PerplexityBot
Allow: /

# Sitemap
Sitemap: https://suparank.io/sitemap.xml

# LLMs.txt for AI content discovery
# https://suparank.io/llms.txt
# Full version: https://suparank.io/llms-full.txt

Frequently Asked Questions

What is a robots.txt file?

A robots.txt file is a text file placed in your website's root directory that tells search engine crawlers which pages or sections of your site they may and may not crawl. It's part of the Robots Exclusion Protocol (REP) and is the first file crawlers check when visiting your site. Modern robots.txt files also include sitemap references and AI crawler directives.
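
For example, a minimal robots.txt that allows all crawlers and points to a sitemap looks like this (yourdomain.com is a placeholder):

User-agent: *
Allow: /

Sitemap: https://yourdomain.com/sitemap.xml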

How do I block AI crawlers like GPTBot?

To block GPTBot (OpenAI's crawler), add "User-agent: GPTBot" followed by "Disallow: /" to your robots.txt. You can also block Google-Extended (Google's AI training crawler), CCBot (Common Crawl), and ClaudeBot or anthropic-ai (Anthropic's Claude). Each AI company uses its own user agent, so you need a separate rule for each crawler you want to block.
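
As a sketch, blocking the AI crawlers named above (adjust the list to match your own policy) looks like this:

# Example only - block AI crawlers, adjust to your policy
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: ClaudeBot
Disallow: /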

Should I add sitemap to robots.txt?

Yes, adding your sitemap URL to robots.txt helps search engines discover all your pages faster. Use the format "Sitemap: https://yourdomain.com/sitemap.xml". You can list multiple sitemaps if needed. This is especially important for large sites or when you have dynamic content that changes frequently.
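
For example, a site with separate sitemaps for posts and pages (hypothetical file names) would list both:

Sitemap: https://yourdomain.com/post-sitemap.xml
Sitemap: https://yourdomain.com/page-sitemap.xml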

What paths should I block in robots.txt?

Commonly blocked paths include admin areas (/admin/, /wp-admin/), API endpoints (/api/), build directories (/_next/, /_astro/), temporary files, search result URLs (?s=), and user-specific pages. Blocking these paths optimizes crawl budget by focusing crawlers on your important content pages. Never block CSS, JavaScript, or image files, since search engines need them to render your pages correctly and blocking them can hurt SEO.
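
A typical block section combining these rules (the paths shown are common examples, not requirements for every site) could look like:

# Example only - block non-content paths, keep anything crawlers need to render pages
User-agent: *
Allow: /
Disallow: /wp-admin/
Disallow: /api/
Disallow: /_next/
Disallow: /*?s=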

Generate Your robots.txt

Install and create an optimized robots.txt in seconds.

curl -fsSL https://suparank.io/install | bash