AI crawlability scan.
Enter a domain. We run a real server-side scan and return a graded report covering robots.txt rules for the 21 major AI bots, sitemap.xml validity, llms.txt presence, .well-known core-pages, and homepage HTML signals (canonical, JSON-LD, X-Robots-Tag, SSR).
How this works
When you submit a domain, our server fetches a small set of well-known files and the homepage with a 5 to 6 second timeout each. We parse robots.txt with proper allow/disallow longest-match logic and check it against the 21 AI bots that matter most: GPTBot, ChatGPT-User, OAI-SearchBot, ClaudeBot, Claude-Web, anthropic-ai, PerplexityBot, Perplexity-User, Google-Extended, GoogleOther, Applebot-Extended, CCBot, cohere-ai, Meta-ExternalAgent, FacebookBot, Bytespider, Amazonbot, YouBot, Diffbot, DuckAssistBot, AI2Bot.
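To illustrate the longest-match rule, here is a minimal Python sketch. It is not the scanner's actual code. The function names (parse_robots, is_allowed) are hypothetical, user-agent matching is simplified to an exact case-insensitive token match with a fallback to the * group, and on an equal-length tie the Allow rule is assumed to win.

```python
import re


def parse_robots(text):
    """Return {lowercased user-agent token: [(directive, path pattern), ...]}."""
    groups = {}
    current_agents = []
    last_was_agent = False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            # Consecutive User-agent lines share one group of rules.
            if not last_was_agent:
                current_agents = []
            current_agents.append(value.lower())
            groups.setdefault(value.lower(), [])
            last_was_agent = True
        elif field in ("allow", "disallow"):
            for agent in current_agents:
                groups[agent].append((field, value))
            last_was_agent = False
        # Other fields (for example Sitemap) are ignored in this sketch.
    return groups


def _pattern_to_regex(pattern):
    """Translate a robots.txt path pattern ('*' wildcard, '$' end anchor)."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(groups, agent, path):
    """Apply the longest matching rule; no matching rule means allowed."""
    rules = groups.get(agent.lower(), groups.get("*", []))
    best_len, allowed = -1, True
    for directive, pattern in rules:
        if not pattern:
            continue  # an empty Disallow restricts nothing
        if _pattern_to_regex(pattern).match(path):
            is_allow = directive == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len, allowed = len(pattern), is_allow
    return allowed


robots = """
User-agent: *
Disallow: /

User-agent: GPTBot
Disallow: /private/
Allow: /private/press/
"""
groups = parse_robots(robots)
print(is_allowed(groups, "GPTBot", "/private/press/kit.html"))
print(is_allowed(groups, "ClaudeBot", "/"))
```

In this example GPTBot may fetch /private/press/kit.html because the Allow pattern is longer than the Disallow pattern that also matches, while ClaudeBot has no group of its own and falls back to the blanket Disallow under *.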
Each of the six checks contributes a weighted score on a 0 to 10 scale, mapped to a letter grade A through F.
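The scoring step can be sketched roughly as follows. The real weights and grade cutoffs are not published here, so the check names, weights, and cutoffs in this Python example are placeholders chosen only to show how a weighted 0 to 10 score maps to a letter.

```python
# Hypothetical cutoffs: the scanner's real grade boundaries may differ.
GRADE_CUTOFFS = [(9.0, "A"), (8.0, "B"), (7.0, "C"), (6.0, "D")]


def weighted_score(results):
    """results: {check name: (pass fraction 0..1, weight)} -> score on 0..10."""
    total_weight = sum(weight for _, weight in results.values())
    if total_weight == 0:
        return 0.0
    earned = sum(fraction * weight for fraction, weight in results.values())
    return round(10 * earned / total_weight, 1)


def letter_grade(score):
    """Map a 0..10 score to a letter using the assumed cutoffs above."""
    for cutoff, grade in GRADE_CUTOFFS:
        if score >= cutoff:
            return grade
    return "F"


# Illustrative input only: three checks with made-up weights.
example = {
    "robots_txt": (1.0, 3),
    "sitemap_xml": (0.5, 2),
    "llms_txt": (0.0, 1),
}
score = weighted_score(example)
print(score, letter_grade(score))
```

With these placeholder numbers the checks earn 4 of 6 weighted points, which scales to 6.7 out of 10 and a D under the assumed cutoffs.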
What it does not do
The scan is a one-shot probe of public, well-known endpoints plus your homepage. It does not crawl interior pages, render JavaScript, or audit per-page schema across your site. It does not score citation strength, AI-search rankings, or off-page signals. For a full audit including those, book a discovery call below.
Limits: 10 scans per hour per IP; results are cached for 1 hour per domain. We log only the host and an IP-derived counter; we do not store the report contents beyond the cache TTL.
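One way such limits could be enforced is shown in the Python sketch below. It assumes an in-memory store, a fixed one-hour window, and a SHA-256 hash of the IP as the counter key. None of these details are confirmed for the real service, and the ScanGate class is hypothetical.

```python
import hashlib
import time

WINDOW_SECONDS = 3600        # 1 hour, for both the rate window and the cache TTL
MAX_SCANS_PER_WINDOW = 10    # scans allowed per IP per window


class ScanGate:
    """Hypothetical in-memory rate limiter plus per-domain report cache."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._counters = {}  # hashed IP -> (window start, scans used)
        self._cache = {}     # domain -> (expiry time, report)

    def _ip_key(self, ip):
        # Assumption: the counter is keyed on a hash rather than the raw IP.
        return hashlib.sha256(ip.encode("utf-8")).hexdigest()

    def allow(self, ip):
        """Return True and count the scan if the IP is under its limit."""
        now = self._clock()
        key = self._ip_key(ip)
        start, count = self._counters.get(key, (now, 0))
        if now - start >= WINDOW_SECONDS:
            start, count = now, 0  # window elapsed, start a fresh one
        if count >= MAX_SCANS_PER_WINDOW:
            return False
        self._counters[key] = (start, count + 1)
        return True

    def get_cached(self, domain):
        """Return the cached report, or None if absent or expired."""
        key = domain.lower()
        entry = self._cache.get(key)
        if entry is None:
            return None
        expires_at, report = entry
        if self._clock() >= expires_at:
            del self._cache[key]  # drop report contents once the TTL passes
            return None
        return report

    def put_cached(self, domain, report):
        self._cache[domain.lower()] = (self._clock() + WINDOW_SECONDS, report)
```

A request handler would call get_cached first, then allow, and only run a fresh scan when the cache misses and the limit has not been reached. The injectable clock exists only to make the sketch easy to test.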
Book a discovery call
Need this done for real, end to end?
We design, build, and operate the full system across discovery, architecture, build, and run. Tell us what you are trying to ship and we reply within one business day.