Markdown, llms.txt and AI crawlers
In January, I made every page on my site available as Markdown. AI crawlers found the Markdown versions almost immediately. I was excited, but excitement isn't data. Now that the dust has settled, I have pulled a month of Cloudflare logs and analyzed them.
I compared how much AI bots crawl my site to how often AI answer engines link back. For every citation they sent, their crawlers had fetched 1,241 pages. That is a lot of reading for very little traffic in return. It is the deal AI is offering creators right now, and it is not a good one.
People also asked whether serving Markdown reduced my bot traffic, since the files are smaller. It did not. Bots fetch both versions, and my crawler traffic increased by about 7%. Offering a lighter format does not replace the heavier one. It simply gives bots more to crawl.
As the table below shows, several AI companies crawl my site. Some fetch thousands of pages each month, but for most of them only a small share of those requests are for the Markdown versions.
| Bot | Vendor | Total requests | HTML files | .md files | Content negotiation | % .md |
|---|---|---|---|---|---|---|
| Amazonbot | Amazon | 16,872 | 15,032 | 1,840 | 0 | 10.9% |
| ChatGPT-User | OpenAI | 13,864 | 13,856 | 8 | 0 | 0.1% |
| Meta AI | Meta | 9,011 | 8,526 | 485 | 0 | 5.4% |
| ClaudeBot | Anthropic | 7,144 | 6,995 | 149 | 0 | 2.1% |
| OAI-SearchBot | OpenAI | 5,722 | 4,422 | 1,300 | 0 | 22.7% |
| GPTBot | OpenAI | 3,385 | 2,208 | 1,177 | 0 | 34.8% |
| Bytespider | ByteDance | 1,190 | 1,190 | 0 | 0 | 0.0% |
| CCBot | CommonCrawl | 530 | 530 | 0 | 0 | 0.0% |
| PerplexityBot | Perplexity | 467 | 466 | 1 | 0 | 0.2% |
| Claude-User | Anthropic | 94 | 87 | 7 | 0 | 7.4% |
Interestingly, OpenAI runs three bots with different roles. OAI-SearchBot indexes content for search, GPTBot crawls for training data, and ChatGPT-User fetches pages in real time during live ChatGPT sessions.
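If you want to run the same breakdown on your own logs, here is a minimal Python sketch of the tallying. It is not the script I used. The record fields (`user_agent`, `path`, `accept`) are hypothetical names for whatever your log export provides, and matching bots by user-agent substring is an assumption that will miss bots whose user-agent string differs from the name shown in the table.

```python
from collections import Counter

# Bot names from the table above, matched as user-agent substrings.
# Assumption: each bot's user agent contains its name; verify against your logs.
AI_BOTS = [
    "Amazonbot", "ChatGPT-User", "ClaudeBot", "OAI-SearchBot", "GPTBot",
    "Bytespider", "CCBot", "PerplexityBot", "Claude-User",
]


def identify_bot(user_agent):
    """Return the first known bot name found in the user agent, else None."""
    ua = user_agent.lower()
    for name in AI_BOTS:
        if name.lower() in ua:
            return name
    return None


def classify_format(path, accept=""):
    """Bucket a request as 'md', 'negotiated', or 'html' (everything else)."""
    if path.endswith(".md"):
        return "md"
    if "text/markdown" in accept.lower():
        return "negotiated"
    return "html"


def tally(records):
    """Count requests per bot and per format. Non-bot traffic is skipped."""
    counts = {}
    for rec in records:
        bot = identify_bot(rec["user_agent"])
        if bot is None:
            continue
        fmt = classify_format(rec["path"], rec.get("accept", ""))
        counts.setdefault(bot, Counter())[fmt] += 1
    return counts


def markdown_share(counter):
    """Percentage of a bot's requests that went to .md URLs."""
    total = sum(counter.values())
    return 100 * counter["md"] / total if total else 0.0
```

Feeding it the GPTBot row from the table, `markdown_share(Counter(md=1177, html=2208))` rounds to the same 34.8% shown there.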
When I added Markdown support to my site, I exposed it in two ways. The first is through dedicated Markdown URLs: append .md to any page and you get the Markdown version. The second is through content negotiation, where the original URL returns Markdown instead of HTML when the request includes an Accept: text/markdown header.
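To make the two paths concrete, here is a rough Python sketch of the decision a server has to make. It is an illustration, not my site's actual implementation. The function names are made up, and the tie-breaking rule (serve Markdown only when `text/markdown` is listed explicitly and weighted at least as high as `text/html`) is an assumption. Wildcards such as `*/*` are deliberately ignored, so ordinary browsers keep getting HTML.

```python
def parse_accept(header):
    """Parse an Accept header into {media_type: q_value}."""
    prefs = {}
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # q defaults to 1 when the parameter is absent
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        prefs[media_type] = q
    return prefs


def prefers_markdown(header):
    """True when text/markdown is explicitly listed and not outranked by HTML."""
    prefs = parse_accept(header)
    md_q = prefs.get("text/markdown", 0.0)
    html_q = prefs.get("text/html", 0.0)
    return md_q > 0 and md_q >= html_q


def choose_representation(path, accept_header=""):
    """Return (page_path, media_type) for a request."""
    # Path 1: dedicated URL. "/blog/post.md" maps back to the page "/blog/post".
    if path.endswith(".md"):
        return path[: -len(".md")], "text/markdown"
    # Path 2: content negotiation on the original URL.
    if prefers_markdown(accept_header):
        return path, "text/markdown"
    return path, "text/html"
```

With this sketch, `choose_representation("/blog/post.md")` and `choose_representation("/blog/post", "text/markdown")` both resolve to the Markdown version of the same page. One practical note: when the same URL can return different formats, the response should carry a `Vary: Accept` header so that caches keep the two versions apart.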
No AI crawler uses content negotiation. Not one. They only discover the Markdown pages through the dedicated URLs, and only via the auto-discovery link. To be fair, the auto-discovery link points to the .md version. As a next step, I will experiment with pointing it to the content-negotiated version instead.
| Bot | robots.txt | sitemap.xml | llms.txt | .md files |
|---|---|---|---|---|
| Amazonbot | 182 | - | - | 1,840 |
| ChatGPT-User | - | - | - | 8 |
| Meta AI | - | 75 | - | 485 |
| ClaudeBot | 496 | 115 | - | 149 |
| OAI-SearchBot | 653 | - | - | 1,300 |
| GPTBot | - | 4 | - | 1,177 |
| Bytespider | 259 | - | - | - |
| CCBot | 8 | - | - | - |
| PerplexityBot | 142 | - | - | 1 |
| Claude-User | 87 | - | - | 7 |
And then my favorite: llms.txt, a proposed standard where sites describe their content for AI systems. My site received 52 requests for it last month. Every one came from SEO audit tools. Not a single request came from an AI answer engine or crawler. (I don't use or pay for SEO tools, but apparently that doesn't stop them from auditing my site.)
For fun, we also looked across Acquia's entire hosting fleet and found about 5,000 llms.txt requests out of 400 million total (0.001%), nearly all from SEO tools. llms.txt is a solution looking for a problem. The bots it was designed for don't look for it.
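For a sense of scale, the fleet-wide share is quick to check. The snippet below uses only the two rounded figures quoted above, so treat the result as approximate.

```python
llms_txt_requests = 5_000      # "about 5,000" llms.txt requests
total_requests = 400_000_000   # "400 million" total requests

share_percent = 100 * llms_txt_requests / total_requests
one_in = total_requests // llms_txt_requests

print(f"{share_percent:.5f}% of requests, or about 1 in {one_in:,}")
```

That works out to roughly 0.00125%, or about one llms.txt request for every 80,000 requests, which rounds to the 0.001% quoted above.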
So should you add Markdown support to your site? Probably not. There is no clear benefit today. It does not reduce crawl traffic, and I can't demonstrate that it improves how AI systems use your content.
We do know that AI systems love Markdown, and they fetch it when it is available. At best, it may become more useful over time.
If it is easy to add and you enjoy experimenting, go ahead. If it takes real effort, spend that effort on your content instead. What still works is what has always worked: clear writing, authoritative content, and timely publishing.