p0stman

What Is llms.txt? The AI Discovery File Explained

Last updated: March 2026

llms.txt is a plain-text file placed at your website's root (/llms.txt) that provides AI language models with a concise summary of your site's purpose, services, and capabilities. Think of it as a README for AI — while robots.txt tells crawlers what they can access, llms.txt tells them what they'll find. Adoption is currently around 10% of major websites, but implementation takes minutes and costs nothing.

The web was built for humans navigating with browsers. Sitemaps and robots.txt were added for search engine crawlers. But AI language models are a fundamentally different kind of consumer. They don't click links, render CSS, or follow navigation menus. They process text. And they need context about what a website is, what it does, and what's available — fast, in a format optimised for their token window.

That's the gap llms.txt fills. Proposed by Jeremy Howard (co-founder of fast.ai) in September 2024, llms.txt is a lightweight Markdown file that lives at your domain root. It provides a structured, human-readable-but-AI-optimised overview of your site. No authentication required. No API to call. Just a text file that any AI system can fetch and parse in a single request.

The honest picture: as of early 2026, no major AI vendor has officially confirmed that their base models automatically fetch llms.txt during inference. But the agentic layer — the tools, agents, and pipelines built on top of those models — increasingly does. And the cost-benefit calculus is simple: five minutes to create versus being invisible to an entire class of emerging traffic.

The llms.txt Specification

The llms.txt specification is intentionally minimal. Jeremy Howard designed it to be as simple as possible — a file that anyone can create in a text editor without needing to understand XML schemas, JSON-LD, or protocol buffers. The format uses a subset of Markdown with specific structural conventions.

File Location and Format

The file must be placed at the exact path /llms.txt on your domain root. Not in a subdirectory. Not with an HTML extension. Plain text, served with a text/plain or text/markdown content type. For Next.js sites using the public/ directory, this means creating public/llms.txt.

Required Structure

The specification defines a clear hierarchy:

  1. H1 heading — Your site or product name. Exactly one, at the top of the file.
  2. Blockquote — A one- or two-sentence summary immediately below the H1. This is the "elevator pitch" for your site. Use > Markdown blockquote syntax.
  3. H2 sections — Categorised groups of links and descriptions. Common sections include "Documentation", "API", "Key Pages", "Resources", and "Optional".
  4. Link entries — Each entry within a section is a Markdown link followed by a colon and a brief description. Format: - [Page Name](URL): Brief description of what this page contains

Formatting Rules

  • Use only H1 (#) and H2 (##) headings — no deeper nesting
  • Keep the entire file under 100 lines — this is a summary, not documentation
  • Write in plain language. Avoid marketing jargon. AI models parse "We provide cloud storage for enterprise teams" better than "Revolutionising the paradigm of digital asset management"
  • Use absolute URLs for all links
  • Descriptions should be one sentence, factual, and specific
  • An optional section marked with ## Optional can contain nice-to-have context that AI can skip if running low on context window
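The formatting rules above are mechanical enough to check automatically. Here is a minimal lint sketch in Python; the function name and warning wording are illustrative, not part of the spec:

```python
import re

def lint_llms_txt(text: str) -> list[str]:
    """Return warnings for common llms.txt formatting problems.

    Checks follow the spec's conventions: exactly one H1, a blockquote
    summary, no headings deeper than H2, under 100 lines, absolute URLs.
    """
    warnings = []
    lines = text.splitlines()

    if len(lines) > 100:
        warnings.append(f"file is {len(lines)} lines; keep it under 100")

    # Exactly one H1 ("# ") at the top level
    h1s = [l for l in lines if re.match(r"#\s", l)]
    if len(h1s) != 1:
        warnings.append(f"expected exactly one H1, found {len(h1s)}")

    # H3 and deeper are not allowed
    if any(re.match(r"###", l) for l in lines):
        warnings.append("headings deeper than H2 are not allowed")

    # Blockquote summary under the H1
    if not any(l.startswith("> ") for l in lines):
        warnings.append("missing blockquote summary under the H1")

    # Every Markdown link should use an absolute URL
    for url in re.findall(r"\]\(([^)]+)\)", text):
        if not url.startswith(("http://", "https://")):
            warnings.append(f"relative URL found: {url}")

    return warnings
```

Run it against your draft before deploying; an empty list means the structural rules are satisfied.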

llms.txt vs llms-full.txt

The specification also defines an optional companion file: llms-full.txt. Where llms.txt is an index with links to deeper content, llms-full.txt contains all the content inline — no links to follow. This is useful for AI agents that want complete context in a single request without having to crawl multiple pages.

| Aspect | llms.txt | llms-full.txt |
| --- | --- | --- |
| Purpose | Concise index with links | Complete content in one file |
| Typical size | 50-100 lines | 500-5,000+ lines |
| Agent behaviour | Must follow links for details | Gets everything in one fetch |
| Best for | Large sites with many sections | Focused products, API docs |
| Maintenance | Low — just update links | Higher — content duplicated |
| Token efficiency | Very efficient as an overview | Expensive but comprehensive |

Complete Example: llms.txt for a SaaS Company

Here's a realistic llms.txt for a fictional project management SaaS. This demonstrates every element of the specification in practice.

# ProjectFlow

> ProjectFlow is a project management platform for software teams. It provides task tracking, sprint planning, time logging, and team analytics. API available. Free tier for up to 5 users.

## Key Pages

- [Features](https://projectflow.com/features): Complete list of platform capabilities including boards, timelines, and reporting
- [Pricing](https://projectflow.com/pricing): Three tiers — Free (5 users), Pro ($12/user/month), Enterprise (custom)
- [Integrations](https://projectflow.com/integrations): Connects with GitHub, GitLab, Slack, Jira, and 40+ other tools
- [Security](https://projectflow.com/security): SOC 2 Type II certified, GDPR compliant, SSO via SAML

## Documentation

- [API Reference](https://docs.projectflow.com/api): REST API with OpenAPI spec, rate limit 1000 req/min
- [Webhooks Guide](https://docs.projectflow.com/webhooks): Real-time event notifications for task and sprint changes
- [Getting Started](https://docs.projectflow.com/quickstart): 5-minute setup guide for new teams
- [SDKs](https://docs.projectflow.com/sdks): Official libraries for Python, JavaScript, Go, and Ruby

## API

- [Authentication](https://docs.projectflow.com/api/auth): OAuth 2.0 and API key authentication
- [Tasks API](https://docs.projectflow.com/api/tasks): CRUD operations for tasks, subtasks, and comments
- [Projects API](https://docs.projectflow.com/api/projects): Create and manage projects, boards, and sprints
- [Users API](https://docs.projectflow.com/api/users): Team member management and role assignment

## Optional

- [Blog](https://projectflow.com/blog): Product updates, engineering deep dives, and project management best practices
- [Changelog](https://projectflow.com/changelog): Weekly release notes
- [Status Page](https://status.projectflow.com): Real-time uptime and incident history

Notice the pattern: factual descriptions, specific numbers (pricing, rate limits, user counts), clear categorisation, and no marketing fluff. The blockquote summary tells an AI model everything it needs to decide whether ProjectFlow is relevant to a user's query — in two sentences.

The Honest Adoption Picture

Let's be direct about where llms.txt stands in early 2026, because there's a lot of optimistic content out there that skips the nuance.

What's True

  • Adoption is growing but still modest. Estimates put it at roughly 10% of major websites as of early 2026. That's meaningful growth from near-zero in late 2024, but it's far from ubiquitous.
  • Notable adopters exist. Anthropic, Cloudflare, Stripe, Vercel, Cursor, and several other developer-focused companies serve llms.txt files. Their presence lends credibility to the format.
  • The spec is stable. Jeremy Howard's original proposal hasn't undergone breaking changes. The simplicity of the format means there's little to argue about.
  • Agentic tools consume it. AI coding assistants, research agents, and RAG pipelines do fetch llms.txt when building context about a domain. This is the primary real-world use case today.

What's Not Confirmed

  • No major LLM vendor reads it during inference. OpenAI, Anthropic, and Google have not officially confirmed that ChatGPT, Claude, or Gemini automatically fetch llms.txt when answering user questions. The models' training data may include llms.txt files, but that's different from live fetching.
  • No ranking signal confirmed. Unlike robots.txt (which search engines definitively respect), there's no evidence that having llms.txt improves your visibility in AI-generated answers. It may, but nobody has proven it.
  • SearchGPT and similar don't officially consume it. AI search products like Perplexity, SearchGPT, and Gemini Search have their own crawling and indexing strategies. llms.txt is not part of their documented intake pipeline.

Where It Genuinely Helps

The real value of llms.txt today is in the agentic layer. When an AI agent is tasked with "find me a project management tool with a REST API and free tier", it may browse the web and encounter your site. If it can fetch /llms.txt and get a clean, structured summary in 50 lines of text, it understands your offering immediately — without parsing your entire homepage, pricing page, and docs. That's the use case that matters right now.

Secondary benefits include: providing context to AI tools that your team uses (Cursor, Claude Code, Windsurf), serving as a clean machine-readable summary for internal documentation, and positioning your site for whatever AI discovery protocols emerge next.

llms.txt vs robots.txt: Different Jobs, Same Neighbourhood

These two files live at the same level of your domain but serve fundamentally different purposes. Confusing them is common, so let's be precise.

| Aspect | robots.txt | llms.txt |
| --- | --- | --- |
| Purpose | Access control — what crawlers can and cannot fetch | Context — what the site is and what it offers |
| Audience | Search engine crawlers, AI crawlers | AI language models, agents, RAG pipelines |
| Format | Custom syntax (User-agent, Disallow, Allow) | Simplified Markdown |
| Standard status | De facto web standard since 1994, formalised as RFC 9309 | Community proposal since September 2024 |
| Enforcement | Respected by major crawlers (though not legally binding) | No enforcement — purely informational |
| What happens without it | Crawlers assume everything is accessible | AI has no summary and must parse your full site |

Can they coexist? Yes — and they should. Use robots.txt to control access and llms.txt to provide context.

A practical example: your robots.txt might block AI crawlers from your dashboard and admin pages (sensible), while your llms.txt tells them about your public features, pricing, and API. The two files work together — one sets boundaries, the other provides helpful context within those boundaries.

The robots.txt AI Crawler Debate

There's a separate but related discussion about whether to allow or block AI crawlers in robots.txt. Some site owners block GPTBot, ClaudeBot, and others to prevent their content being used for training. Others allow them to maximise visibility in AI-generated answers. This decision is independent of llms.txt — you can block crawlers in robots.txt while still providing llms.txt as a context document. The llms.txt file is typically fetched by agent-layer tools (not training crawlers), so blocking training crawlers doesn't prevent llms.txt from being useful.
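A hypothetical robots.txt illustrates the independence. GPTBot and ClaudeBot are the user-agent tokens published by OpenAI and Anthropic for their training crawlers; the paths shown are placeholders:

```
# robots.txt — block training crawlers, limit everyone else to public pages
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: *
Disallow: /admin/
Disallow: /dashboard/
```

With this in place, the named training crawlers are asked to stay out entirely, yet /llms.txt remains fully fetchable by agent-layer tools, which identify themselves differently and are not bound by these directives anyway.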

llms.txt vs agents.md: When to Use Which

Both files help AI understand your site, but they serve different roles in the agentic web stack. The simplest way to think about it: llms.txt is the brochure, agents.md is the instruction manual.

| Aspect | llms.txt | agents.md |
| --- | --- | --- |
| Length | 50-100 lines | 200-2,000+ lines |
| Content type | Summary with links | Detailed instructions |
| Primary reader | LLMs building general context | AI agents performing actions |
| Includes tool schemas? | No — just mentions they exist | Yes — full schemas with parameters |
| Includes auth details? | Brief mention (e.g. "OAuth 2.0") | Full auth flow with examples |
| Includes examples? | No | Yes — request/response examples |
| Update frequency | When major features change | When APIs or tools change |
| Format | Simplified Markdown (H1, H2, links) | Full Markdown with code blocks |

Should You Have Both?

If your site is purely informational (a blog, portfolio, or brochure site), llms.txt alone is sufficient. If your site has interactive capabilities — an API, tools, forms that AI agents might use — then both files add value. llms.txt gives the quick overview; agents.md gives the detailed operating instructions. For the agentic web stack we build at p0stman, we recommend both alongside machine-readable manifests like MCP and A2A.

llms.txt vs context.md: Depth Differences

A third file in the discovery layer is context.md, which sits between llms.txt and agents.md in terms of depth and purpose.

| File | Purpose | Depth | Best for |
| --- | --- | --- | --- |
| llms.txt | Quick summary with links | Shallow | Initial discovery and relevance check |
| context.md | Deep business context | Medium-deep | Agents that need to understand the full business |
| agents.md | Interaction instructions | Deep (technical) | Agents that need to take actions on your site |

context.md typically includes: company background, ideal customer profile, pricing details, feature explanations, competitive positioning, and domain-specific knowledge that an AI agent would need to accurately represent your product. It's the file an AI reads when it needs to answer "tell me about this company" in depth. llms.txt tells the AI the company exists and what it does; context.md tells the AI everything it would need to be a knowledgeable representative of that company.

The Argument For llms.txt Despite Low Adoption

Given the honest adoption picture, you might wonder if it's worth the effort. Here's the case for implementing llms.txt today:

1. The Cost-Benefit Is Overwhelmingly Positive

Creating an llms.txt file takes 5-15 minutes. There's no build step, no dependency, no ongoing cost. You write a text file, put it in your public directory, and deploy. The maintenance burden is near zero — update it when your product changes significantly. Compared to the potential upside of being discoverable to AI agents, the cost is trivial.

2. Early Movers Win in Protocol Adoption

robots.txt took years to go from proposal to universal adoption. Sitemaps followed a similar curve. The sites that adopted early didn't lose anything by being ahead of the curve. When llms.txt consumption becomes standard (and the trajectory points that direction), sites that already have one will be immediately indexed while competitors scramble to create theirs.

3. Agentic AI Is Growing Exponentially

The number of AI agents browsing the web is increasing rapidly. Coding assistants, research tools, shopping agents, travel planners — these tools need to understand websites quickly. Even if base models don't fetch llms.txt, the agent layer does. And the agent layer is where the growth is.

4. It Sharpens Your Own Understanding

Writing a good llms.txt forces you to articulate what your site does in clear, factual terms. No hiding behind marketing language. If you can't describe your offering in 50 lines of plain text, that's a signal worth paying attention to. The exercise itself has value beyond AI discovery.

5. It's Part of a Larger Stack

llms.txt doesn't exist in isolation. It's one layer of the agentic web architecture — alongside agents.md, JSON-LD schema, MCP servers, and A2A endpoints. Each layer reinforces the others. llms.txt is the simplest entry point into that stack.

How to Write an Effective llms.txt

Writing a good llms.txt is more about what you leave out than what you put in. Here's a practical guide.

Step 1: Write the Blockquote Summary

Start with the most important part: the one or two sentence summary. This is what AI models will use for relevance checking. Be specific and factual.

# Bad
> Welcome to our amazing platform that revolutionises how teams collaborate!

# Good
> TaskHub is a task management API for developer teams. REST and GraphQL endpoints, webhook support, free tier for up to 10 users.

Step 2: Identify Your Key Sections

Group your most important pages into 3-5 sections. Common categories:

  • Key Pages — Features, pricing, security, about
  • Documentation — Getting started, guides, tutorials
  • API — Authentication, endpoints, SDKs
  • Resources — Blog, changelog, status page
  • Optional — Nice-to-have links that AI can skip if context-limited

Step 3: Write Link Descriptions

Each link entry should answer "what will an AI find at this URL?" in one sentence. Include specific details that help with relevance matching.

# Bad
- [Pricing](https://example.com/pricing): Our pricing page

# Good
- [Pricing](https://example.com/pricing): Three tiers from free to $49/month, annual discount available, enterprise custom pricing

# Bad
- [API](https://docs.example.com/api): API documentation

# Good
- [API Reference](https://docs.example.com/api): REST API with 47 endpoints, OpenAPI 3.1 spec, rate limit 500 req/min on free tier

Step 4: Review and Trim

Your first draft will probably be too long. Trim ruthlessly. The goal is under 100 lines. Remove any link that isn't essential for understanding what your site offers. Remove adjectives. Remove anything that sounds like marketing. If you're over 100 lines, you're including too much — save the detail for agents.md or llms-full.txt.

Do's and Don'ts

| Do | Don't |
| --- | --- |
| Use specific numbers (pricing, limits, counts) | Use vague language ("affordable", "fast", "scalable") |
| Write factual descriptions | Write marketing copy |
| Keep it under 100 lines | Try to list every page on your site |
| Update when features change | Let it go stale with broken links |
| Use absolute URLs | Use relative paths |
| Include your API if you have one | Include internal/admin pages |
| Mention authentication method | Include actual API keys or secrets |

How to Verify Your llms.txt Is Working

Once deployed, verify the file is accessible and correctly formatted.

Manual Checks

  1. Direct URL access: Open https://yourdomain.com/llms.txt in a browser. You should see plain text, not HTML or a 404.
  2. Content-Type header: Use browser DevTools (Network tab) to verify the response has text/plain or text/markdown content type. If your server returns text/html, AI tools may not parse it correctly.
  3. cURL test: Run curl -I https://yourdomain.com/llms.txt to check status code (should be 200) and content type.
  4. Link validation: Click every link in the file to confirm none are broken.
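The link validation step can be scripted rather than clicked through. A short Python sketch that extracts every absolute link and HEAD-requests it (the helper names are illustrative):

```python
import re
import urllib.error
import urllib.request

def extract_links(llms_txt: str) -> list[str]:
    """Pull every absolute URL out of the file's Markdown links."""
    return re.findall(r"\]\((https?://[^)]+)\)", llms_txt)

def check_links(llms_txt: str) -> dict[str, int]:
    """HEAD-request each link; returns {url: status}, 0 on connection failure."""
    results = {}
    for url in extract_links(llms_txt):
        try:
            req = urllib.request.Request(url, method="HEAD")
            with urllib.request.urlopen(req, timeout=5) as resp:
                results[url] = resp.status
        except urllib.error.HTTPError as e:
            results[url] = e.code     # e.g. 404 for a broken link
        except OSError:
            results[url] = 0          # DNS failure, timeout, etc.
    return results
```

Any entry that isn't a 200 is worth investigating: stale links undermine the whole point of the file.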

Using curl for Detailed Checks

# Check the file is accessible and get headers
curl -I https://yourdomain.com/llms.txt

# Expected output:
# HTTP/2 200
# content-type: text/plain; charset=utf-8

# Fetch the full content
curl https://yourdomain.com/llms.txt

# Check with a realistic user agent
curl -H "User-Agent: Mozilla/5.0 (compatible; GPTBot/1.0)" https://yourdomain.com/llms.txt

Community Tools

Several community tools have emerged for checking llms.txt files:

  • llmstxt.fyi — A directory of sites with llms.txt files, useful for seeing how others structure theirs
  • llms-txt.com — Validator and examples from the community
  • Screaming Frog / Sitebulb — Enterprise crawlers that can be configured to check for llms.txt alongside robots.txt and sitemap.xml

Framework-Specific Considerations

Different web frameworks handle static files differently. Here's how to ensure llms.txt is served correctly:

# Next.js: place in /public directory
public/llms.txt  → served at yourdomain.com/llms.txt

# Vercel: no config needed if in /public
# Nginx: add to location blocks if needed
location /llms.txt {
    default_type text/plain;
}

# WordPress: place in root directory alongside wp-config.php
# Django: add to STATICFILES_DIRS or serve via urls.py
# Rails: place in /public directory
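If your framework isn't listed, or it refuses to set the content type correctly, a stdlib-only WSGI app shows the essentials. This is a sketch, not a production setup; the file location is an assumption:

```python
# Serve llms.txt with an explicit text/plain content type.
from pathlib import Path

LLMS_PATH = Path("llms.txt")  # assumed to sit next to this module

def app(environ, start_response):
    if environ.get("PATH_INFO") == "/llms.txt":
        body = LLMS_PATH.read_bytes()
        start_response("200 OK", [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ])
        return [body]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found"]

# To run locally:
#   from wsgiref.simple_server import make_server
#   make_server("", 8000, app).serve_forever()
```

The point is the header: whatever serves the file, the response should be text/plain (or text/markdown), never text/html.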

Real-World llms.txt Examples

Examining how established companies structure their llms.txt files reveals patterns worth following.

Anthropic (anthropic.com/llms.txt)

Anthropic's llms.txt is notably concise — fewer than 30 lines. It links to their documentation, API reference, research papers, and model card. No marketing language. Each entry describes what the linked page contains in specific terms. They include both llms.txt and llms-full.txt. The brevity is instructive: even a company with extensive documentation keeps the summary tight and focused.

Cloudflare (cloudflare.com/llms.txt)

Cloudflare's file is longer, reflecting their broader product surface. They organise by product category: CDN, DNS, Security, Workers, R2, etc. Each section has 3-5 links with descriptions that include specific technical details (e.g. protocol support, geographic coverage). This demonstrates how larger companies with many products can still create a useful llms.txt by grouping logically.

Stripe (stripe.com/llms.txt)

Stripe's llms.txt leans heavily on API documentation, which makes sense for a developer-first product. The blockquote summary mentions payments, billing, and financial infrastructure. Links to API reference pages include specifics like supported payment methods and available SDKs. They also provide llms-full.txt with more comprehensive API documentation.

p0stman.com/llms.txt

Our own llms.txt describes p0stman as an AI product studio and lists key services (prototype to production, AI agents, fractional AI leadership), case studies, and guide pages. We include links to our MCP server endpoint and A2A agent card, demonstrating how the discovery layer connects to the action layer. It's under 50 lines.

Common Patterns Across Good llms.txt Files

  • Blockquote summary is always 1-2 sentences, never more
  • Sections rarely exceed 5-6 links each
  • Technical products emphasise API docs and authentication
  • Content sites emphasise topic categories and key pages
  • Pricing details appear in link descriptions, not just on the pricing page link
  • No file exceeds ~80 lines; most are 30-60 lines

Should Your Site Have an llms.txt? Decision Framework

Not every site needs an llms.txt. Here's a framework for deciding.

High Value: Implement Immediately

  • SaaS products with APIs — AI agents are increasingly tasked with finding and evaluating tools. llms.txt gives them instant context.
  • Developer documentation sites — Coding assistants fetch context about libraries and services. llms.txt helps them understand what's available.
  • B2B service companies — When AI agents help businesses find vendors, a structured summary gives you an edge.
  • Open source projects — Adoption by Anthropic, Vercel, and others shows the developer community values this signal.

Medium Value: Worth Doing

  • E-commerce sites — Shopping agents are emerging. A clear product/category summary helps.
  • Content-heavy sites — Blogs, news sites, educational content. Helps AI systems understand your content landscape.
  • Portfolio and agency sites — When prospects ask AI "find me a web development agency in London", having structured context helps.

Lower Value (But Still Free)

  • Personal blogs — Unless you're writing about niche technical topics that AI systems might reference.
  • Single-page sites — If your entire site is one page, the llms.txt would essentially duplicate it.
  • Internal tools — If the site isn't public-facing, llms.txt serves no external purpose.

The honest recommendation: if you have any doubt, just create one. The downside is essentially zero. Five minutes of work. No dependencies. No maintenance burden. And if the agentic web develops the way current trends suggest, you'll be glad you were early.

llms.txt in the Agentic Web Stack

llms.txt is one file in a broader architecture for making your website agent-ready. Here's how it connects to the other layers:

Layer 1: Discovery (Where llms.txt Lives)

The discovery layer is how AI agents find and initially understand your site. llms.txt is the lightweight entry point. It sits alongside:

  • agents.md — Detailed instructions for interacting with your site
  • context.md — Deep business context and domain knowledge
  • mcp.json — Machine-readable manifest of your MCP tools
  • .well-known/agent.json — A2A protocol agent card
  • robots.txt — Access permissions for crawlers

Layer 2: Comprehension

Once discovered, agents need to understand your content deeply. JSON-LD schema on every page, an /api/ai/context endpoint, and answer capsules on content pages all feed this layer.

Layer 3: Action

The action layer is where agents interact with your site programmatically via MCP servers, WebMCP browser registration, and data-mcp-tool attributes. llms.txt points agents toward these capabilities; the action layer lets them use them.

Layer 4: Agent-to-Agent

The A2A layer enables direct agent-to-agent communication. Your site's AI agent can receive tasks from other agents, process them, and return results. Skills advertised in your agent card define what your agent can do. llms.txt provides the human-readable context; the agent card provides the machine-readable interface.

The full agentic web architecture guide covers all four layers in detail. llms.txt is step one — the simplest, fastest piece to implement, and the one that makes everything else discoverable.

Frequently Asked Questions

What is llms.txt?

llms.txt is a plain-text file placed at the root of your website (e.g. example.com/llms.txt) that provides AI language models with a concise summary of your site's purpose, services, and key content. It was proposed by Jeremy Howard of fast.ai in September 2024 and uses a simplified Markdown format.

What is the difference between llms.txt and robots.txt?

robots.txt tells crawlers what they are allowed to access (permissions). llms.txt tells AI models what your site contains and does (context). They serve completely different purposes and you should have both. robots.txt controls access; llms.txt provides understanding.

Do AI models actually read llms.txt?

No major AI vendor has officially confirmed that their models automatically fetch and read llms.txt during inference. However, tools built on top of LLMs (like AI coding assistants, research agents, and RAG pipelines) do consume llms.txt when available. The file is most useful as context for agentic workflows rather than for the base models themselves.

What is the difference between llms.txt and llms-full.txt?

llms.txt is a concise index with links to deeper content. llms-full.txt contains all the detailed content in a single file, eliminating the need for AI agents to follow links. Use llms.txt for large sites with many pages, and optionally provide llms-full.txt for comprehensive single-file context.

How is llms.txt different from agents.md?

llms.txt is a brief, structured summary of what your site is and offers — typically under 100 lines. agents.md is a detailed instruction document that tells AI agents how to interact with your site, including tool schemas, authentication, rate limits, and example interactions. Think of llms.txt as the elevator pitch and agents.md as the full instruction manual.

What should I include in my llms.txt file?

Include your site name as an H1 heading, a blockquote summary, then sections for key pages, documentation, API endpoints, and optional resources. Each entry is a Markdown link with a brief description. Keep it under 100 lines, factual, and free of marketing language.

How many websites currently have llms.txt?

As of early 2026, adoption is estimated at around 10% of major websites. Notable adopters include Anthropic, Cloudflare, Stripe, and Vercel. The number is growing but llms.txt is far from universal. Implementation takes minutes and costs nothing, so the barrier is awareness rather than effort.

Should my website have an llms.txt file?

If your website offers a product, service, API, or substantial content that you want AI systems to understand and recommend, yes. It takes five minutes to create, costs nothing to maintain, and positions your site for the agentic web. The downside of not having one is greater than the effort of creating one.

Make Your Website Visible to AI

We implement the full agentic web stack — llms.txt, agents.md, MCP servers, A2A endpoints, and structured schema — so AI agents discover, understand, and recommend your business.