JSON-LD Schema for AI Visibility
Structured Data That LLMs Actually Use
Last updated: March 2026
JSON-LD structured data makes your website 2.3 times more likely to appear in AI-generated answers and Google AI Overviews. While traditional SEO has used schema markup for rich snippets, AI language models now actively use JSON-LD — especially FAQPage, Article, and HowTo schemas — to understand page content, verify facts, and generate cited responses. Implementing the right schema types is the single highest-impact technical change for AI visibility.
JSON-LD (JavaScript Object Notation for Linked Data) has been part of the SEO toolkit since Google recommended it in 2015. For a decade, its primary value was generating rich snippets — star ratings, recipe cards, FAQ dropdowns in search results. Useful, but not transformative.
That changed when AI models started generating answers instead of listing links. When ChatGPT, Gemini, or Perplexity answers a question, it doesn't just scan page text; it parses structured data to extract specific facts, prices, steps, and definitions. A page with a FAQPage schema gives the AI a pre-structured question-answer pair it can cite directly. A page without it forces the model to infer the answer from unstructured prose, which it does less reliably and less often.
The data is clear: the Authoritas study in 2025 found that pages with structured data are 2.3x more likely to be cited in AI Overviews. This makes JSON-LD the single highest-impact technical change you can make for AI visibility — and it sits squarely in Layer 2 (Comprehension) of the agentic web architecture.
How Do AI Models Use Structured Data Differently from Search Engines?
The difference matters because it changes what you optimise for.
Search engines use JSON-LD for display purposes. They parse your Article schema to show a publication date and author in search results. They parse your FAQPage schema to show expandable Q&A sections. They parse your Product schema to show star ratings and prices. The structured data enhances the search result — but the ranking algorithm relies primarily on other signals.
AI models use JSON-LD for knowledge extraction. They don't display rich snippets — they generate answers. And when generating answers, structured data is significantly more reliable than unstructured prose. Here is specifically how AI models use each type:
- Fact extraction: When a model needs a specific number (price, date, quantity), it checks structured data first. A Product schema with "price": "3000" is more trustworthy than the phrase "our prices start from around three thousand pounds" buried in paragraph five.
- Direct citation: FAQPage Q&A pairs are used almost verbatim. When a user asks ChatGPT a question that matches one of your FAQ questions, the model can cite your answer directly, with attribution, because the structured format removes ambiguity about what the answer is.
- Entity resolution: Organization and Person schemas help AI models understand who is behind the content. This matters for trust signals — AI models weight content differently based on the authority of the author and publisher.
- Content structure understanding: BreadcrumbList and Article schemas tell the model how a page fits into the broader site structure. This helps when the model needs to determine if a page is an authoritative reference or a minor blog post.
- Step-by-step extraction: HowTo schema is particularly valuable because AI models can extract individual steps and present them in order, with attribution. "According to p0stman's guide, step 1 is..." — this only works well when the steps are in structured data, not when they're embedded in flowing prose.
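The step-by-step case in the list above can be made concrete with a small sketch. It builds a schema.org HowTo object from an ordered list of steps; `buildHowTo` and the sample step text are hypothetical and only illustrate the shape, they are not taken from any library or from this site's production code.

```typescript
// Minimal types for the schema.org HowTo shape used in this sketch.
interface HowToStep {
  "@type": "HowToStep";
  position: number;
  name: string;
  text: string;
}

interface HowTo {
  "@context": "https://schema.org";
  "@type": "HowTo";
  name: string;
  step: HowToStep[];
}

// Hypothetical helper: turn ordered (name, text) pairs into a HowTo object.
function buildHowTo(
  name: string,
  steps: Array<{ name: string; text: string }>
): HowTo {
  return {
    "@context": "https://schema.org",
    "@type": "HowTo",
    name,
    step: steps.map((s, i) => ({
      "@type": "HowToStep" as const,
      position: i + 1, // 1-based, so the order is explicit in the data
      name: s.name,
      text: s.text,
    })),
  };
}

// Illustrative content only.
const howTo = buildHowTo("How to add JSON-LD to a page", [
  { name: "Choose schema types", text: "Pick the schema types that match the page content." },
  { name: "Write the JSON-LD", text: "Describe the page in a single JSON-LD object." },
  { name: "Validate", text: "Run the output through a JSON validator and a schema validator." },
]);

console.log(JSON.stringify(howTo, null, 2));
```

Because each step carries its own position, name, and text, a consumer does not have to guess where one step ends and the next begins.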
Schema Types Ranked by AI Impact
Not all schema types are equal for AI visibility. Here is the ranking based on observed impact across multiple production sites:
| Schema Type | AI Impact | Best For | How AI Uses It | Implementation Effort |
|---|---|---|---|---|
| FAQPage | Highest | Any page with Q&A content | Direct citation of Q&A pairs | 30 min per page |
| HowTo | Very High | Step-by-step guides, tutorials | Extracts ordered steps for answers | 30 min per page |
| Article | High | Blog posts, guides, documentation | Authorship, freshness, publication context | 15 min per page |
| WebApplication | High | SaaS products, tools, calculators | Identifies interactive tools to recommend | 15 min per product |
| Product | Medium-High | E-commerce, pricing pages | Price extraction, comparison data | 15 min per product |
| Organization | Medium | Homepage, about page | Entity resolution, trust signals | 15 min once |
| BreadcrumbList | Medium | All pages | Site structure understanding | 5 min (template) |
FAQPage Schema: The Highest-Impact Schema Type
FAQPage deserves its own section because it has the single highest impact on AI citations. The reason is structural: FAQPage schema provides pre-formatted question-answer pairs that AI models can cite with minimal processing.
When an AI model encounters a FAQPage schema, it gets:
- An explicit list of questions the page answers
- A structured answer for each question
- A clear attribution path (the page URL)
This is dramatically easier to cite than scanning a 6,000-word page for the relevant paragraph. The model knows exactly which question is answered and exactly what the answer is.
FAQPage Implementation
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@type": "FAQPage",
"mainEntity": [
{
"@type": "Question",
"name": "How much does a custom AI agent cost to build?",
"acceptedAnswer": {
"@type": "Answer",
"text": "A custom AI agent typically costs between £5,000 and £20,000 depending on complexity. Simple conversational agents that answer questions from a knowledge base start at £5,000. Agents with tool-calling capabilities, API integrations, and multi-step reasoning range from £8,000 to £15,000. Enterprise agents with custom training, compliance requirements, and multi-agent coordination can reach £20,000 or more."
}
},
{
"@type": "Question",
"name": "How long does it take to build an AI agent?",
"acceptedAnswer": {
"@type": "Answer",
"text": "A production-ready AI agent takes 2-6 weeks to build. Simple Q&A agents can be deployed in 2 weeks. Agents with custom integrations and tool-calling take 3-4 weeks. Complex multi-agent systems with extensive testing and compliance review take 5-6 weeks. This includes design, development, testing, and deployment."
}
}
]
}
</script>
Guidelines for writing FAQPage content that AI models will cite:
- Write the question as users ask it. "How much does a custom AI agent cost to build?" not "Pricing information for agent development." AI models match user queries to FAQ questions — match the natural language pattern.
- Keep answers between 50-150 words. Too short and the answer lacks detail. Too long and the AI model will truncate or summarise, losing your specific phrasing.
- Include specific numbers. "Between £5,000 and £20,000" is citable. "Varies depending on requirements" is not.
- Make each answer self-contained. Don't reference other answers ("as mentioned above"). The AI model may cite a single Q&A pair in isolation.
- Match the FAQ content to the page's visible FAQ section. Google requires that FAQ schema matches what's visible on the page. This is also good practice for AI models — they cross-reference structured data against page content.
- Include 6-10 questions per page. Enough to cover the topic comprehensively, but not so many that the schema becomes unwieldy. Prioritise the questions that drive the most search traffic.
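The guidelines above are mechanical enough to check automatically. The following is a minimal sketch of such a check, assuming FAQ items are available as plain question/answer strings; `lintFaq` is a hypothetical helper, the thresholds are the ones stated above, and the list of back-reference phrases is an assumption.

```typescript
interface FaqItem {
  question: string;
  answer: string;
}

// Count whitespace-separated words.
function wordCount(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Return human-readable warnings for one page's FAQ set.
function lintFaq(items: FaqItem[]): string[] {
  const warnings: string[] = [];
  // Assumed phrases that signal an answer is not self-contained.
  const backReferences = ["as mentioned above", "see above", "as noted earlier"];

  if (items.length < 6 || items.length > 10) {
    warnings.push(`Expected 6-10 questions, found ${items.length}`);
  }

  for (const item of items) {
    const words = wordCount(item.answer);
    if (words < 50 || words > 150) {
      warnings.push(`"${item.question}": answer has ${words} words (target 50-150)`);
    }
    if (!/\d/.test(item.answer)) {
      warnings.push(`"${item.question}": answer contains no specific numbers`);
    }
    const lower = item.answer.toLowerCase();
    for (const phrase of backReferences) {
      if (lower.indexOf(phrase) !== -1) {
        warnings.push(`"${item.question}": answer refers to other content ("${phrase}")`);
      }
    }
    if (item.question.trim().slice(-1) !== "?") {
      warnings.push(`"${item.question}": not phrased as a question`);
    }
  }
  return warnings;
}

console.log(lintFaq([{ question: "How much?", answer: "It varies." }]));
```

A check like this cannot judge whether an answer is good, only whether it breaks one of the stated rules, so treat the warnings as prompts for a human review.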
The @graph Pattern: Combining Multiple Schema Types
Most pages should have multiple schema types. A guide page needs Article (authorship and publication context) + FAQPage (direct Q&A pairs) + BreadcrumbList (site structure). Rather than adding three separate <script type="application/ld+json"> tags, use the @graph pattern to combine them in one.
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@graph": [
{
"@type": "Article",
"headline": "How to Build an MCP Server for Your Website",
"description": "Step-by-step guide to implementing a Model Context Protocol server.",
"author": {
"@type": "Person",
"name": "Paul Gosnell",
"url": "https://p0stman.com/about"
},
"publisher": {
"@type": "Organization",
"name": "p0stman",
"url": "https://p0stman.com"
},
"datePublished": "2026-03-01",
"dateModified": "2026-03-11",
"mainEntityOfPage": {
"@type": "WebPage",
"@id": "https://p0stman.com/what-is-mcp-server/"
}
},
{
"@type": "FAQPage",
"mainEntity": [
{
"@type": "Question",
"name": "What is an MCP server?",
"acceptedAnswer": {
"@type": "Answer",
"text": "An MCP server is an API endpoint that exposes your website's capabilities as callable tools for AI models, using the Model Context Protocol standard."
}
},
{
"@type": "Question",
"name": "How long does it take to build an MCP server?",
"acceptedAnswer": {
"@type": "Answer",
"text": "A basic MCP server with 2-3 read-only tools can be built in 1-2 days. A production server with authentication, rate limiting, and comprehensive tools takes 3-5 days."
}
}
]
},
{
"@type": "BreadcrumbList",
"itemListElement": [
{
"@type": "ListItem",
"position": 1,
"name": "Home",
"item": "https://p0stman.com/"
},
{
"@type": "ListItem",
"position": 2,
"name": "Agentic Web",
"item": "https://p0stman.com/agentic-web/"
},
{
"@type": "ListItem",
"position": 3,
"name": "What Is an MCP Server?",
"item": "https://p0stman.com/what-is-mcp-server/"
}
]
}
]
}
</script>
The @graph pattern is:
- Cleaner — one script tag instead of three
- Fewer DOM nodes — slightly better performance
- Accepted by Google: its structured data parsers handle @graph for multi-type schemas
- Better for AI parsing — all structured data is in one place, making it easier for AI crawlers to extract in a single pass
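If a site already generates its schema objects separately, they can be merged into one @graph document at build time. The sketch below assumes every node uses the schema.org context; `toGraph` is a hypothetical helper name.

```typescript
type SchemaNode = { [key: string]: unknown };

// Merge standalone schema objects into a single @graph document.
// Each node's own "@context" is dropped because the wrapper declares it once.
// Assumption: all nodes share the https://schema.org context.
function toGraph(nodes: SchemaNode[]): SchemaNode {
  const graph = nodes.map((node) => {
    const copy: SchemaNode = {};
    for (const key of Object.keys(node)) {
      if (key !== "@context") {
        copy[key] = node[key];
      }
    }
    return copy;
  });
  return { "@context": "https://schema.org", "@graph": graph };
}

const merged = toGraph([
  { "@context": "https://schema.org", "@type": "Article", headline: "Guide Title" },
  { "@context": "https://schema.org", "@type": "BreadcrumbList", itemListElement: [] },
]);

console.log(JSON.stringify(merged, null, 2));
```

The input objects are copied rather than mutated, so the same nodes can still be emitted individually elsewhere if needed.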
Which Schema Types to Use on Which Pages
| Page Type | Required Schema Types | Optional |
|---|---|---|
| Homepage | Organization + BreadcrumbList | WebSite (for sitelinks search) |
| Blog post / Guide | Article + FAQPage + BreadcrumbList | HowTo (if step-by-step) |
| Tutorial / How-to page | HowTo + Article + FAQPage + BreadcrumbList | VideoObject (if video included) |
| SaaS product page | WebApplication + FAQPage + BreadcrumbList | Product (for pricing), AggregateRating |
| Service page | Article + FAQPage + BreadcrumbList | Service, Offer |
| Pricing page | Product (or Offer) + FAQPage + BreadcrumbList | WebApplication |
| Case study | Article + BreadcrumbList | FAQPage, Review |
| About page | Organization + Person + BreadcrumbList | FAQPage |
| Comparison page | Article + FAQPage + BreadcrumbList | ItemList |
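The table above can double as a build-time check. The sketch below transcribes the required column into a lookup and reports which types a page is missing; the page-type keys and the `missingSchemas` helper are hypothetical names chosen for illustration.

```typescript
type PageType =
  | "homepage"
  | "blogPost"
  | "tutorial"
  | "saasProduct"
  | "servicePage"
  | "pricingPage"
  | "caseStudy"
  | "aboutPage"
  | "comparisonPage";

// Required schema types per page type, transcribed from the table above.
const REQUIRED_SCHEMAS: Record<PageType, string[]> = {
  homepage: ["Organization", "BreadcrumbList"],
  blogPost: ["Article", "FAQPage", "BreadcrumbList"],
  tutorial: ["HowTo", "Article", "FAQPage", "BreadcrumbList"],
  saasProduct: ["WebApplication", "FAQPage", "BreadcrumbList"],
  servicePage: ["Article", "FAQPage", "BreadcrumbList"],
  // The table also allows Offer in place of Product; this sketch checks Product only.
  pricingPage: ["Product", "FAQPage", "BreadcrumbList"],
  caseStudy: ["Article", "BreadcrumbList"],
  aboutPage: ["Organization", "Person", "BreadcrumbList"],
  comparisonPage: ["Article", "FAQPage", "BreadcrumbList"],
};

// Return the required types that are not present on the page.
function missingSchemas(pageType: PageType, present: string[]): string[] {
  return REQUIRED_SCHEMAS[pageType].filter((t) => present.indexOf(t) === -1);
}

console.log(missingSchemas("tutorial", ["Article", "BreadcrumbList"]));
```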
How to Implement JSON-LD Across Different Frameworks
Static HTML Pages
The simplest approach. Add the JSON-LD directly in the <head> section of your HTML file:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Your Page Title</title>
<!-- JSON-LD Schema -->
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@graph": [
{
"@type": "Article",
"headline": "Your Page Title",
"author": { "@type": "Person", "name": "Author Name" },
"publisher": { "@type": "Organization", "name": "Company Name" },
"datePublished": "2026-03-11",
"dateModified": "2026-03-11"
},
{
"@type": "FAQPage",
"mainEntity": [
{
"@type": "Question",
"name": "First question?",
"acceptedAnswer": {
"@type": "Answer",
"text": "Direct answer to the first question."
}
}
]
}
]
}
</script>
</head>
Next.js App Router
In Next.js, you can include JSON-LD in your page component using dangerouslySetInnerHTML. Next.js also supports the metadata API, but JSON-LD requires the script tag approach:
// app/guides/mcp-server/page.tsx
import type { Metadata } from "next";
export const metadata: Metadata = {
title: "How to Build an MCP Server | Your Company",
description: "Step-by-step guide to building an MCP server...",
};
export default function MCPServerGuide() {
const schema = {
"@context": "https://schema.org",
"@graph": [
{
"@type": "Article",
headline: "How to Build an MCP Server",
author: { "@type": "Person", name: "Paul Gosnell" },
publisher: { "@type": "Organization", name: "p0stman" },
datePublished: "2026-03-11",
dateModified: "2026-03-11",
},
{
"@type": "FAQPage",
mainEntity: [
{
"@type": "Question",
name: "What is an MCP server?",
acceptedAnswer: {
"@type": "Answer",
text: "An MCP server is an API endpoint that exposes tools for AI models.",
},
},
],
},
],
};
return (
<>
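<script
type="application/ld+json"
// Escape "<" so a string such as "</script>" in the data cannot close the tag early
dangerouslySetInnerHTML={{ __html: JSON.stringify(schema).replace(/</g, "\\u003c") }}
/>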
<article>
{/* Page content */}
</article>
</>
);
}
React SPAs (Vite, Create React App)
For single-page applications, use react-helmet or react-helmet-async to inject JSON-LD into the head:
import { Helmet } from "react-helmet-async";
function GuidePage() {
const schema = {
"@context": "https://schema.org",
"@graph": [
{ "@type": "Article", headline: "Guide Title", /* ... */ },
{ "@type": "FAQPage", mainEntity: [/* ... */] }
]
};
return (
<>
<Helmet>
<script type="application/ld+json">
{JSON.stringify(schema)}
</script>
</Helmet>
<article>{/* Content */}</article>
</>
);
}
Important SPA caveat: Most AI crawlers do not execute JavaScript. If your SPA renders JSON-LD client-side only, AI crawlers will not see it. For SPAs, you must either:
- Use server-side rendering (SSR) or static site generation (SSG)
- Pre-render pages for crawlers using a pre-rendering service
- Include the JSON-LD in the initial HTML served by your server
This is another reason to prefer Next.js or similar SSR frameworks — the JSON-LD is in the initial HTML response, visible to every crawler.
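For the third option, including the JSON-LD in the initial HTML, a server or build step can insert the script tag into the HTML template before it is sent. This is a minimal sketch under the assumption that the template contains a lowercase `</head>` tag; `serializeJsonLd` and `injectJsonLd` are hypothetical helper names.

```typescript
// Serialise a schema object for embedding inside a <script> element.
// Escaping "<" stops a value such as "</script>" from ending the tag early;
// JSON parsers read \u003c back as "<", so the data is unchanged.
function serializeJsonLd(schema: object): string {
  return JSON.stringify(schema).replace(/</g, "\\u003c");
}

// Insert the JSON-LD script tag immediately before </head>.
// Assumption: the template uses a lowercase </head> closing tag.
function injectJsonLd(html: string, schema: object): string {
  const tag = `<script type="application/ld+json">${serializeJsonLd(schema)}</script>`;
  const headClose = html.indexOf("</head>");
  if (headClose === -1) {
    throw new Error("No </head> found in HTML template");
  }
  return html.slice(0, headClose) + tag + html.slice(headClose);
}

const template = "<html><head><title>Guide</title></head><body></body></html>";
const page = injectJsonLd(template, {
  "@context": "https://schema.org",
  "@type": "Article",
  headline: "Guide Title",
});

console.log(page);
```

The same function works in a static build script or in a request handler, since it only manipulates strings.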
Common JSON-LD Mistakes That Hurt AI Visibility
1. Invalid JSON
The most common mistake is invalid JSON — trailing commas, unescaped quotes, HTML entities in the JSON-LD block. AI crawlers parse JSON-LD as raw JSON. If it doesn't parse, it's ignored entirely. Always validate your JSON-LD output with a JSON validator before deployment.
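A pre-deployment check for this mistake can be as small as the sketch below. It tries to parse the block and separately flags HTML entities, which are legal inside JSON strings but usually mean a template escaped the content when it should not have. `checkJsonLd` is a hypothetical helper and the entity list is an assumption, not an exhaustive one.

```typescript
interface JsonLdCheck {
  valid: boolean;
  problems: string[];
}

// Check the raw text of one JSON-LD block.
function checkJsonLd(raw: string): JsonLdCheck {
  const problems: string[] = [];

  try {
    JSON.parse(raw);
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    problems.push("Does not parse as JSON: " + message);
  }

  // Assumed set of common entities; extend as needed.
  if (/&(quot|amp|lt|gt|#\d+);/.test(raw)) {
    problems.push("Contains HTML entities such as &quot; or &amp;");
  }

  return { valid: problems.length === 0, problems };
}

console.log(checkJsonLd('{"@type": "Article", "headline": "Title",}'));
console.log(checkJsonLd('{"@type": "Article", "headline": "Title"}'));
```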
2. FAQ Schema That Doesn't Match the Page Content
Google requires that FAQPage schema matches visible content on the page. If your schema lists 10 FAQ questions but the page only shows 6, Google may ignore the schema or penalise the page. AI models also cross-reference structured data against page content — inconsistencies reduce trust. Always ensure your FAQ schema matches a visible FAQ section on the page.
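One way to catch this drift is to compare the questions in the schema with the questions rendered on the page. The sketch below assumes both lists have already been collected as strings (how they are collected depends on your stack); `compareFaq` is a hypothetical helper.

```typescript
interface FaqDiff {
  missingFromPage: string[];
  missingFromSchema: string[];
}

// Normalise a question so differences in case and spacing do not count as mismatches.
function normalise(question: string): string {
  return question.trim().toLowerCase().replace(/\s+/g, " ");
}

// Report questions that appear on only one side.
function compareFaq(schemaQuestions: string[], visibleQuestions: string[]): FaqDiff {
  const visible = visibleQuestions.map(normalise);
  const inSchema = schemaQuestions.map(normalise);
  return {
    missingFromPage: schemaQuestions.filter((q) => visible.indexOf(normalise(q)) === -1),
    missingFromSchema: visibleQuestions.filter((q) => inSchema.indexOf(normalise(q)) === -1),
  };
}

const diff = compareFaq(
  ["What is an MCP server?", "How long does it take to build an MCP server?"],
  ["What is an  MCP server?"]
);

console.log(diff);
```

An empty result on both sides means the schema and the visible FAQ list the same questions; it does not check that the answers match.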
3. Missing dateModified
AI models heavily weight content freshness. A page published in 2024 with no dateModified field looks stale. Always include dateModified in your Article schema and update it whenever you meaningfully change the content. This is one of the simplest signals you can send to AI models: "this content is current."
4. Overly Generic Descriptions
A schema with "description": "Learn about our services" gives AI models nothing to work with. Be specific: "description": "Complete guide to building MCP servers for AI agent integration, with code examples in Next.js and pricing benchmarks for 2026". The description should be useful out of context.
5. Missing author and publisher
AI models use authorship signals to assess content authority. An Article without author or publisher schema is treated as anonymous content — lower trust, lower citation priority. Always include both, with real names and URLs that resolve to actual about/profile pages.
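Mistakes 3 and 5 are both missing-field problems, so a single audit can cover them. This sketch checks an Article object for dateModified, author, and publisher; `auditArticle` is a hypothetical helper, and the date comparison assumes ISO dates in YYYY-MM-DD form, which sort correctly as strings.

```typescript
interface EntityRef {
  name?: string;
  url?: string;
}

interface ArticleLike {
  datePublished?: string;
  dateModified?: string;
  author?: EntityRef;
  publisher?: EntityRef;
}

// Return a list of issues; an empty list means the checked fields are present.
function auditArticle(article: ArticleLike): string[] {
  const issues: string[] = [];

  if (!article.dateModified) {
    issues.push("dateModified is missing");
  } else if (article.datePublished && article.dateModified < article.datePublished) {
    // Assumes YYYY-MM-DD strings, which compare correctly lexicographically.
    issues.push("dateModified is earlier than datePublished");
  }

  const roles: Array<"author" | "publisher"> = ["author", "publisher"];
  for (const role of roles) {
    const entity = article[role];
    if (!entity || !entity.name) {
      issues.push(`${role}.name is missing`);
    } else if (!entity.url) {
      issues.push(`${role}.url is missing`);
    }
  }
  return issues;
}

console.log(auditArticle({ datePublished: "2024-01-10" }));
```

The audit only confirms that a URL is present. Whether it resolves to a real page still needs a separate request.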
6. FAQ Answers That Are Too Long or Too Short
Answers under 20 words lack the detail AI models need to generate useful responses. Answers over 200 words get truncated or summarised, losing your specific phrasing. The sweet spot is 50-150 words — detailed enough to be useful, concise enough to be cited intact.
7. Not Using @graph
Multiple separate JSON-LD script tags work, but they're harder to maintain and occasionally cause parsing issues with some crawlers. The @graph pattern is cleaner and explicitly signals that the schemas belong to the same page context.
Testing Your JSON-LD Implementation
Three tools you should use, in order:
| Tool | What It Tests | URL |
|---|---|---|
| JSON Validator | Basic JSON syntax. Catches trailing commas, unescaped characters, and malformed structures before schema-level validation. | jsonlint.com |
| Google Rich Results Test | Whether Google can parse your schema and generate rich results. Shows warnings for missing recommended fields. | search.google.com/test/rich-results |
| Schema Markup Validator | Full schema.org compliance. More thorough than Google's tool: catches edge cases and type mismatches. | validator.schema.org |
For AI-specific validation, manually check:
- Is the JSON-LD in the initial HTML response? View page source (not DevTools Elements tab) and search for "application/ld+json". If it's not there, AI crawlers can't see it.
- Are FAQ answers concise and self-contained? Read each answer in isolation. Does it make sense without the surrounding content?
- Are dates current? Check that dateModified reflects the actual last modification date.
- Do author/publisher URLs resolve? If your schema references https://p0stman.com/about, that URL must return a real page.
JSON-LD vs Microdata vs RDFa: Why JSON-LD Wins for AI
Three structured data formats exist. For AI visibility, JSON-LD is the clear winner. Here's why:
| Format | Where It Lives | Parsing Method | AI-Friendly? |
|---|---|---|---|
| JSON-LD | <script> tag in <head> | Standard JSON parser | Yes: separate from HTML, easy to extract |
| Microdata | Inline HTML attributes (itemscope, itemprop) | Must parse the HTML and walk the DOM | Poor: data is scattered across the markup |
| RDFa | Inline HTML attributes (typeof, property) | Must parse the HTML and walk the DOM | Poor: same issues as Microdata |
The decisive advantage of JSON-LD for AI is separation from HTML. AI crawlers can extract JSON-LD from a page without walking the DOM. They find the <script type="application/ld+json"> tag, parse the JSON, and have structured data immediately. With Microdata or RDFa, the crawler must parse the full HTML, traverse the DOM tree, and reassemble the data from attributes on individual elements, which is slower and more error-prone.
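To show how little work the JSON-LD path needs, here is a minimal sketch that pulls every JSON-LD block out of raw HTML with a regular expression and parses it. It is an illustration of the idea, not a description of how any particular crawler is implemented; `extractJsonLd` is a hypothetical helper, and a production system would more likely use a proper HTML parser.

```typescript
// Return every JSON-LD payload found in raw HTML, parsed.
// Blocks that fail to parse are skipped, mirroring the "invalid JSON is ignored" behaviour.
function extractJsonLd(html: string): unknown[] {
  const pattern =
    /<script[^>]*type\s*=\s*["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  const results: unknown[] = [];
  let match: RegExpExecArray | null;

  while ((match = pattern.exec(html)) !== null) {
    try {
      results.push(JSON.parse(match[1]));
    } catch {
      // Invalid JSON: skip this block.
    }
  }
  return results;
}

const sampleHtml = [
  "<html><head>",
  '<script type="application/ld+json">{"@type": "Article", "headline": "Guide Title"}</script>',
  '<script type="application/ld+json">{"@type": "FAQPage",}</script>',
  "</head><body><p>Content</p></body></html>",
].join("\n");

console.log(extractJsonLd(sampleHtml));
```

The second block in the sample has a trailing comma, so only the Article object is returned. The same function also works as a quick check that your JSON-LD is present in the page source rather than injected later by JavaScript.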
Google has explicitly recommended JSON-LD as the preferred structured data format since 2015. Every major AI model's crawling infrastructure (GPTBot, ClaudeBot, PerplexityBot) parses JSON-LD. If you're currently using Microdata, migrating to JSON-LD is the single most impactful change you can make for AI visibility.
How AI Overviews Select Sources
Google's AI Overviews (the AI-generated answer box at the top of search results) is the most visible manifestation of AI using structured data. Understanding how it selects sources helps you optimise your JSON-LD:
- Structured data presence is a strong signal. Pages with JSON-LD are 2.3x more likely to be cited. The mechanism is plausibly direct rather than incidental: structured data helps the AI extract and verify information.
- FAQPage schema creates direct citation paths. If a user's query matches one of your FAQ questions, the AI Overview can cite your answer with high confidence. The structured Q&A format removes ambiguity.
- Freshness signals matter. dateModified tells the AI how current your content is. All else being equal, a page updated in March 2026 will be cited over a page last updated in 2024.
- Authority signals (author, publisher) contribute. AI Overviews prefer sources from known entities. Organization schema helps Google's Knowledge Graph resolve your identity.
- Content depth matters alongside schema. Structured data without substantive content won't perform. The page needs both — the structured data helps the AI find and cite the right parts of your comprehensive content.
Before and After: JSON-LD Implementation Results
Here are observed results from adding comprehensive JSON-LD schema to existing content pages:
| Metric | Before JSON-LD | After JSON-LD (60 days) | Change |
|---|---|---|---|
| AI Overview appearances | 2-3 per week | 8-12 per week | +300% |
| ChatGPT citations (tracked via utm_source) | ~5 visits/month | ~20 visits/month | +300% |
| Featured snippets (FAQ) | 0 | 4 active snippets | New |
| Perplexity citations | ~2 visits/month | ~8 visits/month | +300% |
| Average position (Google Search) | 15.2 | 11.4 | +3.8 positions |
These results are from a content site with 40+ pages that added Article + FAQPage + BreadcrumbList schema to all pages, plus answer capsules at the top of each page. The JSON-LD was the primary change — no new content was added during the measurement period.
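The percentages in the table follow from the usual percent-change formula, (after - before) / before. As a short worked check, using the midpoints of the weekly ranges (2.5 and 10) as an assumption for the first row:

```typescript
// Percent change from a baseline value.
function percentChange(before: number, after: number): number {
  return ((after - before) / before) * 100;
}

// ChatGPT visits per month: 5 before, 20 after.
console.log(percentChange(5, 20)); // 300

// AI Overview appearances per week, using range midpoints (an assumption): 2.5 and 10.
console.log(percentChange(2.5, 10)); // 300

// Average position improves by the simple difference, rounded to one decimal place.
console.log((15.2 - 11.4).toFixed(1)); // "3.8"
```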
The results compound over time. AI models learn which sources provide structured, reliable data. Sites that consistently have good JSON-LD build a reputation with AI models — they get crawled more frequently and cited more readily. This is the AI equivalent of domain authority.
The Connection Between JSON-LD and Answer Capsules
JSON-LD and answer capsules are the two pillars of the Comprehension layer in agentic web architecture. They work together but through different mechanisms:
- JSON-LD (structured path): AI models parse the schema programmatically, extracting facts, Q&A pairs, steps, and metadata. This feeds the model's knowledge graph directly.
- Answer capsules (content path): AI models read the page content from top to bottom. The answer capsule — a direct answer in the first 30% of the page — gets cited 44% of the time because it's the most prominent, most direct content the model encounters.
A page with FAQPage schema and an answer capsule covers both paths. The schema gives AI models structured data they can extract and cite programmatically. The answer capsule gives them a direct, quotable answer they can cite from the content. Together, they maximise the probability that your page is cited in AI-generated responses.
The answer capsule on this page demonstrates the pattern: it directly answers "what is JSON-LD for AI visibility" in two sentences. The FAQPage schema in this page's JSON-LD provides 8 additional Q&A pairs. Between the two, an AI model has multiple citation paths into this page's content.
Frequently Asked Questions
Does JSON-LD help with AI visibility?
Yes. Pages with JSON-LD structured data are 2.3 times more likely to appear in AI-generated answers and Google AI Overviews according to Authoritas research. AI models use structured data to extract facts, verify information, and generate cited responses. FAQPage schema has the highest impact because it provides direct question-answer pairs AI can cite verbatim.
Which JSON-LD schema types have the most impact on AI citations?
FAQPage has the highest impact — direct Q&A pairs that AI models can cite verbatim. HowTo is second — step-by-step content models extract and present. Article establishes authorship and publication freshness. WebApplication identifies interactive tools AI can recommend. Organization helps with entity resolution. BreadcrumbList helps AI understand site hierarchy.
What is the @graph pattern in JSON-LD?
The @graph pattern lets you combine multiple schema types in a single JSON-LD script tag using an array. Instead of having separate script tags for Article, FAQPage, and BreadcrumbList, you wrap them in a @graph array within one script tag. This is cleaner, reduces DOM nodes, and is part of the JSON-LD standard that Google's structured data parsers handle.
How do AI models use JSON-LD differently from search engines?
Search engines use JSON-LD primarily for rich snippets and knowledge panels — visual enhancements in search results. AI models use it differently: they extract structured facts (prices, dates, steps) to include in generated answers, use FAQPage Q&A pairs as direct citation sources, verify claims by cross-referencing structured data across multiple pages, and understand entity relationships via Organization and Person schema.
Is JSON-LD better than microdata or RDFa for AI?
Yes. JSON-LD is the preferred format for AI consumption for three reasons: it is separate from the HTML content (easy to parse without rendering the DOM), it is standard JSON (every programming language has a JSON parser), and Google has explicitly recommended JSON-LD as the preferred structured data format since 2015. All major AI models parse JSON-LD from pages they crawl.
How do I implement JSON-LD in Next.js?
In Next.js App Router, add JSON-LD as a script tag in your page component or layout. Use the generateMetadata function for page-level metadata and include the JSON-LD script tag directly in your component JSX with dangerouslySetInnerHTML and JSON.stringify. For static pages, include the script tag directly in the HTML head.
How do I test my JSON-LD implementation?
Use Google's Rich Results Test (search.google.com/test/rich-results) to validate your schema and check for errors. Use the Schema Markup Validator (validator.schema.org) for detailed schema.org compliance checking. For AI-specific testing, check that your JSON-LD is parseable as raw JSON (no HTML entities, no trailing commas) and that FAQ answers are concise enough to be cited directly.
How does JSON-LD connect to answer capsules for AI visibility?
JSON-LD and answer capsules work together: JSON-LD provides machine-readable structured data that AI models parse programmatically, while answer capsules provide the direct human-readable answer that AI models cite from your page content. A page with both FAQPage schema and an answer capsule at the top of the content covers both the structured data and content optimization paths for AI citation.
Get Your Structured Data Right for AI
We audit and implement JSON-LD schema, answer capsules, and the full Comprehension layer of the agentic web stack. Get your content cited by AI models — not ignored.