Structured Data AI Pickup Validator
Syntax check is table stakes. Foglift grades each schema for whether ChatGPT, Claude, and Perplexity will actually pick it up: unnamed nested entities, over-nesting, weak entity disambiguation, sparse citation metadata. The pitfalls Google's Rich Results Test silently passes.
What this tool actually checks
Standard structured data testers (including Google's Rich Results Test) answer one question: is the JSON-LD syntactically valid, and does it have the fields needed for Google rich results? That is necessary, not sufficient. AI engines tokenize JSON-LD differently. They can ingest schema that Google flags and skip schema that Google passes. This tool layers an AI Pickup Score on top of the syntax check, so you see both: the legacy SEO verdict and the AI ingestion verdict.
The 5 AI Pickup dimensions
Identity
Does the schema have a name or headline AI engines can quote in citations? Sounds obvious. We see it missing constantly on Article schemas where the developer set the headline only in page metadata.
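A minimal passing shape puts the headline in the JSON-LD itself, not just in the page's `<title>` tag. The headline and URL below are placeholders:

```json
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "How AI Engines Parse JSON-LD",
  "url": "https://example.com/how-ai-engines-parse-json-ld"
}
```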
Entity disambiguation
url plus sameAs (Wikipedia, Wikidata, social URLs) plus a stable @id. AI engines need to reconcile this entity to known knowledge. Without it, you are a string, not an entity.
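A sketch of a well-disambiguated Organization, assuming your site has Wikipedia, Wikidata, and social profiles to point at (all URLs and the Wikidata Q-id below are placeholders):

```json
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "@id": "https://example.com/#organization",
  "name": "Example Corp",
  "url": "https://example.com/",
  "sameAs": [
    "https://en.wikipedia.org/wiki/Example_Corp",
    "https://www.wikidata.org/wiki/Q0000000",
    "https://www.linkedin.com/company/example-corp"
  ]
}
```

The stable `@id` also lets other schemas on the site reference this entity instead of redefining it.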
Freshness (content types)
Recent dateModified or datePublished. Zyppy / Digital Bloom IQ, 2025: content updated within 30 days gets 3.2x more AI citations. AI engines treat stale content as less citable.
Nested-entity hygiene
The pattern most testers miss. Nested Person, Organization, Brand, Product, Place, and LocalBusiness entities need name or @id. {"@type": "Person"} alone is invisible to AI engines.
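The fix is a one-line addition: a name (or a resolvable @id) makes the nested entity extractable. Names and URLs below are placeholders:

```json
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Example Headline",
  "author": {
    "@type": "Person",
    "@id": "https://example.com/authors/jane#person",
    "name": "Jane Author"
  }
}
```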
Citation/content richness
Type-aware: Article gets points for citation, mentions, and about. FAQ gets points for 3+ Q&A pairs. Product gets points for aggregateRating, review, offers, brand. Organization gets points for description, sameAs, logo, contact.
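As one type-aware example, a Product shape carrying the trust fields listed above (product name, brand, and numbers are placeholders):

```json
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Example Widget",
  "brand": { "@type": "Brand", "name": "ExampleBrand" },
  "aggregateRating": {
    "@type": "AggregateRating",
    "ratingValue": "4.6",
    "reviewCount": "128"
  },
  "offers": {
    "@type": "Offer",
    "price": "49.00",
    "priceCurrency": "USD",
    "availability": "https://schema.org/InStock"
  }
}
```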
Risk flags (separate)
Beyond the score: deeply nested schemas (depth >5), over-stuffed keyword lists (>20), descriptions longer than 600 chars (truncated in citation panels), string-only authors. These do not subtract from the score but are surfaced as warnings.
How AI engines actually use structured data
ChatGPT, Claude, Perplexity, and Google AI Overviews each parse JSON-LD on ingestion. The parsing is lossy. They look for a small set of high-signal patterns:
- Entity reconciliation. sameAs pointing to Wikipedia or Wikidata is the strongest signal. It connects your schema to the AI's training-time knowledge graph.
- Q&A extraction. FAQPage schemas are the single highest-cited type in AI answer panels because they pre-format question-answer pairs that match the prompt-response shape.
- Trust signals. Product schemas with aggregateRating and reviewCount get surfaced in comparison answers. Without them, you are not in the comparison.
- Authority chains. Article.citation referencing CreativeWork or ScholarlyArticle gives AI engines a verifiable source path. Most blogs ignore this field. AI engines reward it.
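To illustrate the Q&A shape those engines extract, a minimal FAQPage with two pairs (questions and answers below are placeholder content):

```json
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "What is structured data?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Machine-readable markup, usually JSON-LD, that describes the entities on a page."
      }
    },
    {
      "@type": "Question",
      "name": "Why does it matter for AI engines?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Pre-formatted question-answer pairs match the prompt-response shape of AI answers."
      }
    }
  ]
}
```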
Frequently Asked Questions
How is this different from Google's Rich Results Test?
Google's tester answers a binary question: will Google render rich results from this schema? Foglift answers a different question: will AI engines pick up this schema when they crawl the page? The two checks overlap on syntax (valid JSON, required fields present), but diverge on what they reward. Google's tester does not flag unnamed nested entities, over-nesting, or sparse citation metadata. Foglift does, because those are the patterns that AI engines silently deprioritize during ingestion. Use both. Google's tester is necessary for traditional rich results. Foglift's is necessary for AI citations.
What does the AI Pickup Score actually measure?
Five dimensions, 20 points each. Identity (does the schema have a name AI can quote). Entity disambiguation (url + sameAs + @id, so AI can reconcile to known knowledge). Freshness, for content types (recent dateModified). Nested-entity hygiene (do nested Person, Organization, Brand entities have name or @id). Citation richness (type-aware: Article expects citation/mentions/about; FAQ expects 3+ Q&As; Product expects rating/review/offers/brand; Organization expects description/sameAs/logo/contact). The score is local to this tool, not the same as the Foglift Website Audit's AI Readiness Score, which evaluates the whole page across additional signals.
Why is over-nesting a problem if the JSON is valid?
AI engines do not parse JSON-LD the same way Google's structured data parser does. Schemas nested more than five levels deep get partially extracted: deeper branches are dropped or summarized. The same applies to over-stuffed fields: keywords lists with more than 20 entries, descriptions longer than 600 characters (citation panels truncate around 200 to 400 chars), and FAQs with more than 25 questions (typically only the first 10 to 15 are ingested). All of these are syntactically valid. Google's tester passes them. AI engines silently deprioritize them.
What does an unnamed nested entity look like in practice?
A common shape: an Article with author set to {"@type": "Person"} and no name field. Syntactically valid. Google accepts it. AI engines, however, cannot extract a Person entity that has no label, so the authorship signal is silently lost. Same pattern with brand under Product, publisher under Article, location under Event. Foglift's tester walks the schema tree, finds these unnamed Person, Organization, Brand, Product, Place, and LocalBusiness nodes, and counts them against your AI Pickup Score.
Which schema types should I add first if I'm starting from zero?
Three to start. Organization (or LocalBusiness) on the homepage with name, url, sameAs, logo, description. Article or BlogPosting on every blog post with headline, author as a Person object (not a string), datePublished, dateModified, and a citation field if you can populate it. FAQPage on any page with question/answer content (this is the highest-cited type in AI answer panels). Then layer in Product, BreadcrumbList, and HowTo as relevant.
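Putting the blog-post recommendation together, a BlogPosting sketch with an object author, both dates, and a citation entry (all names, dates, and URLs are placeholders):

```json
{
  "@context": "https://schema.org",
  "@type": "BlogPosting",
  "headline": "Example Post Title",
  "author": { "@type": "Person", "name": "Jane Author" },
  "datePublished": "2025-01-10",
  "dateModified": "2025-02-01",
  "citation": [
    {
      "@type": "CreativeWork",
      "name": "Example Source Study",
      "url": "https://example.org/source-study"
    }
  ]
}
```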
Can I have multiple schemas on one page?
Yes, and you should. A typical post combines Article, Organization, BreadcrumbList, and FAQPage. Each can live in its own application/ld+json script tag, or be combined under @graph. The AI Pickup Score grades each schema independently and then averages them, so one weak schema pulls the page-level score down even if the others pass. Fix the lowest-scoring schemas first.
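A sketch of the @graph form, with the BlogPosting referencing the Organization by @id instead of redefining it (all names and URLs are placeholders):

```json
{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "Organization",
      "@id": "https://example.com/#organization",
      "name": "Example Corp"
    },
    {
      "@type": "BlogPosting",
      "@id": "https://example.com/post#article",
      "headline": "Example Post Title",
      "publisher": { "@id": "https://example.com/#organization" }
    },
    {
      "@type": "BreadcrumbList",
      "itemListElement": [
        {
          "@type": "ListItem",
          "position": 1,
          "name": "Blog",
          "item": "https://example.com/blog"
        }
      ]
    }
  ]
}
```

The @id cross-reference keeps one canonical Organization node per page, which also helps the entity-reconciliation check above.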