
AEO and GEO in 2026: a practical playbook with Tavily, Winston, and schema

Most SEO advice for AI search is still vague slogans. "Write for E-E-A-T." "Be the answer." Useful frames, but you cannot operationalize a slogan.

This is the working playbook we use to get pages cited by ChatGPT web search, Perplexity, Google AI Overviews, and Bing Copilot. It is not theory. The same playbook is running on this site right now.

AEO and GEO, defined cleanly

Two acronyms, often used loosely. The clean definitions:

  • AEO — Answer Engine Optimization. Getting your content cited as the direct answer in answer-first surfaces (Google AI Overviews, featured snippets, voice assistants, ChatGPT search results, Perplexity citations).
  • GEO — Generative Engine Optimization. The broader category covering how generative AI systems retrieve, summarize, and credit your content. AEO is a subset.

Both share a common requirement: the AI engine has to find your content, parse it confidently, attribute it to a credible source, and reference it back to a user. Each of those steps is a lever you can pull.

Why the old SEO playbook is not enough

Classic SEO optimizes for one thing: a clickable link in a SERP. AI search optimizes for two things: getting the content into the model's context window, and getting cited when the model generates the answer. Those are different problems.

Examples of where the playbooks diverge:

  • Keyword density does not help an LLM. Clear factual statements do.
  • Long preambles before the answer hurt AI citation. The TL;DR pattern wins.
  • Pages that hide content behind tabs or accordions get partial extraction. Flat HTML with semantic headings gets full extraction.
  • Generic content rewritten from competitors loses. First-party experience and specific data win.

The five-stage pipeline

The pipeline we run for every published piece on this site:

Stage 1: Research with Tavily

Tavily is a search API designed for LLMs, the same kind of API an AI agent uses to fetch fresh ground truth. Before drafting, we pull 5 to 10 recent sources on the topic via the Tavily search endpoint, with `search_depth: "advanced"` for high-stakes pieces.

Why this matters for AEO: AI Overviews favor content that aligns with current consensus. Drafting from training data alone produces stale content. Drafting from fresh sources produces content that matches what the AI already "knows" is true — and gets cited because of it.

We do not copy from sources. We use them to verify dates, numbers, named entities, and current state of any fast-moving topic before writing.
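The Stage 1 pull can be sketched as a request payload builder. This is a minimal sketch, not Tavily's SDK: the field names (`query`, `search_depth`, `max_results`, `include_answer`) follow Tavily's public REST docs for `POST /search`, but verify them against the current API reference before relying on this.

```python
# Sketch of the Stage 1 research pull, assuming Tavily's REST search
# endpoint (POST https://api.tavily.com/search). Field names are taken
# from Tavily's public docs; check the current API reference.

def build_research_payload(query: str, high_stakes: bool = False) -> dict:
    """Build the request body for a pre-draft research pull."""
    return {
        "query": query,
        "search_depth": "advanced" if high_stakes else "basic",
        "max_results": 10,          # the 5-to-10 sources mentioned above
        "include_answer": False,    # we want sources, not a synthesized answer
    }

payload = build_research_payload("AI Overviews citation factors 2026", high_stakes=True)
```

The point of keeping this as a payload builder rather than a raw call: the same dict can feed a retry wrapper, a cache, or a batch runner without touching the research logic.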

Stage 2: Draft with first-party voice

AI engines are trained to identify and trust first-party experience. Three patterns that earn citation:

  • First person: "I tested this for a week" beats "This was tested".
  • Specific data: "LCP dropped from 2.8s to 0.7s" beats "performance improved significantly".
  • Honest scoping: "This works for X, not Y" beats "comprehensive solution for everyone".

We avoid the AI-generic register. Words like comprehensive, leverage, cutting-edge, robust, seamless, streamline, and unlock potential signal AI authorship to humans and to the engines that grade content. The full banlist we use lives in our content style guide.

Stage 3: Verify with Winston AI detector

Winston is an AI content detector with a public API endpoint at /v1/ai-content-detection. We run every draft through it before publishing. The API returns a human-likeness score as a percentage.

Our gate: anything under 80 percent human-likeness goes back for another pass. The work that gets to 95+ is content where structure, voice, and rhythm have all moved away from default LLM patterns. That score correlates with how AI engines themselves classify the content — content that reads as human gets cited at higher rates than content that reads as AI.

On this site, the most recent six posts all scored 100/100 human-likeness in Winston. That is a deliberate choice, not a happy accident.
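The gate itself is trivial to encode. A minimal sketch: the endpoint path comes from the text above, but the response shape used here (a top-level `score` as percent human-likeness) is an assumption; check Winston's API docs for the real schema.

```python
# Sketch of the Stage 3 gate. The response shape used here (a top-level
# "score" field, 0-100 human-likeness) is an assumption -- verify it
# against Winston's /v1/ai-content-detection documentation.

GATE = 80  # anything under this goes back for another editing pass

def gate_draft(winston_response: dict) -> str:
    score = winston_response.get("score", 0)
    if score < GATE:
        return "revise"
    if score >= 95:
        return "publish"              # structure, voice, and rhythm all read human
    return "publish-with-review"      # passable, but worth one more look
```

A missing score falls through to "revise" by design: an unscored draft should never ship.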

Stage 4: Structure with schema

This is where most AEO advice stays at the surface. The schema patterns that actually move the needle:

  • Article / BlogPosting with mainEntityOfPage, author Person with sameAs, publisher Organization with logo, datePublished, dateModified, keywords, articleSection, wordCount, isAccessibleForFree, image as ImageObject with width and height.
  • FAQPage when the body has Q&A structure. This is the highest-leverage AEO schema. Direct AI Overview citation source.
  • Speakable with cssSelector pointing to your TL;DR or intro paragraphs. Voice and AI Overview engines surface speakable content.
  • HowTo when the post is genuinely a step-by-step guide. Captures the "how to do X" intent in AI search.
  • Service schema with provider, areaServed, audience, Offer with priceSpecification on commercial pages.
  • Linked entities via @id graph. Tying your Article back to a Person and the Person back to an Organization with sameAs creates a verifiable identity graph the engines actually use.

We emit all of this dynamically per route on this site. The static head only contains the entity-level schema (Person, Organization, WebSite). Per-page schema is injected by JS based on the active route.
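The linked-entity pattern from the list above looks like this in JSON-LD. A trimmed illustration: the URLs, names, and `@id` values are placeholders, and real Article markup would carry the full field set listed earlier (image, keywords, wordCount, and so on).

```json
{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "BlogPosting",
      "@id": "https://example.com/blog/post#article",
      "mainEntityOfPage": "https://example.com/blog/post",
      "datePublished": "2026-02-01",
      "dateModified": "2026-02-20",
      "author": { "@id": "https://example.com/#person" },
      "publisher": { "@id": "https://example.com/#org" }
    },
    {
      "@type": "Person",
      "@id": "https://example.com/#person",
      "name": "Author Name",
      "sameAs": ["https://www.linkedin.com/in/example"],
      "worksFor": { "@id": "https://example.com/#org" }
    },
    {
      "@type": "Organization",
      "@id": "https://example.com/#org",
      "name": "Example Org",
      "logo": { "@type": "ImageObject", "url": "https://example.com/logo.png" }
    }
  ]
}
```

The `@id` references are what make this a graph rather than three disconnected blobs: the Article points at a Person, the Person points back at the Organization, and engines can resolve the whole identity chain from any node.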

Stage 5: AI crawler accessibility

Final stage, often skipped: making sure the AI crawlers can actually retrieve your content.

  • llms.txt at root: a concise markdown file listing your top URLs with descriptions. AI crawlers look for this. Many sites still do not have one.
  • robots meta tag with `max-snippet:-1, max-image-preview:large, max-video-preview:-1`. Removes default snippet length caps. AI Overview snippets get longer when this is present.
  • JS-rendered content: most AI crawlers do not render JavaScript. If your hero, H1, or canonical only exists after JS executes, the engines see nothing. Server-render or pre-render the critical SEO elements.
  • Allow ChatGPT-User, PerplexityBot, Google-Extended, ClaudeBot, anthropic-ai user-agents in robots.txt. Some sites block them by default with overly aggressive bot rules.
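For the first item in that list, a minimal llms.txt sketch. The URLs and descriptions are placeholders; the shape (H1 title, blockquote summary, H2 sections of annotated links) follows the llms.txt proposal as commonly implemented:

```
# Example Site

> One-line description of what the site covers and who it is for.

## Posts
- [AEO and GEO playbook](https://example.com/blog/aeo-geo-playbook): the production pipeline we use for AI-search citation
- [WordPress vs Next.js](https://example.com/blog/wordpress-vs-nextjs): platform comparison with real numbers
```

Keep it short and curated. The file is a map for crawlers with limited budgets, not a sitemap dump.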

What gets you cited (in order of impact)

From observing dozens of pages that get cited and dozens that should but do not, here is the priority order:

  • 1. Specific factual claims with numbers, dates, and named entities
  • 2. FAQ-style structure with direct question-answer pairs
  • 3. First-party experience ("I tested", "we measured", "our client")
  • 4. Recency — dateModified in the last 90 days outperforms older content
  • 5. Linked entity graph (Person, Organization, sameAs)
  • 6. TL;DR or summary at the top of the piece
  • 7. Clean semantic HTML with one H1 and proper H2/H3 hierarchy
  • 8. Speakable schema flagging the citable passages
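The TL;DR and semantic-HTML signals above reduce to a simple page skeleton. An illustrative sketch (class names are placeholders, not a convention from this site):

```html
<article>
  <h1>One H1: the question the page answers</h1>
  <p class="tldr">TL;DR: the direct answer, two or three sentences, first.</p>

  <h2>First subtopic</h2>
  <p>Specific claims with numbers, dates, and named entities.</p>

  <h2>FAQ</h2>
  <h3>Direct question?</h3>
  <p>Direct answer, extractable on its own.</p>
</article>
```

Everything a crawler needs sits in flat, visible HTML: no tabs, no accordions, nothing that requires a click or a script to reveal.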

What kills citation chances

  • Content that reads as default LLM output (verbose, hedged, generic)
  • Long preambles before the actual answer
  • Content gated behind tabs, accordions, or carousels that crawlers cannot expand
  • Affiliate-stuffed pages where 40 percent of the content is product cards
  • Pages with no clear author, publisher, or credentials
  • Schema that is wrong (broken JSON-LD silently disqualifies the page)
  • Sites that block AI crawlers in robots.txt then complain about not being cited
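The broken-JSON-LD failure mode in that list is cheap to catch before publish. A minimal sketch using a regex over rendered HTML (a real pipeline might use a proper HTML parser instead):

```python
# Pre-publish check for silently broken JSON-LD: pull every ld+json
# block out of a rendered page and make sure each one parses.

import json
import re

LDJSON = re.compile(
    r'<script[^>]*type=["\']application/ld\+json["\'][^>]*>(.*?)</script>',
    re.DOTALL | re.IGNORECASE,
)

def broken_ldjson_blocks(html: str) -> list[str]:
    """Return the raw text of every JSON-LD block that fails to parse."""
    broken = []
    for block in LDJSON.findall(html):
        try:
            json.loads(block)
        except json.JSONDecodeError:
            broken.append(block.strip())
    return broken

page = '<script type="application/ld+json">{"@type": "Article"}</script>'
assert broken_ldjson_blocks(page) == []
```

Run it against the rendered output, not the template source, since per-route schema is injected at runtime.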

Measuring whether the playbook works

Three signals to track:

  • Direct: Google Search Console performance data, segmenting by query type. AI Overview impressions appear under specific position-ranges.
  • Indirect: track ChatGPT and Perplexity referrer traffic in your analytics. The volume is small but growing fast.
  • Audit-based: query ChatGPT, Perplexity, and Google with your target queries monthly. Note which sources get cited for each. Track week over week.

Tools like the DataForSEO AI mentions endpoint and the ChatGPT scraper give programmatic access to the same data at scale.
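The audit-based signal is easy to track without any paid tooling. A minimal sketch of the monthly log: one row per engine-query check, recording which domains were cited, then a share calculation per engine. Nothing here is engine-specific; the row format is an assumption for illustration.

```python
# Sketch of the monthly citation audit described above: record which
# domains each engine cited per target query, then compute our share.

from collections import defaultdict

def citation_share(audit: list[dict], our_domain: str) -> dict[str, float]:
    """Fraction of audited queries where each engine cited our_domain."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for row in audit:                  # one row per (engine, query) check
        totals[row["engine"]] += 1
        if our_domain in row["cited_domains"]:
            hits[row["engine"]] += 1
    return {engine: hits[engine] / totals[engine] for engine in totals}

audit = [
    {"engine": "perplexity", "query": "aeo vs geo", "cited_domains": ["example.com"]},
    {"engine": "perplexity", "query": "llms.txt", "cited_domains": ["other.com"]},
    {"engine": "chatgpt", "query": "aeo vs geo", "cited_domains": ["example.com"]},
]
assert citation_share(audit, "example.com") == {"perplexity": 0.5, "chatgpt": 1.0}
```

Tracked month over month, the per-engine share is the clearest single number for whether the playbook is working.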

Honest limitations

Two things to be straight about:

AI search is not yet a primary traffic source for most sites. The volume is real and growing, but it does not yet match organic Google. Optimize for it because the trajectory is clear, not because it is paying the bills today.

The engines change weekly. What earned citations in February 2026 may not in November. The pipeline above is more durable than any specific tactic because it focuses on signals (recency, structure, voice, identity) that all the engines weight, not on quirks of one engine.

What we run on this site

Concrete, for the curious:

  • Tavily Search API for pre-draft research on fast-moving topics
  • Winston AI detector v1/ai-content-detection on every draft, gating at 80+ human-likeness
  • FAL flux-dev for hero images, stored as Sanity assets
  • Sanity headless CMS feeding a single-page app rendered on Netlify
  • Per-route dynamic schema (Article, FAQPage, Speakable, HowTo, Service, BreadcrumbList) injected by JS
  • llms.txt at root, robots meta with max-snippet:-1 and max-image-preview:large
  • Linked-entity graph: Person with sameAs to LinkedIn/X/YouTube, Organization for Seahawk Media as publisher

If you want to see the schema in action, view source on any post on this site. Or read WordPress vs Next.js in 2026 for the comparison piece this playbook produced first.

The bottom line

AEO and GEO are not magic. They are a content production discipline plus a small set of technical accessibility steps. The teams that do well in AI search are the same teams that do well in classic SEO — they just add three things: fresh research, voice that does not read as AI, and schema that lets engines parse the content cleanly.

If you want help operationalizing this for your site, book a 30-minute call and we will walk through your stack, your content workflow, and where the highest-impact changes are.
