Generative Engine Optimization: Field Notes for Early Adopters

[Image: weathered desert billboard reading "STOP WRITING FOR ROBOTS / ANSWER THE QUESTIONS HUMANS ASK," with ghostly citation brackets floating across it.]
The tactics that work for AI visibility (citations, statistics, direct quotes) are exactly the ones that feel like writing for machines.

Someone at Webflow was looking at attribution data when they noticed something. Eight percent of their signups were coming from somewhere they hadn't been tracking. Large language model referrals. The traffic converted at six times the rate of Google search traffic. Nobody had optimized for it. Nobody had even known to look for it. It was just there in the numbers one day.

Around the same time, researchers at Princeton were running experiments. They wanted to know what happens when you change how content is written and whether those changes affect visibility in AI-powered search. They tested nine different approaches. Made the language more authoritative. Added statistics. Included direct quotes from credible sources. Stuffed in more keywords the way people used to do for Google.

The keyword stuffing made things worse. Sites using that approach saw their visibility drop by 8.3 percent compared to doing nothing. When they tested the same thing on Perplexity, which is a real search engine that real people use, the drop was 9.1 percent.

The tactics that used to work don't work anymore.

Visibility Without Measurement

There's a restaurant somewhere. Someone asks ChatGPT for recommendations. The system pulls from whatever sources it trusts and generates an answer. Your restaurant gets mentioned. The person reads it. They show up for dinner that night.

Google Analytics sees none of this. No click. No session. No conversion path. The economics are real but the measurement is absent.

The visitors who do click through from AI answers convert at very different rates than traditional search traffic. Webflow's six times conversion rate. An early-stage software company nobody had heard of saw twenty-seven times higher conversion from AI referrals. Go Fish Digital, a marketing agency that documented what they were doing, found twenty-five times the conversion rate.

These aren't the small improvements you get from changing button colors or rewriting headlines. These are differences in kind, not degree. By the time someone clicks through from an AI-generated answer, they're past the research phase. The AI already filtered the options. The person clicking has nearly decided.

How Rankings Invert

Traditional search engines reward age and authority. Older domains rank better. Sites with more backlinks rank better. The established players compound their advantages over time.

The Princeton researchers tested their nine optimization methods on websites at different ranking positions. They wanted to see if the techniques worked the same way for sites ranked first versus sites ranked fifth.

They didn't.

A website ranked fifth in traditional search results that used the "cite sources" method saw its visibility improve by 115 percent. A website ranked first using the same method saw its visibility drop by 30 percent. The pattern held across different tactics. Adding quotations helped low-ranked sites by about 100 percent. Adding statistics gave low-ranked sites a 98 percent improvement.

Around this time, researchers at the University of Toronto were analyzing how different AI systems decide what to cite. They looked at whether the systems preferred to cite brand-owned content, like a company's own website, or earned media, which is when other people write about you, or social media. They published their findings in a comparison table.

| Platform | Brand Content | Earned Media | Social Media |
|---|---|---|---|
| Google | 45% | 43% | 12% |
| ChatGPT | 14% | 82% | 4% |
| Perplexity | 8% | 90% | 2% |

Google splits things relatively evenly. ChatGPT cites earned media 82 percent of the time. Perplexity is even more skewed at 90 percent earned media.

When someone's trying to decide which product to buy, the bias gets stronger. ChatGPT's earned media citations jump to 90 percent. Brand content drops to 8 percent.

The systems don't care about domain authority the way Google does. They care whether credible third parties have written about you, and whether the information they find is structured in a way they can extract and use. A five-person company with properly structured content can show up in the same answer as a Fortune 500 competitor. The person asking the question can't tell which one is bigger.

Maybe this isn't new. Every technology shift creates a brief window where the old advantages don't matter as much. Then new advantages solidify and things stratify again. But right now, in late 2025, the window is open.

What the Tests Measured

The Princeton researchers ran their experiments on ten thousand queries. They used different domains. They ran everything five times with different random seeds to make sure the results were consistent. Then they validated the findings on Perplexity to see if the same patterns held on a real platform. They documented what worked and what didn't.
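One note on the metric before the results. "Visibility" here means roughly how much of a generated answer traces back to a given source. A minimal sketch of that idea in Python, assuming citations appear as bracketed markers; the paper's actual measure is a position-adjusted word count, so treat this simplification as an illustration, not their formula:

```python
import re

def visibility_share(answer: str, source_id: int) -> float:
    """Fraction of words in a generated answer that sit in sentences
    citing a given source. Simplified: the GEO paper's metric also
    weights citations by their position in the answer."""
    sentences = re.split(r"(?<=[.!?])\s+", answer)
    total = attributed = 0
    for sentence in sentences:
        words = len(sentence.split())
        total += words
        if f"[{source_id}]" in sentence:
            attributed += words
    return attributed / total if total else 0.0

answer = (
    "Email marketing returns $42 per dollar spent [1]. "
    "Social campaigns vary widely in performance [2]. "
    "Most teams combine both channels [1]."
)
print(f"{visibility_share(answer, 1):.0%}")  # 67%: source 1 backs two of three sentences
```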

| Optimization Method | Visibility Change | Perplexity Validation |
|---|---|---|
| Quotation Addition | +40.9% | +20.7% |
| Combination (Fluency + Statistics) | +31.4% | - |
| Statistics Addition | +30.6% | +21.2% |
| Fluency Optimization | +28.0% | - |
| Cite Sources | +27.5% | - |
| Technical Terms | +17.6% | - |
| Easy-to-Understand | +14.0% | - |
| Authoritative Tone | +10.4% | - |
| Keyword Stuffing | -8.3% | -9.1% |

Adding direct quotations from credible sources increased visibility by 40.9 percent. Not paraphrased references. Actual quoted text with attribution. When they tested this on Perplexity, the improvement was 20.7 percent.

Adding statistics increased visibility by 30.6 percent. The difference between saying "email marketing could be effective" and saying "email marketing generates forty-two dollars of ROI per dollar spent" is the difference between a claim and a verifiable fact. On Perplexity, statistics increased visibility by 21.2 percent.

Explicitly citing sources for claims increased visibility by 27.5 percent. Making the writing clearer and more readable increased visibility by 28 percent. Not simpler, necessarily. Just clearer. Less friction between the information and whoever's trying to extract it.

When they combined fluency improvements with statistics, the effect was 31.4 percent. Better than any single method by itself.

Smaller improvements came from using technical terms, using language that was easier to understand, adopting a more authoritative tone.

Keyword stuffing decreased visibility by 8.3 percent. The systems can detect unnaturally dense keyword usage. They penalize it.
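That check is easy to approximate on your own copy. A minimal density sketch, with the caveat that the thresholds engines actually use aren't public and this phrase-counting heuristic is an assumption:

```python
import re

def keyword_density(text: str, phrase: str) -> float:
    """Share of total words accounted for by a target phrase."""
    words = re.findall(r"[a-z0-9$']+", text.lower())
    target = phrase.lower().split()
    n = len(target)
    hits = sum(1 for i in range(len(words) - n + 1) if words[i:i + n] == target)
    return hits * n / len(words) if words else 0.0

stuffed = ("Email marketing works because email marketing reaches inboxes, "
           "and email marketing tools make email marketing easy.")
print(f"{keyword_density(stuffed, 'email marketing'):.0%}")  # 50%: unmistakably stuffed
```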

The earned media bias means the systems don't care who's talking. They care who's being quoted. The numbers from Princeton lined up with what the Toronto researchers found and what Semrush found when they analyzed eight hundred websites across eleven industries. The models reward citation density over keyword density. They reward third-party validation over owned assertions. They reward structure over scale.

Patterns by Subject

The Princeton researchers tagged their queries by category. They wanted to see if different types of content responded better to different optimization methods.

In debate-style questions and historical queries, authoritative tone mattered. Makes sense. Persuasion matters when the topic is argumentative.

For factual questions and statements, citations mattered most, with a 115 percent improvement. Citations let people verify claims.

For questions about law and government and opinion, statistics mattered most. Authority comes from numbers in those contexts.

For questions about people and society, explanations, and historical events, direct quotations mattered most. Actual quotes add something when you're dealing with events that happened to real people.

Legal services should probably focus on statistics and citations. Historical consulting should probably focus on quotations and tone. Scientific organizations should probably focus on data and precision. The tactics work differently depending on what you're trying to be cited for.

Where the Citations Come From

In October 2025, Semrush analyzed more than eight hundred websites across eleven industries. They tracked which domains got cited most frequently by AI systems.

Three domains appeared in every single industry they studied. Reddit was cited about 66,000 times. Wikipedia about 25,000 times. YouTube about 19,000 times.

These aren't corporate websites. These are places where people talk about things. User-generated content. Community-driven authority. Third-party validation.

When Semrush looked at correlations between different metrics and AI visibility, they found something that contradicts twenty years of search engine optimization assumptions. They published the correlations.

| Metric Relationship | Correlation Strength |
|---|---|
| AI Visibility ↔ AI Mentions | 0.87 |
| Topic Coverage Breadth ↔ AI Visibility | 0.41 |
| Backlink Count ↔ AI Visibility | 0.37 |

Having comprehensive content across a subject area matters more than having concentrated link equity. Combined with the Toronto finding that 80 to 90 percent of citations come from earned media, the implication is clear. You can't just optimize your own website.

You need to be reviewed on platforms the systems trust. You need to participate in discussions where your expertise is relevant. You need educational content on YouTube that answers real questions. You need to be mentioned in industry publications. You need to be present in the places where people in your domain actually talk.

Your owned content matters. But if that's the only thing you're working on, you're addressing maybe ten or twenty percent of potential visibility.

The Measurement Gap

In March 2025, the Tow Center tested eight different generative search tools. They wanted to see how accurate the citations were. The systems failed to produce accurate citations in over sixty percent of tests. Citations were wrong. Sometimes fabricated. Sometimes attributed to the wrong source. Sometimes too vague to verify even if you wanted to.

This works both ways. You might be getting cited without knowing it. Or misrepresented. Or both.

You can test things manually. Ask the same questions in ChatGPT, Claude, Perplexity, Gemini. Take screenshots. Note which sources appear. The problem is that output depends on context. Previous conversations. Previous prompts. Settings. You get a feel for how the systems perceive your brand. You don't get comprehensive measurement.
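The manual loop is scriptable if you'd rather log results than collect screenshots. A minimal sketch against two engine APIs; the model names and the URL-extraction regex are assumptions, it only catches URLs that appear in the response text itself, and engines that return structured citation fields need different parsing:

```python
import re
from openai import OpenAI   # pip install openai
import anthropic            # pip install anthropic

QUESTION = "What are the best project management tools for small teams?"

def cited_domains(text: str) -> set[str]:
    """Pull bare domains out of any URLs in the response text."""
    return set(re.findall(r'https?://(?:www\.)?([^/\s)"]+)', text))

# OpenAI client reads OPENAI_API_KEY from the environment.
gpt = OpenAI().chat.completions.create(
    model="gpt-4o",  # assumption: use whichever model you're tracking
    messages=[{"role": "user", "content": QUESTION}],
)
print("OpenAI:", cited_domains(gpt.choices[0].message.content))

# Anthropic client reads ANTHROPIC_API_KEY from the environment.
claude = anthropic.Anthropic().messages.create(
    model="claude-sonnet-4-20250514",  # assumption: substitute the current model
    max_tokens=1024,
    messages=[{"role": "user", "content": QUESTION}],
)
print("Claude:", cited_domains(claude.content[0].text))
```

Run the same questions on a schedule and diff the domain sets over time. It's still directional, not comprehensive, for the same context-dependence reasons.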

Specialized tracking tools exist. They cost between $99 and $595 per month. They automate what you'd do manually. Multi-engine tracking. Competitor benchmarking. Sentiment analysis. Citation frequency. They're expensive enough that small companies have to calculate whether the investment makes sense before they have proof it works.

Google Analytics can track traffic that clicks through from AI referrals. You can measure conversion rates. You can see which pages people visit. But this only captures traffic that clicks. A lot of the value never generates a trackable click.
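If you export that clickthrough traffic, segmenting it takes a few lines. A sketch assuming a CSV with referrer and converted columns; the column names and the referrer list are assumptions, not a GA4 schema:

```python
import re
import pandas as pd

# Referrer hostnames to treat as AI traffic; extend as engines appear.
AI_REFERRERS = ["chatgpt.com", "chat.openai.com", "perplexity.ai",
                "gemini.google.com", "claude.ai", "copilot.microsoft.com"]

sessions = pd.read_csv("sessions_export.csv")  # assumed columns: referrer, converted (0/1)
pattern = "|".join(re.escape(r) for r in AI_REFERRERS)
is_ai = sessions["referrer"].fillna("").str.contains(pattern)

for label, group in [("AI referrals", sessions[is_ai]),
                     ("everything else", sessions[~is_ai])]:
    if len(group):
        print(f"{label}: {len(group)} sessions, "
              f"{group['converted'].mean():.1%} converted")
```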

The companies that moved early didn't wait for perfect measurement. Webflow. LS Building Products. Smart Rent. They tested what they could track. They made decisions based on directional indicators and bottom-line revenue impact even when attribution was incomplete.

How Systems Differ

The Toronto researchers measured how different AI systems choose what to cite. The differences are large enough that you can't really optimize for "AI" as one thing.

Claude shows high stability across languages. It reuses the same domains whether someone asks in English or Spanish or French. If Claude cites you in English, there's a reasonable chance it'll cite you in other languages.

ChatGPT shows almost no overlap across languages. It switches to entirely different sources by language.

Google shows low overlap. About 0.1. It leans toward English-language sources even when the query is in another language.

If you're trying to build visibility in multiple languages, this matters. With Claude, authoritative English content might get you cited in Spanish. With ChatGPT, you need language-specific coverage in authoritative local media.

The source preferences shift based on what someone's trying to do. For queries where someone's evaluating which product to buy, the patterns are clear.

Google cites brand content 10 percent of the time, earned media 50 percent, social media 40 percent.

ChatGPT cites brand content 8 percent of the time, earned media 90 percent, social media 2 percent.

ChatGPT overwhelmingly favors earned media during the evaluation phase. That's exactly when someone's deciding between you and your competitors.

For local search, the divergence is even more pronounced. The Toronto researchers measured how much overlap exists between the domains Google ranks and the domains AI systems cite for local businesses. They tested six categories.

Business CategoryGoogle/AI Overlap
Home Cleaning20.6%
Roofing17.1%
Tax Preparation15.4%
Dentists11.3%
Auto Repair2.5%
IT Support0.1%

In specialized or fragmented sectors, AI and Google are citing almost completely different sources. If you're in one of these categories, local AI search requires different strategies than traditional local SEO. The directories that matter are different. The review platforms are different. The content structure that gets you cited has almost nothing to do with what gets you ranked in Google's local results.

What to Build

The data points to a short list of concrete moves.

Make claims verifiable. Go through existing content and look for assertions that could be supported with specific numbers. Add the number. Add the citation. Link to the source. Make it checkable. The effect size is 30.6 percent for statistics, 27.5 percent for citations.
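A crude way to find those assertions at scale: flag sentences that claim effectiveness but contain nothing checkable. A minimal sketch; the hedge-word list is an illustrative assumption, not a standard:

```python
import re

HEDGES = re.compile(r"\b(effective|leading|best|significant|popular)\b", re.I)
EVIDENCE = re.compile(r"\d|https?://")  # a number or a link counts as checkable

def unverifiable_claims(text: str) -> list[str]:
    """Sentences that assert quality or effect with nothing to verify."""
    sentences = re.split(r"(?<=[.!?])\s+", text)
    return [s for s in sentences if HEDGES.search(s) and not EVIDENCE.search(s)]

copy = ("Email marketing can be highly effective. "
        "Email marketing generates $42 of ROI per dollar spent.")
for claim in unverifiable_claims(copy):
    print("Needs a number or a citation:", claim)  # flags the first sentence only
```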

Let other people say things for you. When you interview someone who knows what they're talking about, quote them directly. Not paraphrased. Actual quoted text with attribution. The effect size is 40.9 percent. The highest in the Princeton study.

The corollary is that 80 to 90 percent of AI citations come from third parties. Being reviewed, mentioned, discussed, referenced on platforms AI systems read matters more than anything on your own website.

Remove friction between information and extraction. Clear headings. One idea per paragraph. Logical flow. This isn't about making things simpler. It's about making things clearer. Structure signals what matters. The effect size is 28 percent.

The combined effect of fluency and statistics is 31.4 percent. Structure and substance reinforce each other.

Build coverage breadth. Topic coverage correlates with AI visibility at 0.41. Backlink count correlates at 0.37. Comprehensive content across a subject area matters more than concentrated link equity.

Some things degrade performance. Keyword stuffing decreases visibility by 8.3 percent. On Perplexity it's 9.1 percent. Content that's unnaturally dense with keywords gets penalized.

Relying only on owned content addresses maybe 8 to 30 percent of potential citations depending on the platform. If you're not building earned media presence, you're working on a fraction of what matters.

Social media content gets less than 5 percent of AI citations. It's useful for other reasons. For AI visibility, it's low leverage.

Chasing backlinks exclusively misses the point. Correlation of 0.37 versus 0.41 for topic breadth means comprehensive content matters more than concentrated links.

Where We Are

In early November 2025, an article described what happens during early-stage searches. Someone types "best public health programs" or "top accounting software for small businesses." The query surfaces an AI summary first. If you're not referenced in that summary, you're excluded from the conversation. Not ranked lower. Not on page two. Excluded.

Traditional search gave you multiple entry points. Different result types. Different positions. Different ways to show up somewhere in the research journey.

AI-mediated search is more winner-take-all. The system synthesizes information and presents one integrated answer. If you're cited, you're in. If you're not, you're not.

Semrush predicts that traffic from large language models will overtake traditional Google traffic by the end of 2027. Google AI Overviews already appear on 13 percent of search results. Click-through rates on pages with AI features dropped from 32 percent to 16 percent. Backlinko reported that referrals from large language models grew 800 percent year over year in its most recent three months of data. ChatGPT has more than four hundred million weekly users.

The companies moving now are establishing themselves as cited sources before most of their competitors understand the change. Webflow getting 8 percent of signups from large language models. LS Building Products with a 540 percent increase in AI Overview mentions. Smart Rent seeing 32 percent of their sales-qualified leads from ChatGPT. They built the infrastructure before most people realized it mattered.

By 2027, this won't be a competitive advantage. It'll be baseline. The question isn't whether to do this. The question is whether you're building visibility now, while the market is still figuring things out, or waiting for more proof and then trying to catch up.

Visibility compounds quietly. By the time the measurement catches up, the advantage will already belong to whoever moved before the metrics did.


Plate Lunch Collective works with businesses on AI-native content strategy and optimization for generative search. If you're seeing these patterns and wondering whether your content is positioned for this shift, we should talk.


Sources:

Aggarwal, P., et al. (2024). GEO: Generative Engine Optimization. arXiv. https://arxiv.org/abs/2311.09735

Chen, M., et al. (2025). Generative Engine Optimization: How to Dominate AI Search. arXiv. https://arxiv.org/abs/2509.08919

Liaison. (2025, November 7). Best Practices in the GEO Era: What Still Matters and What's New. https://www.liaisonedu.com/resources/blog/best-practices-in-the-geo-era-what-still-matters-and-whats-new/

McKenzie, L. (2025, October 23). Generative Engine Optimization (GEO): The Complete Guide. Backlinko. https://backlinko.com/generative-engine-optimization-geo

Maximus Labs. (2025, October 14). GEO Success Stories: Case Studies of Leading Brands and Startups. https://www.maximuslabs.ai/generative-engine-optimization/geo-case-studies-success-stories

Search Engine Land. (2025, October 14). Tracking AI Search Citations: Who's Winning Across 11 Industries. https://searchengineland.com/ai-search-citations-11-industries-463298

The Rank Masters. (2025, September 12). Generative Engine Optimization (GEO) Case Study: 8,337% Growth. https://www.therankmasters.com/blog/generative-engine-optimization-geo-case-study-trm-chatgpt

Go Fish Digital. (2025, September 24). Generative Engine Optimization (GEO) Case Study: 3X'ing Leads. https://gofishdigital.com/blog/generative-engine-optimization-geo-case-study-driving-leads/

Empathy First Media. (2025, May). Google AI Updates May 2025. https://empathyfirstmedia.com/google-ai-updates-may-2025/

Writesonic. (2025, November 11). Top 10 Generative Engine Optimization Tools To Try In 2025. https://writesonic.com/blog/generative-engine-optimization-tools
