Direct Traffic Is Changing: What 12 Businesses Reveal About AI-Era Search Discovery
If you live on Oahu you have probably seen me walking. I walk a lot. Yes, I do it for the health benefits, but it's also a habit I picked up from my very first SEO mentor. He would ask me to walk and we would discuss current projects and methodologies. There's research showing walking boosts creative output by 60% or so. I believe it. Something about movement loosens up the brain.
During these walks I develop hypotheses about AI-mediated search. Then I test them.
This is one of those tests.
The hypothesis: When AI agents can verify your business claims (high entity confidence), you get more direct traffic in the cookieless, AI-mediated era.
The test: I analyzed twelve businesses across three industries—four beachfront hotels in the same Hawaiian market, four Hawaiian apparel retailers, and four SaaS companies. I measured their entity confidence scores and cross-referenced them with SimilarWeb traffic data.
The finding: Entity confidence correlates with direct traffic percentage at r = 0.779 (p = 0.003). Within e-commerce, entity confidence rank matched monthly-visit rank perfectly (Spearman r = 1.00, n = 4). The retailer with the most complete product specifications received about 4x the traffic of the retailer with minimal specifications.
Here's what the data shows and what it might mean.
The Context
The industry has been documenting this shift. Conductor noted in May 2025 that mobile LLM traffic shows up as direct. MarTech wrote in February 2026 about how AI visibility increases direct traffic even when nobody clicks. The Retail Media Breakfast Club coined "Dark Search" to describe this AI-influenced, unattributed traffic.
The prevailing narrative treats this as a measurement problem. A black box. Attribution failure.
I wanted to see if there was a pattern worth optimizing for.
What Is Entity Confidence?
Before I explain the methodology, here's what I mean by "entity confidence."
Entity confidence measures how well AI agents can verify your business claims. It answers: Can AI fact-check you?
High entity confidence:
- Your website says you have 524 rooms. Booking.com says 524 rooms. TripAdvisor says 524 rooms. They all match.
- You list actual measurements (room size: 350 sq ft) not vague claims ("spacious rooms").
- Independent sources corroborate what you say about yourself.
- Your information is current and consistent everywhere.
Low entity confidence:
- Your website says one thing, other sites say something different.
- You use marketing language without specifics ("world-class," "luxury," "premium").
- Third-party sources contradict or can't verify your claims.
- Your information is outdated or inconsistent across platforms.
When AI agents evaluate whether to cite your business, they look for verifiable, consistent information. High entity confidence means they can fact-check you. Low entity confidence means they can't, so they either cite you with uncertainty or route to third-party sources instead.
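The cross-source check described above can be sketched in a few lines. This is an illustration only, not the scoring pipeline behind this study: the `agreement_ratio` helper and the dictionary keys are hypothetical names, and the room counts are the example figures used in this article.

```python
from collections import Counter


def agreement_ratio(values):
    """Share of sources reporting the most common value (1.0 means all agree)."""
    if not values:
        raise ValueError("need at least one source")
    most_common_count = Counter(values).most_common(1)[0][1]
    return most_common_count / len(values)


# Hypothetical room counts as reported by three sources for one hotel.
consistent = {"website": 524, "booking": 524, "tripadvisor": 524}
inconsistent = {"website": 524, "booking": 520, "tripadvisor": 530}

print(agreement_ratio(list(consistent.values())))              # 1.0
print(round(agreement_ratio(list(inconsistent.values())), 2))  # 0.33
```

A ratio like this only covers one fact at a time. A fuller check would repeat it across every claim that can be compared between sources.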
What I Measured
Entity Confidence Score (0-100):
Four dimensions, weighted:
Summary Integrity (35%): Do your claims match across sources? If your website says you have 524 rooms but Booking.com says 520 and TripAdvisor says 530, that's low integrity. If all sources agree, that's high integrity.
Noun Precision (30%): What percentage of your content consists of verifiable claims (room dimensions, fabric composition, specific features) versus unverifiable qualifiers (stunning, luxurious, world-class, premium)?
Consensus Gap (25%): Does your claimed positioning match what independent sources say about you? If you claim to be a 5-star luxury resort but Booking.com classifies you as 3-star and reviews describe you as mid-tier, you have a consensus gap.
Data Freshness (10%): Is your information current? Examples that reduce freshness: a last renovation in 2012 that isn't mentioned anywhere, awards listed without dates, and closed amenities still listed on third-party sites.
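As a rough illustration of how the four weights combine, here is a minimal sketch. It assumes each dimension is first scored on a 0-100 scale where higher is better (so a small consensus gap earns a high sub-score). That scaling, the `entity_confidence` function name, and the example sub-scores are assumptions for illustration, not data from the study.

```python
# Weights taken from the four dimensions listed above.
WEIGHTS = {
    "summary_integrity": 0.35,
    "noun_precision": 0.30,
    "consensus_gap": 0.25,
    "data_freshness": 0.10,
}


def entity_confidence(sub_scores):
    """Weighted sum of 0-100 sub-scores, giving a 0-100 composite score."""
    missing = WEIGHTS.keys() - sub_scores.keys()
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0 <= sub_scores[name] <= 100:
            raise ValueError(f"{name} must be between 0 and 100")
    return sum(WEIGHTS[name] * sub_scores[name] for name in WEIGHTS)


# Hypothetical sub-scores for one business.
example = {
    "summary_integrity": 90,
    "noun_precision": 70,
    "consensus_gap": 80,
    "data_freshness": 60,
}
print(round(entity_confidence(example), 1))  # 78.5
```

Because Summary Integrity and Noun Precision carry 65% of the weight between them, consistency and specificity move the composite far more than freshness does.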
For each business, I scored:
- Percentage of measurable claims on homepage
- Percentage of measurable claims on product/service pages
- Specification completeness (0-4 scale)
- Number of independent sources corroborating official claims
- Level of discrepancy between sources
Traffic Data (SimilarWeb, August 2025 - January 2026):
- Monthly visits
- Direct traffic percentage
- Organic search percentage
- Paid search percentage
- Referral percentage
- Bounce rate
Then I ran correlations.
The Findings
Cross-Category: Entity Confidence × Direct Traffic %
Pearson r = 0.779 (p = 0.003)
Properties scoring 80+ for entity confidence averaged 68-81% direct traffic. Properties scoring 50-55 averaged 35-44% direct traffic.
The correlation holds across all three industries despite significant differences in business models, market size, and traffic volume.
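The reported p-value can be sanity-checked from r and the sample size alone. A minimal sketch, assuming the standard t-test for a Pearson correlation with n - 2 degrees of freedom (the function name is hypothetical):

```python
import math


def correlation_t_statistic(r, n):
    """t-statistic for testing a Pearson correlation against zero (df = n - 2)."""
    if n < 3 or not -1 < r < 1:
        raise ValueError("need n >= 3 and -1 < r < 1")
    return r * math.sqrt((n - 2) / (1 - r * r))


t = correlation_t_statistic(0.779, 12)
print(round(t, 2))  # 3.93
```

With 10 degrees of freedom, the two-sided critical value at the 1% level is about 3.17, so a t of roughly 3.93 is consistent with the p = 0.003 reported above.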
Caveat: Correlation doesn't prove causation. Entity confidence might correlate with direct traffic because both reflect underlying brand strength, not because one causes the other. A well-established brand might have both high entity confidence (because they've invested in content quality) and high direct traffic (because people already know them). The data can't distinguish between "entity confidence drives direct traffic" and "brand strength drives both." I can't rule out this alternative explanation with a 12-property observational study.
Within E-Commerce: Perfect Rank-Order Match
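Spearman r = 1.00 between entity confidence and monthly visits (p = 0.000 from the standard approximation; with only four retailers, an exact permutation test gives a two-sided p of about 0.083)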
Four Hawaiian apparel retailers operating independent e-commerce domains, ranked by entity confidence:
| Property | Entity Score | Measurable Claims % (Product Pages) | Monthly Visits | Direct Traffic % |
|---|---|---|---|---|
| Specification-Rich Retailer | 81 | 68% | 16,085 | 60.5% |
| Mid-Tier Retailer A | 72 | 61% | 10,149 | 50.2% |
| Mid-Tier Retailer B | 71 | 52% | 9,080 | 38.6% |
| Specification-Light Retailer | 52 | 38% | 4,059 | 41.9% |
The top-scoring retailer publishes fabric composition, detailed sizing charts, and care instructions on product pages. The lowest-scoring retailer provides minimal product specifications. The top-scoring retailer received about 4x the traffic of the lowest-scoring retailer (16,085 vs 4,059 monthly visits). The rank order of entity confidence matches the rank order of traffic volume exactly. Direct traffic percentage follows the same order except for the bottom two retailers, which swap places (38.6% vs 41.9%).
This is the cleanest finding in the dataset. All four retailers operate independent brand domains of comparable scope. They compete in the same vertical (Hawaiian apparel). They serve the same geographic market. The confounds that plague hospitality (multi-brand parent domains) and SaaS (application login traffic) don't apply here. When you control for category and domain structure, entity confidence perfectly predicts traffic rank.
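The rank correlations for the four retailers can be recomputed straight from the table. The sketch below uses only the standard library and assumes tie-free data. The helper names are hypothetical, and the exact permutation p-value is an addition for context, since approximations are unreliable with four data points.

```python
from itertools import permutations


def ranks(values):
    """Rank values from 1 (largest) to n; assumes there are no ties."""
    order = sorted(values, reverse=True)
    return [order.index(v) + 1 for v in values]


def spearman(x, y):
    """Spearman rank correlation for tie-free data."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))


def exact_two_sided_p(x, y):
    """Share of all orderings of y whose |rho| is at least the observed |rho|."""
    observed = abs(spearman(x, y))
    perms = list(permutations(y))
    extreme = sum(abs(spearman(x, p)) >= observed - 1e-12 for p in perms)
    return extreme / len(perms)


# Values from the e-commerce table above.
entity_score = [81, 72, 71, 52]
monthly_visits = [16085, 10149, 9080, 4059]
direct_pct = [60.5, 50.2, 38.6, 41.9]

print(round(spearman(entity_score, monthly_visits), 2))           # 1.0
print(round(spearman(entity_score, direct_pct), 2))               # 0.8
print(round(exact_two_sided_p(entity_score, monthly_visits), 3))  # 0.083
```

With four items there are only 24 possible orderings, and two of them (perfectly matched and perfectly reversed) are as extreme as the observed result. That is why the exact two-sided p-value cannot fall below roughly 0.083 at this sample size.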
Within Hospitality: Confounded by Domain-Level Measurement
Spearman r = 0.40 (p = 0.60) (Not statistically significant)
Four hotels in the same Hawaiian market showed no clear correlation. The issue: domain-level traffic measurement.
Hotel A (luxury positioning, 73/100 entity score) operates on a single-brand domain (943K visits). Hotel B (historic luxury, 68/100 entity score) operates on a parent domain serving 8,000+ properties globally (24.4M visits).
Individual hotel performance is invisible in the data. The entity confidence ranking (Hotel A > Hotel B > Hotel C > Hotel D) doesn't match traffic ranking because traffic measures brand portfolio scale, not individual property quality.
Within SaaS: Application Traffic Dominates
Spearman r = -0.40 (p = 0.60) (Not statistically significant, negative direction)
Four SaaS companies showed an inverse correlation driven by one outlier: a workspace/docs platform with 166M monthly visits but the lowest entity score in the category (79/100).
81.5% of its traffic is direct, but this reflects logged-in users accessing the application, not content discovery. The entity confidence framework measures content specificity for discovery, not product usage patterns.
The other three SaaS properties (scores 85-88) showed expected patterns: higher entity confidence, higher direct traffic percentage (68-76%).
What This Suggests
Direct Traffic Captures AI-Mediated Discovery
When someone asks ChatGPT "best Hawaiian shirt brands," gets a recommendation, and types the URL into their browser, that registers as direct traffic, not organic search or referral.
The discovery happened via AI. The attribution vanished.
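To see why that visit lands in the direct bucket, here is a deliberately simplified channel classifier. It is a sketch of the general idea, not the actual rules used by SimilarWeb or any analytics vendor. The `classify_visit` function, the short `SEARCH_HOSTS` list, and the example URLs are all illustrative assumptions.

```python
from urllib.parse import parse_qs, urlparse

# Illustrative list only; real tools maintain far longer lists.
SEARCH_HOSTS = {"www.google.com", "www.bing.com", "duckduckgo.com"}


def classify_visit(landing_url, referrer):
    """Assign a visit to a channel from its landing URL and referrer."""
    query = parse_qs(urlparse(landing_url).query)
    if "utm_source" in query:
        return "campaign"        # tagged link
    if not referrer:
        return "direct"          # typed URL, bookmark, or referrer lost in transit
    host = urlparse(referrer).hostname or ""
    if host in SEARCH_HOSTS:
        return "organic search"
    return "referral"


# A URL typed in after an AI recommendation carries no referrer at all.
print(classify_visit("https://example.com/", None))                       # direct
print(classify_visit("https://example.com/", "https://www.google.com/"))  # organic search
```

Nothing in the request records that an AI conversation came first, so the classifier has no signal to work with and falls through to "direct."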
This is the hypothesis. The data shows correlation but doesn't prove this mechanism. The correlation between entity confidence and direct traffic could run through AI citations (high entity confidence → more AI citations → more typed-in URLs → more direct traffic). Or it could reflect brand strength (established brands invest in both content quality and marketing, driving both entity confidence and brand recognition independently). The 12-property sample can't distinguish between these explanations.
What the data does show: Properties with high entity confidence (verifiable specifications, cross-source consistency, minimal adjective creep) have higher direct traffic percentages. Properties with low entity confidence have lower direct traffic percentages.
Direct Traffic Reflects Brand Recognition in Cookieless Era
Safari blocks third-party cookies. Firefox blocks or isolates them by default. Chrome dropped its plan to phase them out but still lets users block them.
When someone reads about a retailer with detailed product information in an article, clicks through, and lands on the site, that visit increasingly registers as direct because the referrer is dropped along the way (in-app browsers, privacy settings, links marked noreferrer).
Properties with clear, memorable, verifiable content get shared more and remembered more. This drives direct navigation that analytics can't attribute to specific sources.
Within Comparable Categories, Entity Confidence Predicts Competitive Position
The e-commerce finding (r = 1.00) suggests that when controlling for market category and domain scope, entity confidence predicts relative traffic rank.
The retailer with detailed specifications and 68% measurable claims received 4x the traffic of the retailer with minimal specifications and 38% measurable claims in the same vertical.
This pattern likely holds across other categories but is obscured by domain-level measurement issues (hospitality) or product usage patterns (SaaS).
The Limitations
Small Sample Size: 12 properties across 3 industries. This is exploratory, not definitive.
Domain-Level Traffic: Individual hotel performance is invisible when properties operate on multi-brand domains. Future research needs property-level data.
Direct Traffic Is Noisy: The "direct" bucket includes brand loyalists, AI-influenced users, dark social sharing, untagged campaigns, and stripped referrers. It's composite, not pure.
Traffic ≠ Business Success: A SaaS product with 166M visits but the lowest entity score in its category is obviously succeeding. Traffic volume measures market adoption, not optimization quality.
Evolving Measurement: Analytics platforms or AI tools might provide better attribution in the future, changing what "direct" actually captures.
What This Might Mean
The industry has documented that AI-mediated discovery and cookieless attribution are inflating direct traffic. The conversation treats this as a problem to solve.
The data I pulled suggests we might need to optimize for it.
Entity confidence (content specificity, verification pathways, cross-source consistency) correlates with direct traffic percentage at r = 0.779. Within e-commerce, where domain structures are comparable, entity confidence rank matched traffic-volume rank perfectly. The retailer that published fabric specifications received about 4x the traffic of the retailer that didn't.
If this pattern holds across larger samples:
Direct traffic percentage might become a leading indicator for performance in AI-mediated, cookieless discovery.
Entity confidence might become the optimization lever. Not Schema markup for its own sake, but data sanitation that enables AI agents to verify claims and cite you confidently.
Within your category, the businesses that publish verifiable specifications, maintain cross-source consistency, and reduce adjective creep could capture higher direct traffic share.
The Open Question
Is this real, or am I seeing patterns in noise?
The r = 0.779 correlation is statistically significant (p = 0.003). The e-commerce perfect match (r = 1.00) is compelling. But 12 properties is a small sample. The hospitality data is confounded. The SaaS pattern is weird.
I'm publishing this as exploratory research, not definitive findings. If you run a similar analysis and find different results, I want to know. If you have access to property-level traffic data for hospitality, I'd love to collaborate on a larger study. If you think I'm measuring the wrong things or missing confounds, tell me.
This is an evolving field. No one has definitive answers yet. We're all testing hypotheses.
Tomorrow I'll walk again. New thoughts will come. I'll test the ones that seem worth testing.
That's the process.