<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
     xmlns:atom="http://www.w3.org/2005/Atom"
     xmlns:content="http://purl.org/rss/1.0/modules/content/"
     xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:media="http://search.yahoo.com/mrss/"
     xmlns:sy="http://purl.org/rss/1.0/modules/syndication/">
  <channel>
    <title>Business | Irene Burresi</title>
    <link>https://ireneburresi.dev/</link>
    <description>Economic and strategic AI analysis: ROI evaluation, cost structures, business case studies, and decision frameworks for technology investments.</description>
    <language>en-US</language>
    <copyright>© 2026 Irene Burresi · CC-BY-4.0</copyright>
    <managingEditor>Irene Burresi</managingEditor>
    <webMaster>Irene Burresi</webMaster>
    <generator>Astro Feed Engine</generator>
    <docs>https://www.rssboard.org/rss-specification</docs>
    <ttl>360</ttl>
    <lastBuildDate>Sun, 15 Mar 2026 16:26:20 GMT</lastBuildDate>
    <pubDate>Tue, 06 Jan 2026 00:00:00 GMT</pubDate>
    <atom:link href="https://ireneburresi.dev/en/business/rss.xml" rel="self" type="application/rss+xml"/>
    <atom:link rel="hub" href="https://pubsubhubbub.appspot.com/"/>
    <image>
      <url>https://ireneburresi.dev/images/og-default.svg</url>
      <title>Business | Irene Burresi</title>
      <link>https://ireneburresi.dev/</link>
    </image>
    <sy:updatePeriod>hourly</sy:updatePeriod>
    <sy:updateFrequency>1</sy:updateFrequency>
    <item>
      <title>The AI Act Is Not (Just) Compliance: It&apos;s Industrial Policy</title>
      <link>https://ireneburresi.dev/en/blog/governance/ai-act-geop/</link>
      <guid isPermaLink="true">https://ireneburresi.dev/en/blog/governance/ai-act-geop/</guid>
      <pubDate>Tue, 06 Jan 2026 00:00:00 GMT</pubDate>
      <dc:creator>Irene Burresi</dc:creator>
      <dc:language>en</dc:language>
      <description><![CDATA[<p>74% of EU listed companies use American email providers. 89% of German enterprises consider themselves technologically dependent. The AI Act should be read through this lens—as a competitive lever, not a checklist.</p>]]></description>
      <content:encoded><![CDATA[<h2>The Only Lever Left</h2>
<p><em>74% of companies listed in Europe use American email providers. 89% of German enterprises consider themselves technologically dependent on foreign providers. The AI Act exists in this context. Reading it only as a compliance problem means missing the picture.</em></p>
<p><strong>TL;DR:</strong> The AI Act is industrial policy. Europe is structurally dependent on foreign technology, and regulation is the only lever where it still has global weight. The “Brussels Effect” (the ability to export standards) is contested but likely for high-risk AI systems. In November 2025 the Digital Omnibus delayed implementation by 16 months, but the direction remains the same. Those reading the AI Act only as a regulatory checklist are missing the forest for the trees.</p>
<hr />
<p>The numbers on Europe’s technological position are known to insiders. They rarely enter the AI Act debate.</p>
<p>A <a href="https://techreport.com/news/europe-digital-dependence-risks-of-heavy-reliance-on-us-tech/">Proton report from October 2025</a> analyzed DNS records of European listed companies: <strong>74%</strong> use American email providers. Not startups: companies listed on stock exchanges, with governance and security obligations. A <a href="https://www.idt.media/metaverse/cloud-ai-co-europe-wants-to-break-free-from-dependence/2130935">Bitkom survey</a> of German companies with over 20 employees reveals that 89% consider themselves technologically dependent on foreign providers.</p>
<p>The <a href="https://www.europarl.europa.eu/RegData/etudes/STUD/2025/778576/ECTI_STU(2025)778576_EN.pdf">EPRS report from the European Parliament</a> completes the picture. Of the 100 largest global digital platforms by market cap, European firms account for only <strong>2%</strong> of the combined value. In cloud computing, hyperscale infrastructure, and foundation AI models, Europe is a net importer.</p>
<p>This context changes how you read the AI Act. It’s not just about protecting European citizens from algorithms. It’s about using the only lever Europe has left to negotiate its position in a market dominated by others.</p>
<hr />
<h2>The Mechanism</h2>
<p>The term “Brussels Effect” was coined by <a href="https://www.brusselseffect.com/">Anu Bradford</a> in 2012 and developed in her 2020 book. The thesis is direct: the EU, thanks to its market size and institutional quality, manages to export its standards globally.</p>
<p>The mechanism works two ways. The <strong>de facto effect</strong>: companies wanting access to the European market adopt EU standards elsewhere, because maintaining two versions costs more than one. The <strong>de jure effect</strong>: other governments copy European rules because they work and reduce the cost of designing regulation from scratch.</p>
<p>GDPR is the canonical example. Privacy laws inspired by European regulation have been adopted in Brazil, Japan, California. Tech companies extended many GDPR protections to non-European users to simplify operations. The form of European regulation spread beyond the Union’s borders.</p>
<p>On the AI Act, academic literature is more nuanced.</p>
<p>A <a href="https://arxiv.org/abs/2208.12645">2022 GovAI paper</a> analyzed the conditions for Brussels Effect applied to artificial intelligence. The conclusion: de facto and de jure effects are <strong>likely</strong>, especially for high-risk systems from large American tech companies. Microsoft, Google, Meta operate in Europe with recruiting, credit, and content moderation systems. They’ll need to comply. And for many of these companies, it’s more economical to apply one global standard than to segment products by market.</p>
<p>The paper also identifies limits. The Brussels Effect works best when the EU market is unavoidable (it is for big tech), when regulation is perceived as high-quality (contested), and when credible alternatives don’t exist (China offers a different model). For low-risk AI systems or companies not operating in Europe, the effect will be smaller or absent.</p>
<p>An <a href="https://policyreview.info/articles/analysis/brussels-effect-or-experimentalism">article in Internet Policy Review</a> proposes a complementary frame: the AI Act as “experimentalist governance”. Not a model to export wholesale, but one approach among many in a context of technological uncertainty. Interaction with other regulatory models (United States, United Kingdom, China) will be more cooperative and less unidirectional than the Brussels Effect frame suggests.</p>
<p>The synthesis: the Brussels Effect on AI exists but is contested and uncertain. It’s not guaranteed that European rules will become the global standard. Nor is it guaranteed they’ll end up irrelevant. The game is open.</p>
<hr />
<h2>The Tactical Adjustment</h2>
<p>In November 2025, the European Commission proposed the <a href="https://digital-strategy.ec.europa.eu/en/library/digital-omnibus-ai-regulation-proposal">Digital Omnibus</a>. The package includes AI Act modifications that generated headlines about “Europe backing down”.</p>
<p>The facts: requirements for high-risk AI systems will now apply roughly 16 months later than originally scheduled. The new deadline is December 2027 for Annex III systems (recruiting, credit, healthcare), August 2028 for those embedded in regulated products. It’s a significant delay.</p>
<p>But the AI Act’s structure remains intact. Risk categories remain the same. Obligations remain the same. What changes is the calendar, not the destination.</p>
<p>The Digital Omnibus is a tactical adjustment, not a strategic reversal. Europe is calibrating timing, not abandoning direction. Those reading the delay as “backing down” are confusing speed with trajectory.</p>
<hr />
<h2>The Missing Frame</h2>
<p>The conversation on the AI Act in Italy revolves almost entirely around compliance. Which systems fall into the high-risk categories. How much conformity costs. Which sanctions you risk. These are legitimate questions, but incomplete.</p>
<p>The missing context is the set of numbers this piece opened with. 74% dependence on email. 89% perceived technological dependence. Only 2% of global digital platform value in European hands. In this frame, the AI Act is not a regulatory conformity problem. It’s a tool in a larger game about Europe’s position in the global technology market.</p>
<p>Europe has few levers. It has no hyperscalers. It doesn’t have the dominant foundation models. It doesn’t have the venture capital base of the United States or the deployment scale of China. What it has is a 450-million-person market and institutional capacity to regulate that other blocs don’t.</p>
<p>Using this lever to influence global standards is industrial policy. Calling it just “consumer protection” is an incomplete description. Treating it only as “compliance” is missing the picture.</p>
<p>Microsoft has made alignment with European regulation an element of positioning. Meta chose the opposite path, delaying model releases in Europe and pressuring for weaker rules. They’re different strategies reflecting different readings of where the market is going. Neither treats the AI Act as a simple checklist.</p>
<p>Maybe we should ask why we do.</p>
<hr />
<h2>Sources</h2>
<p>Bradford, A. (2020). <a href="https://www.brusselseffect.com/"><em>The Brussels Effect: How the European Union Rules the World</em></a>. Oxford University Press.</p>
<p>Siegmann, C. &amp; Anderljung, M. (2022). <a href="https://arxiv.org/abs/2208.12645"><em>The Brussels Effect and Artificial Intelligence: How EU regulation will impact the global AI market</em></a>. GovAI, arXiv:2208.12645.</p>
<p>Internet Policy Review. (2025). <a href="https://policyreview.info/articles/analysis/brussels-effect-or-experimentalism"><em>Brussels effect or experimentalism? The EU AI Act and global standard-setting</em></a>.</p>
<p>European Commission. (2025). <a href="https://digital-strategy.ec.europa.eu/en/library/digital-omnibus-ai-regulation-proposal"><em>Digital Omnibus on AI Regulation Proposal</em></a>.</p>
<p>European Parliamentary Research Service. (2025). <a href="https://www.europarl.europa.eu/RegData/etudes/STUD/2025/778576/ECTI_STU(2025)778576_EN.pdf"><em>European Software and Cyber Dependencies</em></a>.</p>
<p>TechReport. (2025). <a href="https://techreport.com/news/europe-digital-dependence-risks-of-heavy-reliance-on-us-tech/"><em>Europe’s Digital Dependence: The Risks of the EU’s Reliance on US Tech</em></a>.</p>
]]></content:encoded>
      <category>Governance</category>
      <category>Business</category>
      <category>AI Act</category>
      <category>Brussels Effect</category>
      <category>Geopolitics</category>
      <category>Industrial Policy</category>
      <category>Strategic Compliance</category>
      <atom:link rel="alternate" hreflang="it" href="https://ireneburresi.dev/blog/governance/ai-act-geop/"/>
    </item>
    <item>
      <title>AI 2026: Why Stanford Talks About a Reckoning</title>
      <link>https://ireneburresi.dev/en/blog/business/ai-2026-anno-resa-conti/</link>
      <guid isPermaLink="true">https://ireneburresi.dev/en/blog/business/ai-2026-anno-resa-conti/</guid>
      <pubDate>Sat, 20 Dec 2025 00:00:00 GMT</pubDate>
      <dc:creator>Irene Burresi</dc:creator>
      <dc:language>en</dc:language>
      <description><![CDATA[<p>42% of companies have already closed AI projects: Stanford HAI predicts that 2026 will reward only those who demonstrate measurable ROI, reliable vendors, and transparent metrics.</p>]]></description>
      <content:encoded><![CDATA[<h2>The Year of Reckoning: Why 2026 Will Be Critical for Enterprise AI</h2>
<p><em>42% of companies have already abandoned most of their AI projects. The data suggests the worst may not be over.</em></p>
<p><strong>TL;DR:</strong> 42% of companies abandoned AI projects in 2025, double the previous year. Stanford HAI predicts 2026 will be the year of reckoning: less hype, more demand for concrete proof. Brynjolfsson’s employment data shows the impact already: -20% for junior developers, +8% for senior. For investors, the implications are clear: metrics defined before launch, not after; vendor solutions (67% success rate) vs internal development (33%); attention to go-live timelines, which kill projects more than technology.</p>
<hr />
<p>In mid-December 2025, <a href="https://hai.stanford.edu/news/stanford-ai-experts-predict-what-will-happen-in-2026">nine faculty members from the Stanford Institute for Human-Centered Artificial Intelligence (HAI)</a> published their predictions for 2026. This is not the usual academic futurology exercise, but a collective statement with a clear message: the party is over.</p>
<p>James Landay, co-director of HAI, opens with a phrase that sounds almost provocative in an era of triumphalist announcements: “There will be no AGI this year.” The point, though, is what he adds immediately after: companies will begin publicly admitting that AI has not yet delivered the promised productivity increases, except in specific niches like programming and call centers. And we’ll finally hear about failed projects.</p>
<p>This is not a prediction about the future. It’s a snapshot of something already happening.</p>
<hr />
<h2>The Numbers No One Wants to Look At</h2>
<p>In July 2025, the <a href="https://projectnanda.org/">MIT Project NANDA</a> published a report that generated considerable debate for a single statistic: <strong>95% of enterprise AI projects generate no measurable return</strong>. The number has been contested, the methodology has its limitations, the definition of “success” is debatable. But it’s not an isolated data point.</p>
<p>During the same period, <a href="https://www.spglobal.com/market-intelligence/en/news-insights/research/2025/10/generative-ai-shows-rapid-growth-but-yields-mixed-results">S&amp;P Global</a> found that 42% of companies abandoned most of their AI initiatives in 2025. In 2024, the percentage was 17%. The abandonment rate has more than doubled in a year. On average, the surveyed organizations threw out 46% of proofs of concept before they reached production.</p>
<p>According to the <a href="https://www.rand.org/pubs/research_reports/RRA2680-1.html">RAND Corporation</a>, over 80% of AI projects fail, double the failure rate of traditional IT projects. Gartner reports that only 48% of AI projects reach production, and over 30% of GenAI projects will be abandoned after the proof of concept by end of 2025.</p>
<p>The causes are always the same: insufficient data quality (43% according to Informatica), lack of technical maturity (43%), skills shortage (35%). But beneath these numbers lies a deeper pattern. Companies are discovering that AI works in demos but not in production, generates enthusiasm in pilots but not ROI in balance sheets.</p>
<p>It’s these numbers that explain why Stanford HAI, an institution hardly known for technological pessimism, is shifting the conversation. No longer “can AI do this?” but “how well, at what cost, for whom?”</p>
<hr />
<h2>Canaries in the Coal Mine</h2>
<p>If failure rates are the symptom, Erik Brynjolfsson’s work offers a more precise diagnosis. <a href="https://digitaleconomy.stanford.edu/publications/canaries-in-the-coal-mine/">“Canaries in the Coal Mine”</a>, published in August 2025 by Stanford’s Digital Economy Lab, is among the most rigorous studies currently available on AI’s impact on the job market.</p>
<p>The paper uses payroll data from ADP, the largest payroll service provider in the United States, covering over 25 million workers. The goal is to track employment changes in AI-exposed professions.</p>
<p>What emerges is clear. Employment for software developers ages 22-25 has declined <strong>20% from the peak of late 2022</strong>, roughly since the launch of ChatGPT, through July 2025. This is not an isolated data point: early-career workers in the most AI-exposed occupations show a relative decline of 13% compared to colleagues in less exposed roles.</p>
<p>The most interesting finding, though, is the age divergence. While young workers lose ground, workers over 30 in the same high-exposure categories have seen employment growth between 6% and 12%. Brynjolfsson puts it this way: “It appears that what young workers know overlaps with what LLMs can replace.”</p>
<p>It’s not a uniform effect, but a realignment: AI is eroding entry-level positions faster than it creates new roles. The “canaries in the coal mine”—young developers and customer support staff—are already showing symptoms of a larger change.</p>
<p>When Brynjolfsson predicts the emergence of “AI economic dashboards” that track these shifts in near-real-time, he’s not speculating. He’s describing the infrastructure needed to understand what’s happening, infrastructure that doesn’t exist today but could become urgent in 2026.</p>
<hr />
<h2>The Divergence Between Adoption and Results</h2>
<p>There’s a paradox in 2025 data that deserves attention. AI adoption is accelerating: according to <a href="https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai">McKinsey</a>, the percentage of companies claiming to use AI rose from 55% in 2023 to 78% in 2024. Use of GenAI in at least one business function more than doubled, from 33% to 71%.</p>
<p>Yet, in parallel, project abandonment rates are growing instead of declining. S&amp;P Global shows a jump from 17% to 42% in a single year. The MIT NANDA report speaks of a <em>“GenAI Divide”</em>, a clear division between the 5% extracting real value and the 95% that remain stalled.</p>
<p>Many companies have gone through the phases of enthusiasm, pilots, impressive demos, and then crashed against the wall of real production. They discovered that the model works in a sandbox but not with their data; that integration into existing workflows is more complex than expected; that the ROI promised by vendors doesn’t materialize.</p>
<p>Angèle Christin, a communication sociologist and HAI senior fellow, puts it plainly: “San Francisco billboards saying ‘AI everywhere!!! For everything!!! All the time!!!’ betray a slightly manic tone.” Her prediction: we’ll see more realism about what we can expect from AI. Not necessarily the bubble bursting, but the bubble might stop inflating.</p>
<hr />
<h2>The Measurement Problem</h2>
<p>One of the most concrete, and potentially most significant, predictions comes again from Brynjolfsson. He proposes the emergence of high-frequency <em>“AI economic dashboards”</em>: tools that track, at the task and employment level, where AI is increasing productivity, where it’s displacing workers, where it’s creating new roles.</p>
<p>Today we have nothing like that. Labor market data arrives months late. Companies measure AI adoption but rarely its impact. Industry reports capture hype but not results.</p>
<p>If these dashboards do emerge in 2026, they’ll change how we talk about AI. The debate will shift from the generic “does AI have an impact?” to more precise questions: how fast is this impact spreading, who’s being left behind, which complementary investments work.</p>
<p>It’s an optimistic vision: better data leads to better decisions. But it’s also an implicit admission: today we’re navigating blind.</p>
<hr />
<h2>Healthcare and Legal: The Test Sectors</h2>
<p>Two sectors emerge from Stanford predictions as particularly relevant testbeds.</p>
<p>Nigam Shah, Chief Data Scientist at Stanford Health Care, describes a problem that anyone in the sector will recognize. Hospitals are flooded with startups wanting to sell AI solutions. “Every single proposal can be reasonable, but in aggregate they’re a tsunami of noise.”</p>
<p>According to Shah, 2026 will see systematic frameworks emerge for evaluating these solutions: technical impact, the population the model was trained on, ROI on hospital workflow, patient satisfaction, quality of clinical decisions. This is work Stanford is already doing internally, but it will need to extend to institutions with fewer technical resources.</p>
<p>Shah also signals a risk. Vendors, frustrated by hospitals’ long decision cycles, might start going directly to end users. “Free” applications for doctors and patients that bypass institutional controls. This is already happening: OpenEvidence for literature summaries, AtroposHealth for on-demand answers to clinical questions.</p>
<p>In the legal sector, Julian Nyarko predicts a similar shift. The focus will move from “does this model know how to write?” to more operational questions: accuracy, citation integrity, exposure to privilege violations. The sector is already working on specific benchmarks, like those based on <em>“LLM-as-judge”</em>, frameworks where one model evaluates another model’s output for complex tasks like multi-document summarization.</p>
<p>Healthcare and legal share a characteristic: they’re highly regulated, with severe consequences for error. If AI must prove its value anywhere, it’s where the test will be hardest. And most significant.</p>
<hr />
<h2>Track Record: How Reliable Are These Predictions?</h2>
<p>Stanford HAI publishes annual predictions going back several years. It’s worth asking how accurate they’ve been.</p>
<p>At the end of 2022, Russ Altman predicted for 2023 a <em>“shocking rollout of AI way before it’s mature or ready to go”</em>. It’s hard to find a more accurate description of what happened: ChatGPT, Bing Chat, Bard launched in rapid succession, with accuracy problems, hallucinations, embarrassing incidents. Altman had also predicted a “hit parade” of AI that wasn’t ready for prime time but launched anyway, pushed by an industry too zealous to wait. Exactly right.</p>
<p>Percy Liang, also at the end of 2022, predicted that video would be a focus of 2023 and that “we might reach the point where we can’t tell if a human or computer generated a video”. He was a year early (Sora arrived in February 2024) but the direction was correct.</p>
<p>For 2024, Altman predicted a “rise of agents” and steps toward multimodal systems. Both came true, though agents are still more promise than production reality.</p>
<p>Not all predictions came true. Expectations of U.S. Congressional action were disappointed: Biden’s Executive Order happened, but the new administration changed direction. Overall, though, Stanford HAI’s track record is reasonable: they tend to be cautious rather than enthusiastic, and technical predictions are generally well-founded.</p>
<p>This doesn’t guarantee that 2026 predictions will come true. But it means they’re worth taking seriously.</p>
<hr />
<h2>What It Means for Decision-Makers</h2>
<p>If Stanford predictions and failure rate data converge on anything, it’s this: 2026 will be the year when enterprise AI must show results, not demos.</p>
<p>For those managing tech budgets, the implications are concrete.</p>
<p>On the <strong>metrics</strong> front, AI projects must have success criteria defined before launch, not after. Not “let’s explore AI for customer service” but “reduce average ticket resolution time by 15% within 6 months, with cost-per-interaction below X”. Projects without clear metrics have a disproportionate likelihood of ending up among the 42% that get abandoned.</p>
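<p>A minimal sketch of what “metrics before launch” can look like in practice. The metric names, targets, and review values below are illustrative assumptions, not a template for any specific project:</p>
<pre><code># Success criteria for a hypothetical AI customer-service pilot,
# agreed and recorded before launch. Names and thresholds are made up.
from dataclasses import dataclass

@dataclass
class Criterion:
    metric: str
    target: float
    higher_is_better: bool

criteria = [
    Criterion("avg_resolution_hours", target=6.8, higher_is_better=False),
    Criterion("cost_per_interaction_eur", target=4.0, higher_is_better=False),
    Criterion("csat_score", target=4.1, higher_is_better=True),  # guardrail
]

def meets(c, observed):
    """True if the observed value meets the pre-registered target."""
    return observed &gt;= c.target if c.higher_is_better else observed &lt;= c.target

# At the 6-month review, compare against the targets set at launch
observed = {"avg_resolution_hours": 7.1,
            "cost_per_interaction_eur": 3.9,
            "csat_score": 4.0}
for c in criteria:
    print(c.metric, "met" if meets(c, observed[c.metric]) else "MISSED")
</code></pre>
<p>The value isn’t the code, it’s the timestamp: criteria written down before launch can’t be quietly rewritten to fit whatever the pilot happened to achieve.</p>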
<p>On the <strong>make-or-buy</strong> front, the MIT NANDA report indicates that solutions bought from specialized vendors have a 67% success rate, against 33% for internal development. This doesn’t mean internal development is always wrong, but it requires skills, data, and infrastructure that many organizations overestimate having.</p>
<p>On <strong>timing</strong>, mid-market enterprises move from pilot to production in about 90 days, according to the same report. Large enterprises take nine months or more. Bureaucracy kills AI projects more than technology does.</p>
<p>Finally, a matter of <strong>honesty</strong>. The shadow economy of AI (90% of employees use personal tools like ChatGPT for work, according to MIT NANDA) indicates that individuals have already figured out where AI works, faster than official enterprise initiatives have. Instead of fighting it, organizations could learn from this spontaneous adoption.</p>
<hr />
<h2>What’s Missing</h2>
<p>Stanford predictions have clear blind spots.</p>
<p>None of the experts mention energy consumption and AI’s environmental impact. Christin hints at it (“tremendous environmental costs of the current build-out”) but the topic isn’t developed. Yet AI data centers are becoming one of the world’s biggest energy consumers, and this will eventually factor into ROI calculations.</p>
<p>There’s also a lack of serious discussion about market concentration. Frontier models are developed by a handful of companies. This creates dependencies, influences pricing, determines who can compete. It’s a strategic factor that anyone planning AI investments should consider.</p>
<p>Landay alludes to <em>“AI sovereignty”</em>, countries wanting independence from American providers, but the topic remains superficial. It’s a rapidly evolving area with significant geopolitical implications that deserves deeper analysis.</p>
<hr />
<h2>A Shift in Tone</h2>
<p>More than individual predictions, what strikes you about the Stanford article is the tone. There’s no industry-typical enthusiasm. No promises of imminent transformation. There’s caution, demand for proof, emphasis on measurement.</p>
<p>When the co-director of a Stanford AI institute opens by saying “there will be no AGI this year,” he’s taking a stand against a dominant narrative. When economists like Brynjolfsson publish data on young workers losing employment, they’re documenting costs, not just benefits.</p>
<p>This doesn’t mean AI is overvalued or that projects should stop. It means the phase of uncritical adoption is ending. Whoever continues to invest will need to do so with calibrated expectations, defined metrics, ability to admit failure when it occurs.</p>
<p>2026, if these predictions are correct, will be the year when we discover which AI projects were sound and which were built on hype. For many organizations it will be a painful discovery. For others, an opportunity: whoever has already learned to measure, iterate, and distinguish value from promise will have a competitive advantage that generic enthusiasm cannot buy.</p>
<hr />
<h2>Sources</h2>
<p>Brynjolfsson, E., Chandar, B., &amp; Chen, R. (2025). <a href="https://digitaleconomy.stanford.edu/publications/canaries-in-the-coal-mine/"><em>Canaries in the Coal Mine: Six Facts about the Recent Employment Effects of AI</em></a>. Stanford Digital Economy Lab.</p>
<p>McKinsey &amp; Company. (2024). <a href="https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai"><em>The State of AI in 2024: Gen AI adoption spikes and starts to generate value</em></a>. McKinsey Global Institute.</p>
<p>MIT Project NANDA. (2025). <a href="https://projectnanda.org/"><em>The GenAI Divide 2025</em></a>. Massachusetts Institute of Technology.</p>
<p>RAND Corporation. (2024). <a href="https://www.rand.org/pubs/research_reports/RRA2680-1.html"><em>The Root Causes of Failure for Artificial Intelligence Projects and How They Can Succeed</em></a>. RAND Research Reports.</p>
<p>S&amp;P Global Market Intelligence. (2025, October). <a href="https://www.spglobal.com/market-intelligence/en/news-insights/research/2025/10/generative-ai-shows-rapid-growth-but-yields-mixed-results"><em>Generative AI Shows Rapid Growth but Yields Mixed Results</em></a>. S&amp;P Global.</p>
<p>Stanford HAI. (2025, December). <a href="https://hai.stanford.edu/news/stanford-ai-experts-predict-what-will-happen-in-2026"><em>Stanford AI Experts Predict What Will Happen in 2026</em></a>. Stanford Human-Centered Artificial Intelligence.</p>
]]></content:encoded>
      <category>Business</category>
      <category>Research</category>
      <category>Other</category>
      <category>Stanford HAI</category>
      <category>AI 2026</category>
      <category>Enterprise AI</category>
      <category>Market Analysis</category>
      <category>AI Predictions</category>
      <atom:link rel="alternate" hreflang="it" href="https://ireneburresi.dev/blog/business/ai-2026-anno-resa-conti/"/>
    </item>
    <item>
      <title>You&apos;re Measuring AI Wrong</title>
      <link>https://ireneburresi.dev/en/blog/business/misurare-ia/</link>
      <guid isPermaLink="true">https://ireneburresi.dev/en/blog/business/misurare-ia/</guid>
      <pubDate>Sat, 20 Dec 2025 00:00:00 GMT</pubDate>
      <dc:creator>Irene Burresi</dc:creator>
      <dc:language>en</dc:language>
      <description><![CDATA[<p>60% of managers mismeasure AI because they track hours saved, not impact. Segment by role, separate augmentative from substitutive use, and monitor weekly.</p>]]></description>
      <content:encoded><![CDATA[<h2>The measurement paradox</h2>
<p><em>60% of managers admit they need better KPIs for AI. Only 34% are doing anything about it. Meanwhile, the data that actually matters already exists, but nobody’s looking at it.</em></p>
<p><strong>TL;DR:</strong> Companies measure activity (hours saved, tasks automated) instead of impact. A Stanford paper analyzing 25 million workers shows what to do instead: segment by role and seniority, distinguish substitutive from augmentative use, use control groups, monitor in real time. Those who adopt these principles will have an information advantage over those still tracking vanity metrics.</p>
<hr />
<p>The 2025 AI adoption reports tell a strange story. On one hand, companies claim to measure everything: completed deployments, hours saved, tickets handled, costs reduced. On the other, <a href="https://www.spglobal.com/market-intelligence/en/news-insights/research/2025/10/generative-ai-shows-rapid-growth-but-yields-mixed-results">42% are abandoning most of their AI projects</a>, more than double the previous year. According to <a href="https://projectnanda.org/">MIT NANDA</a>, <strong>95% of pilot projects</strong> generate no measurable impact on the bottom line.</p>
<p>If we measure so much, why do we fail so often?</p>
<p>The problem is we’re measuring the wrong things. Typical enterprise AI metrics (time saved per task, volume of automated interactions, cost per query) capture activity, not impact. They tell you whether the system works technically, not whether it’s creating or destroying value.</p>
<p>A paper published in August 2025 by Stanford’s Digital Economy Lab offers a different approach to what it means to truly measure AI. And the implications for those managing technology investments are concrete.</p>
<hr />
<h2>The vanity metrics problem</h2>
<p>Most corporate AI dashboards track variants of the same metrics: how many requests processed, how much time saved per interaction, what percentage of tasks automated. These are numbers that grow easily and look good in slides. Their flaw is fundamental: they say nothing about real business impact.</p>
<p>A chatbot handling 10,000 tickets per month looks like a success. But if those tickets still require human escalation 40% of the time, if customer satisfaction has dropped, if your most profitable customers are migrating to competitors, the number of tickets handled captures none of this.</p>
<p>The S&amp;P Global 2025 report documents exactly this pattern: companies that accumulated “deployments” and “completed experiments” only to discover, months later, that ROI wasn’t materializing. Costs were real and immediate; benefits were vague and perpetually deferred to next quarter.</p>
<p>According to an MIT Sloan analysis, <strong>60% of managers recognize they need better KPIs</strong> for AI. But only 34% are actually using AI to create new performance indicators. The majority continues using the same metrics they used for traditional IT projects, metrics designed for deterministic software, not for probabilistic systems interacting with complex human processes.</p>
<hr />
<h2>What serious measurement looks like</h2>
<p><a href="https://digitaleconomy.stanford.edu/publications/canaries-in-the-coal-mine/">“Canaries in the Coal Mine”</a>, the paper by Erik Brynjolfsson, Bharat Chandar, and Ruyu Chen published by Stanford’s Digital Economy Lab, isn’t about how companies should measure AI. It’s about how AI is changing the labor market. But the method it uses is exactly what’s missing from most enterprise evaluations.</p>
<p>The authors obtained access to payroll data from ADP, the largest payroll processor in the United States, with monthly records of over 25 million workers. Not surveys, not self-reports, not estimates: granular administrative data on who gets hired, who leaves, how much they earn, in which role, at which company.</p>
<p>They then cross-referenced this data with two AI exposure metrics: one based on theoretical task analysis (which jobs are technically automatable) and one based on actual usage data (how people actually use Claude, Anthropic’s model, in daily work).</p>
<p>The result is an X-ray of AI’s impact with unprecedented granularity. Not the generic “AI is changing work” but precise numbers: employment for software developers aged 22-25 dropped <strong>20% from the late 2022 peak</strong>, while for those over 35 in the same roles it grew 8%. In professions where AI use is predominantly substitutive, young workers lose employment; where it’s predominantly augmentative, there’s no decline.</p>
<p>This type of measurement should inform corporate AI decisions. Not because companies need to replicate this exact study, but because it illustrates three principles that most enterprise metrics ignore entirely.</p>
<hr />
<h2>Measure differential effects, not averages</h2>
<p>Aggregate data hides more than it reveals. If you only measure “hours saved by AI,” you don’t see who’s saving those hours and who’s losing their job. If you only measure “tickets automated,” you don’t see which customers are receiving worse service.</p>
<p>The Stanford paper shows that AI’s impact differs radically by age group. Workers aged 22-25 in exposed professions saw a 13% employment decline relative to colleagues in less exposed roles. Workers over 30 in the same professions saw growth. The average effect is nearly zero, but the real effect is massive redistribution.</p>
<p>For a CFO, aggregate productivity metrics can mask hidden costs. If AI is increasing output from the senior team while making it impossible to hire and train juniors, the short-term gain could transform into a talent pipeline problem in the medium term. The paper calls it the <em>“apprenticeship paradox”</em>: companies stop hiring entry-level workers because AI handles those tasks better, but without entry-level today there won’t be seniors tomorrow.</p>
<p>The operational consequence is that every AI dashboard should segment impact by role, seniority, team, and customer type. A single “productivity” number is almost always misleading.</p>
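<p>As a sketch of what that segmentation means in practice, here is the same comparison in a few lines of Python; every number below is made up for illustration:</p>
<pre><code>import pandas as pd

# Toy data: one row per person, with the change in output after AI rollout.
df = pd.DataFrame({
    "role":      ["dev", "dev", "dev", "dev", "support", "support"],
    "seniority": ["junior", "junior", "senior", "senior", "junior", "senior"],
    "output_change_pct": [-12.0, -8.0, 9.0, 11.0, -5.0, 4.0],
})

# The single average says almost nothing happened...
print("overall:", round(df["output_change_pct"].mean(), 1))

# ...while the segmented view shows the redistribution the paper describes
print(df.groupby(["role", "seniority"])["output_change_pct"].mean())
</code></pre>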
<hr />
<h2>Distinguish substitutive from augmentative use</h2>
<p>One of the paper’s most relevant findings concerns the difference between substitutive and augmentative AI use. The authors used Anthropic’s data to classify how people actually use language models: to generate final outputs (substitution) or to iterate, learn, and validate (augmentation).</p>
<p>In professions where use is predominantly substitutive, youth employment has collapsed. Where use is predominantly augmentative, there’s no decline; in fact, some of these categories show above-average growth.</p>
<p>Not all “deployments” are equal. A system that automatically generates financial reports substitutes human labor differently from one that helps analysts explore scenarios. Metrics should capture this distinction: classify each AI application as predominantly substitutive or augmentative, separately track impact on headcount, skill mix, and internal training capacity. Augmentative systems might have less immediate ROI but more sustainable effects.</p>
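<p>A sketch of what that classification could look like as data. The inventory, the labels, and the proxy columns are assumptions on my part, not the paper’s; the substitutive/augmentative tag is a judgment call no tool can assign for you:</p>
<pre><code>import pandas as pd

# Hypothetical inventory of AI applications, tagged by whether each mostly
# replaces a human task or supports one. All numbers are illustrative.
apps = pd.DataFrame({
    "application": ["report generation", "ticket auto-reply",
                    "scenario explorer", "code review assistant"],
    "mode": ["substitutive", "substitutive",
             "augmentative", "augmentative"],
    "headcount_delta": [-3, -5, 0, 1],  # change in roles touched by the app
    "juniors_hired":   [0, 0, 2, 3],    # proxy for training-pipeline health
})

# Tracking the two modes separately surfaces what a blended average hides
print(apps.groupby("mode")[["headcount_delta", "juniors_hired"]].sum())
</code></pre>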
<hr />
<h2>Control for external shocks</h2>
<p>One of the Stanford paper’s most sophisticated methodological aspects is its use of firm-time fixed effects. In practice, the authors compare workers within the same company in the same month, thus isolating the AI exposure effect from any other factor affecting the company: budget cuts, sector slowdown, strategy changes.</p>
<p>The result: even controlling for all these factors, young workers in AI-exposed roles show a relative decline of <strong>16%</strong> compared to colleagues in non-exposed roles at the same company.</p>
<p>This kind of rigor is rare in corporate evaluations. When an AI project launches and costs drop, it’s easy to credit the AI. But maybe costs would have dropped anyway due to seasonal factors. Maybe the team was already optimizing before the launch. Maybe the comparison is with an anomalous period.</p>
<p>The solution is to define baselines and control groups before launch. Don’t compare “before vs after” but “treated vs untreated” in the same period. Use A/B tests where possible, or at least comparisons with teams, regions, or segments that haven’t adopted AI.</p>
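<p>A minimal difference-in-differences sketch of that “treated vs untreated” logic, with toy numbers and hypothetical team labels:</p>
<pre><code>import pandas as pd

# Toy monthly cost data: teams A and B adopted the AI tool ("treated"),
# comparable teams C and D did not. All numbers are illustrative.
df = pd.DataFrame({
    "team":    ["A", "A", "B", "B", "C", "C", "D", "D"],
    "treated": [1, 1, 1, 1, 0, 0, 0, 0],
    "period":  ["pre", "post"] * 4,
    "cost":    [100.0, 88.0, 110.0, 97.0, 105.0, 99.0, 95.0, 90.0],
})

means = df.groupby(["treated", "period"])["cost"].mean().unstack()
change_treated = means.loc[1, "post"] - means.loc[1, "pre"]
change_control = means.loc[0, "post"] - means.loc[0, "pre"]

# Naive before/after credits the AI with the whole drop...
print("before vs after, treated only:", round(change_treated, 1))
# ...subtracting the control change removes what would have happened anyway
print("difference-in-differences:", round(change_treated - change_control, 1))
</code></pre>
<p>The point isn’t the arithmetic, which is trivial. It’s that the control group has to be defined before launch, or it will be chosen after the fact to flatter the project.</p>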
<hr />
<h2>Toward high-frequency economic dashboards</h2>
<p>In his predictions for 2026, Brynjolfsson proposed the idea of <em>“AI economic dashboards”</em>, tools that track AI’s economic impact in near real-time, updated monthly instead of with the typical delays of official statistics.</p>
<p>It’s an ambitious proposal at the macro level. But the underlying logic is applicable at the company level: stop waiting for quarterly reports to understand if AI is working and instead build continuous monitoring systems that capture effects as they emerge.</p>
<p>Most AI projects are evaluated like traditional investments: ex-ante business case, periodic reviews, final post-mortem. But AI doesn’t behave like a traditional asset. Its effects are distributed, emergent, often unexpected. A continuous monitoring system can catch drift before it becomes a problem.</p>
<p>In practice, this means working with real-time data instead of retrospective data. If the payroll system can tell you today how many people were hired yesterday in each role, you can track AI’s effect on headcount with a lag of days, not months. The same applies to tickets handled, sales closed, errors detected.</p>
<p>Another key principle: favor leading metrics over lagging ones. The actual utilization rate (how many employees actually use the AI tool every day) is a leading indicator. If it drops, there are problems before they show up in productivity numbers.</p>
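<p>A sketch of that leading indicator, assuming you can export per-employee usage events from the tool. The event source, the headcount, and the alert threshold are all assumptions:</p>
<pre><code>import pandas as pd

# Toy usage log: one row per employee per snapshot date on which they
# actually used the AI tool. In practice this comes from a usage export.
usage = pd.DataFrame({
    "date": pd.to_datetime(
        ["2026-03-02"] * 60 + ["2026-03-09"] * 48 + ["2026-03-16"] * 31
    ),
    "employee_id": list(range(60)) + list(range(48)) + list(range(31)),
})
HEADCOUNT = 100       # employees with access to the tool (assumed)
ALERT_THRESHOLD = 0.40

active_rate = usage.groupby("date")["employee_id"].nunique() / HEADCOUNT
print(active_rate)

# A sustained drop shows up here weeks before it shows up in
# productivity or cost numbers.
for date, rate in active_rate.items():
    if rate &lt; ALERT_THRESHOLD:
        print(f"ALERT {date.date()}: utilization at {rate:.0%}")
</code></pre>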
<p>Just as the Stanford paper segments by age, corporate dashboards should segment by role, tenure, and prior performance. AI might help top performers while harming others, or vice versa.</p>
<p>Internal comparisons are also essential: teams that adopted AI vs teams that didn’t, periods with the feature active vs periods with it deactivated. These comparisons are more informative than pure time trends.</p>
<hr />
<h2>The cost of not measuring</h2>
<p>There’s a direct economic argument for investing in better measurement. The 42% of companies that abandoned AI projects in 2025 spent budget, time, and management attention only to get nothing. With better metrics, some of those projects would have been stopped earlier. Others would have been corrected mid-course. Others still would never have started.</p>
<p>The MIT NANDA report estimates that companies are spending <strong>$30-40 billion per year</strong> on generative AI. If 95% generates no measurable ROI, we’re talking about tens of billions burned. Not because the technology doesn’t work, but because it’s applied poorly, measured worse, and therefore never corrected.</p>
<p>The Brynjolfsson paper offers a model of what AI measurement could be. Administrative data instead of surveys. Demographic granularity instead of aggregate averages. Rigorous controls instead of naive comparisons. Continuous monitoring instead of point-in-time evaluations.</p>
<p>No company has Stanford’s resources or access to ADP’s data. But the principles are transferable: segment, distinguish substitutive from augmentative use, control for confounding factors, monitor in real time. Those who adopt these principles will have an information advantage over those who continue tracking deployments and hours saved.</p>
<hr />
<h2>Sources</h2>
<p>Brynjolfsson, E., Chandar, B., &amp; Chen, R. (2025). <a href="https://digitaleconomy.stanford.edu/publications/canaries-in-the-coal-mine/"><em>Canaries in the Coal Mine: Six Facts about the Recent Employment Effects of AI</em></a>. Stanford Digital Economy Lab.</p>
<p>Deloitte AI Institute. (2025). <a href="https://www2.deloitte.com/us/en/pages/consulting/articles/state-of-generative-ai-in-enterprise.html"><em>State of Generative AI in the Enterprise</em></a>. Deloitte.</p>
<p>MIT Project NANDA. (2025). <a href="https://projectnanda.org/"><em>The GenAI Divide 2025</em></a>. Massachusetts Institute of Technology.</p>
<p>MIT Sloan Management Review. (2024). <a href="https://sloanreview.mit.edu/projects/the-future-of-strategic-measurement-enhancing-kpis-with-ai/"><em>The Future of Strategic Measurement: Enhancing KPIs With AI</em></a>. MIT Sloan.</p>
<p>S&amp;P Global Market Intelligence. (2025, October). <a href="https://www.spglobal.com/market-intelligence/en/news-insights/research/2025/10/generative-ai-shows-rapid-growth-but-yields-mixed-results"><em>Generative AI Shows Rapid Growth but Yields Mixed Results</em></a>. S&amp;P Global.</p>
]]></content:encoded>
      <category>Business</category>
      <category>Research</category>
      <category>Methodology</category>
      <category>KPI</category>
      <category>Metrics</category>
      <category>AI Measurement</category>
      <category>Enterprise AI</category>
      <category>ROI</category>
      <atom:link rel="alternate" hreflang="it" href="https://ireneburresi.dev/blog/business/misurare-ia/"/>
    </item>
    <item>
      <title>AI Sovereignty: Europe&apos;s Decision Point</title>
      <link>https://ireneburresi.dev/en/blog/governance/ia-sovranit%C3%A0/</link>
      <guid isPermaLink="true">https://ireneburresi.dev/en/blog/governance/ia-sovranit%C3%A0/</guid>
      <pubDate>Sat, 20 Dec 2025 00:00:00 GMT</pubDate>
      <dc:creator>Irene Burresi</dc:creator>
      <dc:language>en</dc:language>
      <description><![CDATA[<p>Sovereign clouds, national models, and US hyperscalers define three ideas of AI sovereignty. Europe must decide which infrastructure and governance to fund.</p>]]></description>
      <content:encoded><![CDATA[<h2>Full English Translation Coming Soon</h2>
<p>This comprehensive analysis of AI sovereignty, European choices, and geopolitical implications will be fully translated soon.</p>
<p>The article covers:</p>
<ul>
<li>Four definitions of AI sovereignty (legality, economic competitiveness, national security, value alignment)</li>
<li>Two operational models for achieving sovereignty</li>
<li>Gulf states’ AI infrastructure investments</li>
<li>Europe’s position between American providers and sovereign models</li>
<li>GAIA-X and EU cloud sovereignty initiatives</li>
</ul>
<p>For now, please refer to the Italian version for the complete content.</p>
]]></content:encoded>
      <category>Governance</category>
      <category>Business</category>
      <category>Other</category>
      <category>AI Sovereignty</category>
      <category>EU AI Act</category>
      <category>GAIA-X</category>
      <category>Geopolitics</category>
      <category>Europe</category>
      <atom:link rel="alternate" hreflang="it" href="https://ireneburresi.dev/blog/governance/ia-sovranit%C3%A0/"/>
    </item>
    <item>
      <title>Why replacing people doesn&apos;t fix the team</title>
      <link>https://ireneburresi.dev/en/blog/methodology/debugging-organizzativo-hackman/</link>
      <guid isPermaLink="true">https://ireneburresi.dev/en/blog/methodology/debugging-organizzativo-hackman/</guid>
      <pubDate>Sat, 22 Mar 2025 00:00:00 GMT</pubDate>
      <dc:creator>Irene Burresi</dc:creator>
      <dc:language>en</dc:language>
      <description><![CDATA[<p>Not "who isn't working" but "what in the structure isn't working." Six misdiagnoses flipped using Hackman's framework. The first debug is on the structure.</p>]]></description>
      <content:encoded><![CDATA[<p>Marco’s team isn’t working. Everyone knows it. The diagnosis comes fast: “Marco isn’t motivated.” “Sara doesn’t communicate.” The plan: a one-on-one with Marco to figure out what’s blocking him, a team building workshop, and if things don’t improve, swap someone out.</p>
<p>Three months later, Marco is gone. Luca replaced him. The team still isn’t working.</p>
<p>The scene keeps repeating because the diagnosis is wrong. Not wrong in an obvious way — wrong in a plausible way, which is worse. “Marco isn’t motivated” sounds reasonable. Everyone nods. Too bad the problem isn’t Marco.</p>
<p>Social psychology has a name for this: <em>fundamental attribution error</em>. Ross described it in 1977: when we observe behavior, we tend to attribute it to personal traits (laziness, poor collaboration) while underestimating the weight of the situation the person is in.</p>
<p>J. Richard Hackman spent forty years studying teams: flight crews, orchestras, intelligence squads. The bottom line of his research: structural conditions explain up to 80% of variance in team effectiveness (<a href="https://doi.org/10.1177/0021886305281984">Wageman, Hackman &amp; Lehman, 2005</a>). When a team isn’t working, the most likely problem isn’t who’s in it. It’s how it was designed.</p>
<p>Before replacing people, check the conditions.</p>
<hr />
<h2>The five conditions: a checklist, not a theory</h2>
<p>Hackman didn’t produce an abstract model. He produced a checklist. Five conditions that, when present, make it likely a team will work. When absent, make it likely it won’t. He validated them empirically across hundreds of teams in different contexts. They work as a diagnostic tool long before they work as theory.</p>
<p>Here’s a quick translation into software context. For the full picture, start with the <a href="https://ireneburresi.dev/blog/methodology/hackman-real-team-software/">previous piece on Hackman’s team vs. working group distinction</a>.</p>
<p><strong>1. Being a real team.</strong> Clear boundaries, stable composition, real interdependence. If even one of these three is missing, you don’t have a team. You have a group of people with a shared manager.</p>
<p><strong>2. A compelling direction.</strong> A clear, challenging, meaningful objective. Not “close the sprint stories” — that’s a unit of measurement, not a direction. A compelling direction answers the question: why does this work matter?</p>
<p><strong>3. An enabling structure.</strong> Roles, norms, skills. The minimum infrastructure so people don’t have to renegotiate everything from scratch every week.</p>
<p><strong>4. A supportive organizational context.</strong> Does the team have the information it needs? The resources? Does the reward system incentivize group outcomes or individual performance? This is the condition the team can’t give itself. It depends on the organization around it.</p>
<p><strong>5. Competent process coaching.</strong> Not technical mentoring: someone who helps the team work better <em>together</em>. How they make decisions, handle disagreements, distribute work. It’s the condition that matters least of the five (the 10% in Hackman’s 60/30/10), but it’s the one everyone focuses on.</p>
<p>The instinct when managing teams is to start at the bottom: coaching, facilitation, interpersonal dynamics. Hackman says: start at the top.</p>
<hr />
<h2>The table that flips the diagnosis</h2>
<p>This is the core of the argument. Six phrases I hear repeated constantly. For each one: the usual diagnosis, the most likely structural cause, and where to look instead.</p>
<p>Some of these mappings are directly anchored to Hackman’s research. Others are my translation into the software context, and I flag them as such.</p>
<h3>“Marco isn’t motivated”</h3>
<p>It’s Marco’s problem, people say. He lacks drive, ownership. Maybe he’s not right for the role.</p>
<p>More likely, the team lacks a <strong>compelling direction</strong> (condition 2). If the team’s objective is vague, purely administrative, or changes every two weeks, motivation isn’t a trait of Marco’s. It’s a rational response to a context that gives no reason to invest energy. I’ve seen brilliant developers seem disengaged because their team’s mandate was “support the product” — which in practice meant answering tickets endlessly with no vision of where things were heading.</p>
<p>Don’t look at Marco. Look at the team’s mandate. If you can’t explain in one sentence why the work matters, that’s your problem.</p>
<h3>“Sara doesn’t communicate”</h3>
<p>Where to look here is at work design. If everyone works on a separate feature and interactions are limited to the occasional comment on a pull request, there’s no structural reason to communicate more. Sara isn’t uncollaborative. She’s rational: why would she update others on work that doesn’t affect them?</p>
<p>The cause is a lack of <strong>real interdependence</strong> (condition 1). You can run all the communication workshops you want: if the work structure doesn’t create mutual dependencies, communication will remain an empty ritual.</p>
<h3>“The team doesn’t trust each other”</h3>
<p>The classic response: team building activities, vulnerability exercises, an offsite to get to know each other better.</p>
<p>Trust in a team doesn’t come from an afternoon workshop. It comes from working together long enough to predict how the other person will behave. Hackman saw this with flight crews: crews that had flown together for a while made fewer errors than newly formed ones, even with less experienced pilots. The most likely problem is <strong>unstable composition</strong> (condition 1). If you rotate people between teams every quarter, you restart from zero each time.</p>
<p>How many times has the team’s composition changed in the past year? If the answer is “often,” the trust deficit isn’t a relationship problem. It’s structural turnover.</p>
<h3>“Retros don’t produce anything”</h3>
<p>Here the question comes before the answer: is there a shared process to improve? Retrospectives assume a team that collaborates on a common outcome and wants to improve how they do it. If everyone has their own workflow, priorities, and blockers, the retro has no object. The format is irrelevant. The cause is that the group isn’t a <strong>real team</strong> (condition 1).</p>
<h3>“The lead can’t guide the team”</h3>
<p>The usual diagnosis: they lack soft skills. Send them to a leadership course.</p>
<p>The lead might be excellent. But if the team doesn’t have access to the information it needs, if resources get cut without notice, if the evaluation system rewards individual performance and ignores group outcomes, the lead is in an impossible position. It’s like asking a pilot to land well with a poorly designed airplane — you can send them to every flight course there is, but if the flaps don’t work, the problem isn’t their technique.</p>
<p>The cause is the <strong>organizational context</strong> (condition 4). Does the lead have the authority to make decisions? Are priorities stable enough to allow a plan? If not, the leverage is in the organization, not the person.</p>
<h3>“Too much conflict”</h3>
<p>Here the answer is less straightforward. Conflict in a team can be a good sign: if the team is real and negotiating norms for working together (condition 3), some friction is natural. Hackman says it clearly: the best teams aren’t conflict-free. They’re teams that have learned to manage conflict.</p>
<p>But conflict can also signal <strong>unclear boundaries</strong> (condition 1): who decides what? Who has authority over which part of the system? If two people both think they’re responsible for the same area, the conflict isn’t relational. It’s structural.</p>
<p>The useful distinction: if the conflict is about <em>how</em> to do things, it’s probably healthy. If it’s about <em>who</em> should do what, it’s a boundary problem. And if it’s chronic despite everyone’s best intentions, it’s almost certainly not a people problem.</p>
<hr />
<h2>Why we keep getting the diagnosis wrong</h2>
<p>If structural conditions matter this much, why does the default remain “someone’s fault”?</p>
<p>The first reason is that <strong>structure is invisible</strong>. You see people. You see behaviors. You don’t see the conditions in which those behaviors emerge. Hackman pressed this point in the last paper of his career, “From causes to conditions” (<a href="https://onlinelibrary.wiley.com/doi/full/10.1002/job.1774">Hackman, 2012</a>): research on teams has focused too much on internal causes (who does what, how they behave) and too little on external conditions (how the team is designed, what context it operates in). The same applies to people managing teams in practice.</p>
<p>The second is that <strong>replacing people feels easier</strong>. Giving Marco feedback, sending Sara to a workshop, swapping the lead: these are all actions a manager can take tomorrow morning. Redesigning the team’s mandate, stabilizing composition, changing the incentive system — those require time, authority, and often negotiation with someone above. The interpersonal diagnosis is attractive because it has solutions at hand. Too bad they’re the wrong solutions.</p>
<p>The third is more insidious, and I’ve seen it up close: <strong>the organization incentivizes individual diagnosis</strong>. Performance reviews evaluate people, not conditions. PIPs apply to people. “Cultural fit” is an attribute of people. The entire management apparatus is built around the idea that performance is an individual trait. Admitting the problem is structural means admitting the system is poorly designed — and the person who designed the system is often the one making the diagnosis.</p>
<hr />
<p>Next time someone says “the problem is Marco,” pause for a moment. Not because Marco is necessarily innocent: sometimes the problem really is the person. But it’s less common than we think, and the interpersonal diagnosis is so intuitive we always get there first.</p>
<p>Try reframing. Not “who isn’t working?” but “what in the structure isn’t working?” Move from person to condition. Then ask yourself: is this condition under my control? If yes, that’s where the energy goes. If not, at least you know that no team building workshop will fix it.</p>
<p>It’s not a small difference. It’s the difference between debugging the code and debugging the compiler.</p>
<hr />
<h2>Sources</h2>
<ul>
<li>Hackman, J.R. (2002). <em>Leading Teams: Setting the Stage for Great Performances</em>. Harvard Business School Press.</li>
<li>Hackman, J.R. (2011). <em>Collaborative Intelligence: Using Teams to Solve Hard Problems</em>. Berrett-Koehler.</li>
<li>Hackman, J.R. (2012). <a href="https://onlinelibrary.wiley.com/doi/full/10.1002/job.1774">From causes to conditions in group research</a>. <em>Journal of Organizational Behavior</em>, 33, 428-444.</li>
<li>Wageman, R., Hackman, J.R. &amp; Lehman, E.V. (2005). <a href="https://doi.org/10.1177/0021886305281984">Team Diagnostic Survey: Development of an Instrument</a>. <em>Journal of Applied Behavioral Science</em>, 41(4), 373-398.</li>
<li>Ross, L. (1977). The intuitive psychologist and his shortcomings: Distortions in the attribution process. In L. Berkowitz (Ed.), <em>Advances in Experimental Social Psychology</em> (Vol. 10, pp. 173-220). Academic Press.</li>
</ul>
]]></content:encoded>
      <category>Methodology</category>
      <category>Business</category>
      <category>Team Design</category>
      <category>Hackman</category>
      <category>Organizational Design</category>
      <category>Team Effectiveness</category>
      <category>Fundamental Attribution Error</category>
      <category>Diagnostics</category>
      <atom:link rel="alternate" hreflang="it" href="https://ireneburresi.dev/blog/methodology/debugging-organizzativo-hackman/"/>
    </item>
    <item>
      <title>Frustrated with Agile? Maybe your team isn&apos;t actually a team</title>
      <link>https://ireneburresi.dev/en/blog/methodology/hackman-real-team-software/</link>
      <guid isPermaLink="true">https://ireneburresi.dev/en/blog/methodology/hackman-real-team-software/</guid>
      <pubDate>Sat, 15 Mar 2025 00:00:00 GMT</pubDate>
      <dc:creator>Irene Burresi</dc:creator>
      <dc:language>en</dc:language>
      <description><![CDATA[<p>Hackman's distinction between real teams and working groups explains why standups, retros, and planning feel pointless. The problem isn't Agile — it's a structural mismatch.</p>]]></description>
      <content:encoded><![CDATA[<p>The standup takes twelve minutes. Five people, each reciting their update while staring at some vague point on the screen. Nobody comments on what anyone else says, because there’s no reason to: everyone is working on their own feature, in a different corner of the codebase. The standup ends, everyone goes back to doing exactly what they would have done without it. Then the retro. Three sticky notes come out of it: “improve communication,” same as last month. And sprint planning, which is really just individual task assignment with a two-week timer.</p>
<p>If this sounds familiar, you’re not alone. The <a href="https://digital.ai/resource/state-of-agile-report/">17th State of Agile Report</a> by <a href="https://digital.ai/">Digital.ai</a> (2024) found that only 11% of practitioners report being “very satisfied” with Agile practices in their organization. The frustration runs deep enough that two original signatories of the Agile Manifesto have publicly turned against what Agile has become: Ron Jeffries, co-creator of Extreme Programming, wrote that developers should <a href="https://ronjeffries.com/articles/018-01ff/abandon-1/">abandon Agile</a>, or at least the version organizations have made of it. Dave Thomas, another signatory, declared that <a href="https://pragdave.me/thoughts/active/2014-03-04-time-to-kill-agile.html">Agile is dead</a>, hollowed out by marketing and mass certification.</p>
<p>The most common diagnosis is that Agile has become bureaucracy: too many rituals, too much process, not enough code. It makes sense. Who hasn’t thought that at least once, walking out of yet another endless planning session?</p>
<p>But there’s another possibility. One that has nothing to do with Agile itself, but with something more fundamental: the structure you’re applying it to.</p>
<p>J. Richard Hackman spent 40 years studying real teams: flight crews, orchestras, intelligence teams, surgical teams. His research, condensed in <em>Leading Teams</em> (2002) and <em>Collaborative Intelligence</em> (2011), arrives at a distinction that almost nobody in software makes: the distinction between a <strong>team</strong> and a <strong>working group</strong>. They are different things. They work differently. And they require different tools.</p>
<p>Agile practices are designed for teams. Apply them to a working group and you get exactly the frustration you’re feeling. The problem isn’t the method. It’s a structural mismatch.</p>
<hr />
<h2>The most overused word in software</h2>
<p>In software, “team” is the default word for any group of people working on the same project. Five developers sharing a Jira board? Team. Two backend engineers, a frontend dev, and a designer reporting to the same manager? Team. Eight people across three time zones who meet at the 9 AM standup? Team.</p>
<p>Hackman would disagree. In <em>Leading Teams</em> he defines a “real team” through three minimum properties, all required.</p>
<p>The first is <strong>clear boundaries</strong>: everyone knows who is on the team and who is not. Sounds obvious, but in software practice it isn’t. The designer “shared” across three teams — in or out? The developer “on loan” for two sprints — a member? If you can’t make the list without hesitating, the boundaries aren’t clear.</p>
<p>The second is <strong>stable composition</strong>. People stay the same long enough to develop shared ways of working. Hackman studied flight crews: NASA data, which he reports in <em>Leading Teams</em>, showed that newly formed crews made more errors than those who had been flying together for a while, even when the newer crews had more experienced pilots. In software, quarterly rotation of people between teams destroys this effect. Every time, you start from zero.</p>
<p>The third, and most underrated, is <strong>real interdependence</strong>. The team’s output depends on collaboration between members — it’s not the sum of individual contributions. This is where most software “teams” fall apart. Five developers working on five independent features, with five cross-reviews done as a formality, are not interdependent. One person’s work doesn’t change another’s. If you removed all the meetings and put them in separate rooms, the result would be the same.</p>
<p>The test is brutal in its simplicity: if you eliminated standups, retros, and planning tomorrow, and everyone worked on their own, would the final product suffer? If the answer is no — if the result would be identical, maybe even faster without the interruptions — then what you have is not a team. It’s a group of individuals with a shared manager.</p>
<p>That’s not an insult. It’s a diagnosis. And the diagnosis is the first step toward stopping the use of the wrong tools.</p>
<hr />
<h2>Structural conditions: the 60/30/10</h2>
<p>Hackman didn’t stop at the definition. He studied what makes a team effective, and the answer is less intuitive than you’d think.</p>
<p>Research by Hackman and his collaborator Ruth Wageman identifies five conditions that enable team performance: being a real team, having a compelling direction, an enabling structure, a supportive organizational context, and competent coaching. I won’t go deep on all of them here — each deserves its own article. The relevant point is different: how much do these conditions matter compared to what a leader does day to day?</p>
<p>Hackman’s answer, laid out in <em>Collaborative Intelligence</em> (2011), is the <strong>60/30/10</strong>: 60% of a team’s effectiveness depends on design — the structural conditions put in place before the team starts working. 30% depends on launch — how the team is kicked off, the first days, the initial norms. The remaining 10% on ongoing coaching.</p>
<p>A clarification: Hackman presents this split as a “best estimate,” not as the result of a single study yielding those exact percentages. It’s a heuristic that synthesizes decades of research. Not a precise data point.</p>
<p>But the order of magnitude has solid empirical grounding. The Team Diagnostic Survey (TDS), developed by Wageman, Hackman, and Lehman and published in 2005, was administered to 2,474 people across 321 teams. The finding: structural conditions explain up to 80% of the variance in team effectiveness (<a href="https://doi.org/10.1177/0021886305281984">Wageman, Hackman &amp; Lehman, 2005</a>). Not 20%. Not 50%. Eighty percent.</p>
<p>An earlier study by Wageman (2001) on <a href="https://doi.org/10.1287/orsc.12.5.559.10094">43 self-managing teams at Xerox</a> had already shown the same pattern: a leader’s design activities influenced team performance. Day-to-day coaching activities did not.</p>
<p>The implication for software is direct. When a team isn’t working, the instinctive reaction is to work on dynamics: facilitate retros better, improve communication, do team building. Hackman’s framework suggests the most powerful lever is upstream. Who is on the team? What is the mandate? How is the work designed? Does the organizational context support it? If these conditions aren’t there, no amount of facilitation will compensate.</p>
<p>But the first of those five conditions — “being a real team” — opens a question that almost nobody in software asks.</p>
<hr />
<h2>Team or working group: the distinction that changes everything</h2>
<p>A point Hackman makes often, and that gets misunderstood just as often: the distinction between team and working group is not a value judgment. It’s not “team = good, working group = bad.” They are two different organizational modes, each with its own strengths.</p>
<p>A <strong>working group</strong> is a set of people reporting to the same manager who may coordinate, but whose output is primarily individual. Everyone has their own goals, responsibilities, deliverables. The manager coordinates, assigns, removes obstacles. Deep interdependence is not required.</p>
<p>A <strong>team</strong> produces collective output. The result cannot be decomposed into the sum of individual contributions: it requires continuous collaboration, shared decisions, mutual adjustment. The cost is higher — you need stable boundaries, shared norms, time to develop ways of working together. But for certain kinds of problems, it’s the only configuration that works.</p>
<p>The damage comes from confusing the two.</p>
<p>A working group managed as a team generates overhead without benefit. Coordination meetings exist, but there’s nothing substantial to coordinate. Retrospectives don’t produce actions because there’s no shared process to improve: everyone has their own workflow, priorities, blockers. The Agile ceremony becomes a fixed cost on work that doesn’t require it.</p>
<p>The damage goes both ways, though. A real team managed as a working group is equally dysfunctional. If you have people who need to collaborate on a complex problem and treat them as individual contributors — assigning separate tasks, evaluating them individually, without protecting time for joint work — you’re undermining the one thing that makes that team effective: interdependence.</p>
<p>I’ve seen both scenarios. The first is more common in software, because the default organization tends to be the working group (developers assigned to individual features), while the process infrastructure is almost always that of a team (Scrum, Kanban with standups, retros, planning).</p>
<p>The operational question isn’t “how do I improve my team?” It’s more fundamental: is what I’m leading a team or a working group? The answer changes everything that follows.</p>
<hr />
<h2>The Agile mismatch: right tools, wrong structure</h2>
<p>Back to the frustration we started with. Agile practices — Scrum in particular — didn’t emerge in a vacuum. They’re designed around a specific assumption: that a small group of people works interdependently toward a shared goal, iterating together. The sprint assumes a common goal. The standup assumes that knowing what others are doing changes your work. The retro assumes a shared process to inspect. Planning assumes collective prioritization decisions.</p>
<p>These are all tools that assume interdependence. Without it, they lose their point.</p>
<p>Now look at the typical “team” setup in many software organizations. People are assigned — often rotated — to a “team” that is really an organizational container. Everyone works on their own user story, with interactions limited to code review and the occasional Slack question. The “sprint goal” is the sum of individual stories. Interdependence is minimal or absent.</p>
<p>Apply team tools to this structure and the result is predictable. The standup becomes a round of updates nobody listens to. The retro produces generic complaints. Planning becomes bureaucracy. Not because Scrum is bureaucracy — but because you’re using it on a structure it wasn’t designed for.</p>
<p>The Manifesto signatories say as much, in different words. Jeffries talks about “Dark Scrum” — organizations using Agile rituals as control mechanisms, draining them of collaboration. Thomas says “Agile” was turned into a commercial noun when it was meant to be an adjective describing a way of working. Their critique is legitimate. But the diagnosis they offer — “organizations have corrupted Agile” — is incomplete. Hackman’s framework provides a more structural one: many organizations haven’t corrupted Agile. They’ve applied it to structures that aren’t teams.</p>
<p>I bring this up because I’ve seen it happen more times than I’d like to admit, including in contexts with competent people and good intentions. The standard response to the malaise was always the same: change the facilitator, try a different retro format, add a ceremony. Hackman’s framework gave me the vocabulary for what I felt but couldn’t articulate: the problem wasn’t the execution. It was the premise.</p>
<p>This changes the diagnosis and the solutions. If the problem is “Agile has become bureaucracy,” the answer is less process, fewer rituals, more autonomy. Sometimes that’s right. But if the problem is a structural mismatch, the answer is different: either transform the structure into a real team — with the costs that entails — or accept that you have a working group and adopt tools consistent with that reality. Both options are legitimate. What doesn’t work is staying in the middle.</p>
<hr />
<p>Back to the twelve-minute standup we started with. Five people, five updates, no interaction. The obvious diagnosis is that the standup is poorly facilitated, or that Scrum doesn’t work, or that the team needs to “work on communication.” These are all responses that address the symptom.</p>
<p>Hackman’s question is different, and it comes first: do those five people need to talk to each other every morning to do their work? If each person’s work doesn’t depend on the others’, the standup isn’t poorly facilitated. It’s useless by design. The retro doesn’t produce actions because there’s no shared process to improve. Planning is bureaucracy because there are no collective decisions to make.</p>
<p>It’s not a question of execution. It’s a question of structure.</p>
<p>Next time you think “Agile doesn’t work,” try reframing. Don’t ask how to improve the process. Ask whether what you’re leading is a team or a working group. The answer isn’t a judgment. It’s a diagnosis. And from the diagnosis, everything else follows.</p>
<hr />
<h2>Sources</h2>
<ul>
<li>Hackman, J.R. (2002). <em>Leading Teams: Setting the Stage for Great Performances</em>. Harvard Business School Press.</li>
<li>Hackman, J.R. (2011). <em>Collaborative Intelligence: Using Teams to Solve Hard Problems</em>. Berrett-Koehler.</li>
<li>Wageman, R. (2001). <a href="https://doi.org/10.1287/orsc.12.5.559.10094">How Leaders Foster Self-Managing Team Effectiveness: Design Choices Versus Hands-on Coaching</a>. <em>Organization Science</em>, 12(5), 559-577.</li>
<li>Wageman, R., Hackman, J.R. &amp; Lehman, E.V. (2005). <a href="https://doi.org/10.1177/0021886305281984">Team Diagnostic Survey: Development of an Instrument</a>. <em>Journal of Applied Behavioral Science</em>, 41(4), 373-398.</li>
<li><a href="https://digital.ai/">Digital.ai</a> (2024). <a href="https://digital.ai/resource/state-of-agile-report/">17th State of Agile Report</a>.</li>
<li>Jeffries, R. (2018). <a href="https://ronjeffries.com/articles/018-01ff/abandon-1/">Developers Should Abandon Agile</a>.</li>
<li>Thomas, D. (2014). <a href="https://pragdave.me/thoughts/active/2014-03-04-time-to-kill-agile.html">Agile is Dead (Long Live Agility)</a>.</li>
</ul>
]]></content:encoded>
      <category>Methodology</category>
      <category>Business</category>
      <category>Team Design</category>
      <category>Agile</category>
      <category>Hackman</category>
      <category>Team vs Working Group</category>
      <category>Scrum</category>
      <category>60-30-10</category>
      <atom:link rel="alternate" hreflang="it" href="https://ireneburresi.dev/blog/methodology/hackman-real-team-software/"/>
    </item>
    <item>
      <title>Who Will the Senior Engineers of Tomorrow Be?</title>
      <link>https://ireneburresi.dev/en/blog/business/senior-domani/</link>
      <guid isPermaLink="true">https://ireneburresi.dev/en/blog/business/senior-domani/</guid>
      <pubDate>Mon, 06 Jan 2025 00:00:00 GMT</pubDate>
      <dc:creator>Irene Burresi</dc:creator>
      <dc:language>en</dc:language>
      <description><![CDATA[<p>Employment for developers under 25 dropped 20% since ChatGPT's launch. Companies hire fewer juniors because AI does those tasks. But without junior developers today, who will lead teams in ten years?</p>]]></description>
      <content:encoded><![CDATA[<h2>The apprenticeship paradox</h2>
<p><em>Companies don’t hire juniors because AI does those tasks better. But without junior developers today, who will lead teams in ten years?</em></p>
<p>There’s a question that rarely appears in quarterly reports: if we stop hiring people who are learning, who will know how to do this job a decade from now?</p>
<p>The numbers tell a story that should concern anyone managing technical teams. Employment for software developers between ages 22 and 25 has dropped 20% from the peak in late 2022, according to a <a href="https://digitaleconomy.stanford.edu/publications/canaries-in-the-coal-mine/">paper from Stanford’s Digital Economy Lab</a> based on payroll data from 25 million workers. It’s not a uniform decline: in the same period, employment for those over 35 in the same roles grew 8%.</p>
<p>The mechanism is what we might call the apprenticeship paradox: companies stop hiring entry-level because AI does those tasks better than a recent graduate. But without entry-level workers today, they won’t have senior engineers tomorrow.</p>
<hr />
<h2>The collapse, in numbers</h2>
<p>The contraction is not an impression. It’s documented by multiple independent sources.</p>
<p>Entry-level hiring at the top 15 tech companies dropped 25% between 2023 and 2024, according to <a href="https://spectrum.ieee.org/ai-effect-entry-level-jobs">SignalFire</a>. Since 2021, the average age of technical hires has increased by three years. Companies aren’t just hiring less: they’re hiring differently, preferring senior profiles who can be productive from day one.</p>
<p>Tech internships have collapsed 30% since 2023, <a href="https://stackoverflow.blog/2025/12/26/ai-vs-gen-z">according to Handshake</a>. Meanwhile, applications have increased 7%. More people are competing for fewer positions, and the positions that remain increasingly require prior experience.</p>
<p>A <a href="https://www.finalroundai.com/blog/ai-is-making-it-harder-for-junior-developers-to-get-hired">Harvard study</a> of 285,000 American companies found that when firms adopt generative AI, junior employment drops 9-10% within six quarters. Senior employment remains stable. These aren’t mass layoffs: it’s a silent hiring freeze. Companies simply stop opening entry-level positions.</p>
<p>The pattern repeats in Europe. Junior tech positions have <a href="https://restofworld.org/2025/engineering-graduates-ai-job-losses/">dropped 35%</a> in major EU countries during 2024, based on aggregated data from LinkedIn, Indeed, and Eures. In the UK, the Big Four consulting firms cut graduate hiring between 6% and 29% in two years. In India, IT companies have reduced entry-level roles by 20-25%, according to an EY report.</p>
<p>The World Economic Forum, in its Future of Jobs Report 2025, warns that 40% of employers expect to reduce staffing where AI can automate tasks. And automatable tasks are, almost by definition, the ones junior developers used to do.</p>
<hr />
<h2>The logic of the short term</h2>
<p>The rationale behind these choices is understandable. A senior engineer with AI tools can do what previously required two or three juniors, at least for certain tasks. GitHub Copilot, Cursor, and similar tools promise productivity gains of 20-50% according to their vendors. For a CFO looking at the next quarter, hiring a junior who will need six months of training before being productive seems like a difficult investment to justify.</p>
<p>James O’Brien, a computer science professor at Berkeley who works with startups, <a href="https://sfstandard.com/2025/05/20/silicon-valley-white-collar-recession-entry-level/">describes the shift</a>: “Previously, startups would hire one senior person and two or three early-career coders to assist. Now they ask: why hire a recent graduate when AI is cheaper and faster?”</p>
<p>It’s a reasonable question in the short term. Code generated by AI isn’t top quality, but neither is code written by a recent graduate. The difference, O’Brien notes, is that the iterative process to improve AI code takes minutes. A junior might take days for the same task.</p>
<p>Heather Doshay, head of talent at SignalFire, sums it up: “Nobody has the patience or time for hand-holding in this new environment, where much of the work can be done autonomously by AI.”</p>
<hr />
<h2>The problem nobody calculates</h2>
<p>There’s a flaw in this logic, and it’s called the talent pipeline.</p>
<p>Matt Garman, CEO of AWS, <a href="https://www.finalroundai.com/blog/aws-ceo-ai-cannot-replace-junior-developers">said it explicitly</a>: “If you don’t have a talent pipeline you’re building, if you don’t have junior people you’re mentoring and growing in the company, we often find that’s where the best ideas come from. If a company stops hiring juniors and developing them, eventually the whole system falls apart.”</p>
<p>It’s not rhetoric. It’s demographic mathematics applied to organizations. Every senior engineer, every tech lead, every CTO was once a junior. The path from recent graduate to technical leader requires years of experience on real projects, mistakes made and corrected, feedback received, patterns internalized. There is no shortcut.</p>
<p>If the industry stopped hiring juniors in 2023, by 2033 it will have a structural shortage of mid-level talent. By 2038, it will be short of senior engineers. By 2043, there will be no one left to promote into technical leadership roles.</p>
<p>The problem is that this cost doesn’t appear in any quarterly balance sheet. It’s an invisible debt that accumulates silently, and when it becomes obvious, it will be too late to remedy quickly.</p>
<hr />
<h2>AI that teaches and AI that atrophies</h2>
<p>There’s a further irony in this situation. The same AI tools that are eliminating junior roles could, in theory, accelerate learning. An AI tutor available 24/7, patient, answering every question: it sounds like every student’s dream.</p>
<p>The reality is more complicated.</p>
<p>An experiment conducted by Wharton and Penn researchers on nearly a thousand high school math students tested two versions of a GPT-4-based tutor. The group with access to a ChatGPT-like interface (GPT Base) achieved 48% better results during assisted practice sessions. The group with a tutor designed to guide without giving direct answers (GPT Tutor) achieved 127% better results.</p>
<p>But here’s the point: when AI was removed and students took the exam on their own, the GPT Base group achieved 17% worse results than the control group who never used AI. The GPT Tutor group, by contrast, achieved results similar to control.</p>
<p>Students were using AI as a crutch. They performed better with assistance but learned less. When the assistance was removed, they found themselves worse off than those who never had it.</p>
<p>A <a href="https://time.com/7295195/ai-chatgpt-google-learning-school/">study from MIT Media Lab</a> documented what researchers call “cognitive debt”: using LLMs for writing seems to reduce mental effort during the task, but at the cost of more superficial learning. Researcher Nataliya Kosmyna expressed concern about developing brains: “Developing brains are the ones at highest risk.”</p>
<p>It doesn’t mean AI can’t help learning. The Wharton study shows it can, if designed with the right safeguards. But “wild” AI, the kind that gives answers instead of guiding toward answers, can do damage.</p>
<hr />
<h2>The new junior profile</h2>
<p>If fewer juniors are going to be hired, what will it take to be among them?</p>
<p>Market signals are clear. It’s no longer enough to know how to code. Employers <a href="https://spectrum.ieee.org/ai-effect-entry-level-jobs">expect</a> recent graduates to be able to manage projects, communicate with clients, understand the software development lifecycle. The “grunt work” that once served as a training ground is being automated. Those entering must be operational at a higher level almost from day one.</p>
<p>Jamie Grant, who manages career services for engineering at the University of Pennsylvania, describes the change: “They’re not necessarily just programming. There’s much more high-level thinking and understanding of the software development lifecycle.”</p>
<p>David Malan of Harvard, who teaches the world’s most-followed introduction to programming course, notes that the biggest impact of AI has been on programmers, not on the roles people expected it to hit (like call centers). The reason: programming work is relatively solitary and highly structured, perfect for automation.</p>
<p>But Malan also notes something interesting: in the United States, employment for “programmers” dropped 27.5% between 2023 and 2025, but employment for “software developers,” a more design-oriented position, dropped only 0.3%. The difference is in the level of abstraction. Those who write code are vulnerable. Those who design systems less so.</p>
<hr />
<h2>Three scenarios for the future</h2>
<p><strong>Scenario 1: The collapse of the pipeline</strong></p>
<p>Companies continue not to hire juniors. In five to ten years, the shortage of mid-level talent becomes acute. The remaining senior engineers command astronomical salaries. Companies that can’t afford them lose competitiveness. The industry polarizes between a few giants who can attract talent and everyone else struggling.</p>
<p><strong>Scenario 2: Apprenticeship reinvented</strong></p>
<p>Some companies realize the problem is coming and invest against the trend. They create intensive training programs, perhaps assisted by AI designed to teach instead of replace. They become the preferred employers for top talent, who know they can grow there. In the long term, they have a competitive advantage.</p>
<p><strong>Scenario 3: Uneven democratization</strong></p>
<p>AI lowers the barrier to entry for some skills (writing working code) but raises it for others (designing systems, debugging complex problems, managing AI itself). Those with access to quality training and mentorship can skip some steps. Those without remain stuck. Inequality of opportunity increases.</p>
<p>None of these scenarios is inevitable. They are possibilities that depend on choices companies, educational institutions, and policymakers will make in the coming years.</p>
<hr />
<h2>What those who hire can do</h2>
<p>If you manage a team or influence hiring decisions, some questions deserve reflection.</p>
<p><strong>Are you optimizing for the next quarter or the next ten years?</strong> A junior costs more in the short term. But the alternative is to depend entirely on the external market for talent, competing with everyone else who made the same choice.</p>
<p><strong>Is your team still teaching?</strong> If senior people spend all their time producing and no one teaching, you’re consuming human capital without regenerating it.</p>
<p><strong>How do you use AI in training?</strong> If your juniors use Copilot to get answers instead of learning to find them, you’re accelerating their short-term productivity while compromising their long-term growth.</p>
<p><strong>Are you hiring for today’s skills or tomorrow’s adaptability?</strong> Specific technical skills have an increasingly short half-life. The ability to learn, to reason about new problems, to work with people—those last.</p>
<hr />
<h2>What those starting out can do</h2>
<p>If you’re early in your career in a market that seems to close doors on you, some principles can help.</p>
<p>AI isn’t eliminating all junior work. It’s eliminating repetitive, isolated junior work. The roles that survive require human interaction, judgment about ambiguous problems, creativity applied to specific contexts. Look for those.</p>
<p>Learn to use AI as a tool, not a crutch. The difference between using ChatGPT to get answers and using it to explore problems is the difference between atrophying and growing.</p>
<p>Networking matters more than ever. If junior positions are scarce, competition is fierce, and often the person with a connection wins, not the person with the best CV. It’s not fair, but it’s real.</p>
<p>Cross-functional skills are not optional. Communication, project management, understanding the business: these are things AI can’t do and employers seek even in technical profiles.</p>
<hr />
<h2>The unanswered question</h2>
<p>I return to the initial question: who will the senior engineers of tomorrow be?</p>
<p>I don’t have a certain answer. No one does. We’re conducting a real-time experiment, without a control group, on a global scale.</p>
<p>What I know is that every senior person I know was once a junior who someone decided to hire and train. Every tech lead made beginner mistakes that someone had the patience to correct. Every systems architect wrote embarrassing code before writing elegant code.</p>
<p>If we eliminate that phase, if we treat it as a cost to cut rather than an investment to protect, we’re not optimizing. We’re consuming capital that we don’t know how to regenerate.</p>
<p>The question isn’t whether AI can replace juniors. It can, for many tasks. The question is whether we want an industry that only knows how to consume skills or one that also knows how to produce them.</p>
<p>For now, the numbers suggest we’ve chosen the first option. The bill will come. Not next quarter. But it will come.</p>
<hr />
<h2>Sources</h2>
<p>Brynjolfsson, E., Chandar, B., &amp; Chen, R. (2025). <a href="https://digitaleconomy.stanford.edu/publications/canaries-in-the-coal-mine/"><em>Canaries in the Coal Mine: Six Facts about the Recent Employment Effects of AI</em></a>. Stanford Digital Economy Lab.</p>
<p>Bastani, H., Bastani, O., Sungu, A., Ge, H., Kabakcı, Ö., &amp; Mariman, R. (2024). <em>Generative AI Can Harm Learning</em>. The Wharton School Research Paper.</p>
<p>Stack Overflow. (2025, December). <a href="https://stackoverflow.blog/2025/12/26/ai-vs-gen-z"><em>AI vs Gen Z: How AI has changed the career pathway for junior developers</em></a>. Stack Overflow Blog.</p>
<p>IEEE Spectrum. (2025, December). <a href="https://spectrum.ieee.org/ai-effect-entry-level-jobs"><em>AI Shifts Expectations for Entry Level Jobs</em></a>.</p>
<p>Rest of World. (2025, December). <a href="https://restofworld.org/2025/engineering-graduates-ai-job-losses/"><em>AI is wiping out entry-level tech jobs, leaving graduates stranded</em></a>.</p>
<p>Kosmyna, N., et al. (2025). <a href="https://arxiv.org/abs/2506.08872"><em>Your Brain on ChatGPT: Accumulation of Cognitive Debt when Using an AI Assistant for Essay Writing Task</em></a>. arXiv.</p>
<p>World Economic Forum. (2025). <a href="https://www.weforum.org/publications/the-future-of-jobs-report-2025/"><em>Future of Jobs Report 2025</em></a>.</p>
<p>FinalRound AI. (2025). <a href="https://www.finalroundai.com/blog/aws-ceo-ai-cannot-replace-junior-developers"><em>AWS CEO Shares 3 Solid Reasons Why Companies Shouldn’t Replace Juniors with AI Agents</em></a>.</p>
]]></content:encoded>
      <category>Business</category>
      <category>Methodology</category>
      <category>Junior Developers</category>
      <category>AI Impact</category>
      <category>Talent Pipeline</category>
      <category>Future of Work</category>
      <category>Learning</category>
      <category>Career</category>
      <atom:link rel="alternate" hreflang="it" href="https://ireneburresi.dev/blog/business/senior-domani/"/>
    </item>
    <item>
      <title>From SEO to GEO: Technical Guide to AI Search Optimization</title>
      <link>https://ireneburresi.dev/en/blog/engineering/seo-vs-geo/</link>
      <guid isPermaLink="true">https://ireneburresi.dev/en/blog/engineering/seo-vs-geo/</guid>
      <pubDate>Sat, 04 Jan 2025 00:00:00 GMT</pubDate>
      <dc:creator>Irene Burresi</dc:creator>
      <dc:language>en</dc:language>
      <description><![CDATA[<p>robots.txt, llms.txt, structured data with E-E-A-T, answer blocks, multimodal: complete technical infrastructure to position yourself in generative search engines. Princeton paper, Cloudflare 2025 data, concrete implementations.</p>]]></description>
      <content:encoded><![CDATA[<h2>The paradigm shift: from links to citations</h2>
<p><em>GPTBot went from 5% to 30% of crawler traffic in one year. Traffic generated by user queries to AI has grown 15 times. Traditional SEO infrastructure no longer intercepts this flow.</em></p>
<p><strong>TL;DR:</strong> Optimization for generative search engines (GEO) requires specific technical interventions: configure robots.txt for 20+ AI crawlers, implement llms.txt to guide LLMs toward priority content, extend structured data with JSON-LD including Person schema with complete E-E-A-T (73% higher selection rate). Structure content in answer blocks of 134-167 words to facilitate extraction. Multimodal content has a +156% selection rate. Princeton research shows that adding citations from authoritative sources increases visibility by up to 40%. Those who implement now build a competitive advantage that latecomers will find difficult to close.</p>
<hr />
<p>Traditional SEO optimizes for a specific goal: ranking in the sorted lists returned by search engines. The user searches, receives ten blue links, clicks. Traffic arrives.</p>
<p>Generative search engines work differently. ChatGPT, Perplexity, Gemini, Claude don’t return lists of links. They synthesize answers by drawing from multiple sources, citing (or not) the source. The user gets an answer, not a list of options.</p>
<p>According to <a href="https://blog.cloudflare.com/from-googlebot-to-gptbot-whos-crawling-your-site-in-2025/">Cloudflare data from December 2025</a>, GPTBot reached 30% of AI crawler traffic, up from 5% the previous year. Meta-ExternalAgent entered at 19%. ChatGPT-User, the bot that accesses web pages when users ask questions, registered growth of 2,825%. Traffic related to user queries increased 15 times over the course of the year.</p>
<p>This is not a marginal change. It’s a new acquisition channel that requires dedicated infrastructure.</p>
<hr />
<h2>robots.txt: configuration for AI crawlers</h2>
<p>The robots.txt file communicates to crawlers which parts of the site they can access. For traditional search engines, the configuration is established. For AI crawlers, the landscape is fragmented: each provider uses different user-agents, with different purposes.</p>
<h3>Map of major AI crawlers</h3>
<p><strong>OpenAI</strong> operates with three distinct crawlers:</p>
<pre><code>User-agent: GPTBot
# Training foundational models. Collects data to train GPT.

User-agent: ChatGPT-User
# User browsing. Accesses pages when a user asks for information.

User-agent: OAI-SearchBot
# Search. Indexes content for ChatGPT's search function.
</code></pre>
<p><strong>Anthropic</strong> uses:</p>
<pre><code>User-agent: ClaudeBot
# Training and updating Claude.

User-agent: Claude-Web
# Web access for user functionality.

User-agent: anthropic-ai
# Generic Anthropic crawler.
</code></pre>
<p><strong>Perplexity</strong>:</p>
<pre><code>User-agent: PerplexityBot
# Indexing for AI answer engine.

User-agent: Perplexity-User
# Fetch for user queries.
</code></pre>
<p><strong>Google</strong> has separated functions:</p>
<pre><code>User-agent: Google-Extended
# Token for AI use. NOT a bot, it's a flag.
# Blocking this user-agent prevents use of content for AI training
# while maintaining standard indexing.

User-agent: Googlebot
# Traditional crawler for Search.
</code></pre>
<p><strong>Meta</strong>:</p>
<pre><code>User-agent: Meta-ExternalAgent
# Crawling for AI model training.

User-agent: Meta-ExternalFetcher
# Fetch for user requests. Can bypass robots.txt.
</code></pre>
<p><strong>Other relevant crawlers</strong>:</p>
<pre><code>User-agent: Amazonbot
User-agent: Bytespider      # ByteDance
User-agent: Applebot-Extended  # Apple AI (flag, not bot)
User-agent: CCBot           # Common Crawl
User-agent: cohere-ai
User-agent: cohere-training-data-crawler
</code></pre>
<h3>Configuration strategies</h3>
<p><strong>Strategy 1: Full access for maximum AI visibility</strong></p>
<pre><code># Allow all AI crawlers
User-agent: GPTBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: anthropic-ai
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: Meta-ExternalAgent
Allow: /

User-agent: Amazonbot
Allow: /
</code></pre>
<p><strong>Strategy 2: AI search visibility, no training</strong></p>
<p>This configuration allows AI systems to cite your content in responses, but prevents use for training models:</p>
<pre><code># Allow search/user crawlers
User-agent: ChatGPT-User
Allow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Perplexity-User
Allow: /

# Block training crawlers
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: Meta-ExternalAgent
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: cohere-training-data-crawler
Disallow: /
</code></pre>
<p><strong>Strategy 3: Selective access by directory</strong></p>
<pre><code>User-agent: GPTBot
Allow: /blog/
Allow: /docs/
Disallow: /api/
Disallow: /internal/
Disallow: /user-data/
</code></pre>
<h3>Limitations of robots.txt</h3>
<p>A critical point: robots.txt is a voluntary protocol. Crawlers can ignore it.</p>
<p>In August 2025, <a href="https://www.medianama.com/2025/12/223-user-driven-ai-bots-crawling-grows-15x-in-2025-cloudflare-report/">Cloudflare blocked Perplexity bots</a> after documenting protocol violations. In October 2025, Reddit deliberately trapped Perplexity crawlers, demonstrating they bypassed restrictions through third-party tools. Legal action followed.</p>
<p>The operational consequence: robots.txt alone is not enough. For real enforcement, you need IP verification, WAF rules, or CDN-level blocks. Cloudflare reports that over 2.5 million sites use its managed robots.txt function to block AI training.</p>
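<p>Since user-agent strings are trivially spoofed, enforcement means verifying the source IP. A minimal Python sketch of the reverse-DNS pattern documented for Googlebot (resolve the IP to a hostname, check the domain, then resolve forward to confirm); the suffixes below are illustrative assumptions, and some providers, OpenAI included, publish IP ranges to check against instead:</p>
<pre><code class="language-python">import socket

# Verify a claimed crawler: its IP should reverse-resolve to a hostname
# under the provider's domain, and that hostname should forward-resolve
# back to the same IP. Suffixes below are illustrative, not exhaustive.
TRUSTED_SUFFIXES = {
    "Googlebot": (".googlebot.com", ".google.com"),
    "GPTBot": (".openai.com",),  # assumption: prefer OpenAI's published IP ranges
}

def verify_crawler(ip, claimed_agent):
    suffixes = TRUSTED_SUFFIXES.get(claimed_agent)
    if not suffixes:
        return False
    try:
        hostname, _aliases, _ips = socket.gethostbyaddr(ip)   # reverse lookup
        if not hostname.endswith(suffixes):
            return False
        return ip in socket.gethostbyname_ex(hostname)[2]     # forward confirmation
    except OSError:
        return False

print(verify_crawler("66.249.66.1", "Googlebot"))
</code></pre>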
<hr />
<h2>llms.txt: the new standard for guiding LLMs</h2>
<p>In September 2024, <a href="https://llmstxt.org/">Jeremy Howard of Answer AI proposed llms.txt</a>, a new standard file for communicating with Large Language Models. Unlike robots.txt, which controls access, llms.txt guides models toward the most relevant content.</p>
<h3>What llms.txt does</h3>
<p>The llms.txt file is a markdown document positioned at the domain root (<code>/llms.txt</code>). It works as a curated map that tells LLMs which pages contain the most important information and how to interpret them.</p>
<p>It’s not a blocking mechanism. It’s a recommendation system, like a librarian guiding a visitor to the right shelves instead of letting them wander.</p>
<h3>File structure</h3>
<pre><code class="language-markdown"># example.com

&gt; Technical site on AI implementations for enterprise.
&gt; Content verified, updated monthly.

## Core Documentation

- [Production RAG Guide](https://example.com/docs/rag-production):
  RAG architectures tested in production, chunking patterns,
  evaluation metrics. Updated Q4 2024.

- [API Reference](https://example.com/docs/api):
  Complete REST API documentation. Includes code examples
  in Python and cURL.

## Technical Articles

- [LLM Latency Optimization](https://example.com/blog/llm-latency):
  Strategies to reduce p95 latency below 200ms.
  Includes benchmarks on Claude, GPT-4, Mistral.

- [AI Cost Management](https://example.com/blog/ai-costs):
  Framework for estimating and optimizing inference costs.
  Real data from enterprise deployments.

## Resources

- [AI Glossary](https://example.com/glossary):
  Technical definitions of 150+ AI/ML terms.
</code></pre>
<h3>llms-full.txt: extended version</h3>
<p>Beyond llms.txt, the standard also defines an optional <code>llms-full.txt</code> file containing the full site content in flattened form: non-essential HTML, CSS, and JavaScript are stripped, leaving text only. Some sites generate files of 100K+ words.</p>
<p>The advantage: allows LLMs to process the entire site in a single context. The limitation: easily exceeds the context window of most models.</p>
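<p>A back-of-envelope check of that limitation, assuming the common heuristic of roughly 0.75 words per token (real tokenizers vary by model and language):</p>
<pre><code class="language-python"># Rough token estimate for a 100K-word llms-full.txt.
words = 100_000
tokens = round(words / 0.75)  # ~1.33 tokens per word, a rough heuristic
print(tokens)  # ~133,000 tokens: at or beyond many models' context windows
</code></pre>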
<h3>Adoption status</h3>
<p>As of January 2025, <a href="https://www.ovrdrv.com/insights/llms-txt-the-new-standard-for-ai-crawling">OpenAI, Google, and Anthropic do not natively support llms.txt</a>. Their crawlers don’t automatically read the file.</p>
<p>Current adoption is concentrated in specific niches:</p>
<ul>
<li><strong>Technical documentation</strong>: Mintlify integrated llms.txt in November 2024. Documentation sites for Anthropic, Cursor, Cloudflare, Vercel use it.</li>
<li><strong>Dedicated directories</strong>: <a href="https://directory.llmstxt.cloud">directory.llmstxt.cloud</a> and <a href="https://llmstxt.site">llmstxt.site</a> catalog sites with implementations.</li>
<li><strong>Manual use</strong>: Developers who upload the file directly to ChatGPT or Claude to provide context.</li>
</ul>
<p>It’s an investment in future-proofing. When major providers adopt the standard, those who have already implemented it will have a head start.</p>
<h3>Implementation</h3>
<ol>
<li>Create <code>/llms.txt</code> at the domain root</li>
<li>UTF-8 format, clean markdown</li>
<li>Include only indexable pages (no noindex, no blocked in robots.txt)</li>
<li>Add concise but informative descriptions for each URL</li>
<li>Optional: reference in robots.txt with <code># LLM-policy: /llms.txt</code></li>
</ol>
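<p>Keeping the file generated rather than hand-edited makes quarterly reviews easier. A generation sketch in Python for the steps above, where every title, URL, and description is a placeholder to replace with your own pages:</p>
<pre><code class="language-python"># Sketch: build /llms.txt from a curated page list. All entries are placeholders.
SECTIONS = {
    "Core Documentation": [
        ("Production RAG Guide", "https://example.com/docs/rag-production",
         "RAG architectures tested in production, chunking patterns, metrics."),
    ],
    "Technical Articles": [
        ("LLM Latency Optimization", "https://example.com/blog/llm-latency",
         "Strategies to reduce p95 latency below 200ms."),
    ],
}

def build_llms_txt(site, blurb):
    lines = [f"# {site}", "", f"&gt; {blurb}", ""]
    for section, entries in SECTIONS.items():
        lines += [f"## {section}", ""]
        for title, url, desc in entries:
            lines.append(f"- [{title}]({url}): {desc}")
        lines.append("")
    return "\n".join(lines)

with open("llms.txt", "w", encoding="utf-8") as fh:
    fh.write(build_llms_txt("example.com", "Technical site on enterprise AI."))
</code></pre>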
<h3>Differences with other standard files</h3>
<table>
  <caption>Comparison of standard files for web and AI crawlers</caption>
  <thead>
    <tr>
      <th>File</th>
      <th>Purpose</th>
      <th>Target</th>
      <th>Format</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>robots.txt</td>
      <td>Crawler access control</td>
      <td>Search engines, AI crawlers</td>
      <td>Plain text, directives</td>
    </tr>
    <tr>
      <td>sitemap.xml</td>
      <td>Complete page catalog</td>
      <td>Search engines</td>
      <td>XML</td>
    </tr>
    <tr>
      <td>llms.txt</td>
      <td>Curated priority content map</td>
      <td>LLM</td>
      <td>Markdown</td>
    </tr>
    <tr>
      <td>humans.txt</td>
      <td>Team credits</td>
      <td>Humans</td>
      <td>Plain text</td>
    </tr>
  </tbody>
</table>
<hr />
<h2>Structured Data and JSON-LD for AI</h2>
<p>Structured data is not new. It’s been standard SEO since 2011. But its role changes in the context of generative search engines.</p>
<h3>Why Structured Data matters for AI</h3>
<p>LLMs process everything as tokens. They don’t natively distinguish between a price, a name, a date. Structured data provides an explicit semantic layer that disambiguates content.</p>
<p>An article with JSON-LD markup communicates in a machine-readable way: this is the author, this is the publication date, this is the publishing organization, these are the sources cited. The model doesn’t have to infer this structure from the text.</p>
<h3>Basic JSON-LD implementation</h3>
<p>JSON-LD (JavaScript Object Notation for Linked Data) is the preferred format. It’s inserted in a <code>&lt;script&gt;</code> tag without mixing with content HTML:</p>
<pre><code class="language-html">&lt;script type="application/ld+json"&gt;
{
  "@context": "https://schema.org",
  "@type": "TechArticle",
  "@id": "https://example.com/rag-production-guide",
  "headline": "RAG in Production: Patterns and Anti-Patterns",
  "description": "Technical guide to enterprise RAG implementation with real metrics",
  "author": {
    "@type": "Person",
    "name": "Author Name",
    "url": "https://example.com/team/author-name",
    "jobTitle": "AI Team Leader",
    "knowsAbout": ["RAG", "LLM", "Vector Databases", "AI Engineering"]
  },
  "datePublished": "2025-01-04",
  "dateModified": "2025-01-04",
  "publisher": {
    "@type": "Organization",
    "name": "Example.com",
    "url": "https://example.com",
    "logo": {
      "@type": "ImageObject",
      "url": "https://example.com/logo.png"
    }
  },
  "mainEntityOfPage": {
    "@type": "WebPage",
    "@id": "https://example.com/rag-production-guide"
  },
  "articleSection": "Engineering",
  "keywords": ["RAG", "Production", "Enterprise AI", "Vector Search"],
  "wordCount": 3500,
  "inLanguage": "en"
}
&lt;/script&gt;
</code></pre>
<h3>Priority schema types for AI visibility</h3>
<p><strong>Article / TechArticle / NewsArticle</strong></p>
<p>For editorial content. TechArticle for technical documentation.</p>
<p><strong>FAQPage</strong></p>
<p>Q&amp;A structure that generative search engines can extract directly:</p>
<pre><code class="language-html">&lt;script type="application/ld+json"&gt;
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "What is the difference between SEO and GEO?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "SEO optimizes for lists of results from traditional search engines. GEO optimizes for being cited in synthesized responses from generative search engines like ChatGPT and Perplexity."
      }
    },
    {
      "@type": "Question",
      "name": "Does llms.txt replace robots.txt?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "No. robots.txt controls crawler access. llms.txt guides LLMs toward priority content. They have complementary functions."
      }
    }
  ]
}
&lt;/script&gt;
</code></pre>
<p><strong>HowTo</strong></p>
<p>For step-by-step guides:</p>
<pre><code class="language-html">&lt;script type="application/ld+json"&gt;
{
  "@context": "https://schema.org",
  "@type": "HowTo",
  "name": "How to configure robots.txt for AI crawlers",
  "step": [
    {
      "@type": "HowToStep",
      "position": 1,
      "name": "Identify target AI crawlers",
      "text": "Map the user-agents of AI crawlers you want to allow or block."
    },
    {
      "@type": "HowToStep",
      "position": 2,
      "name": "Define access strategy",
      "text": "Decide whether to allow training, search-only, or block completely."
    },
    {
      "@type": "HowToStep",
      "position": 3,
      "name": "Implement directives",
      "text": "Add User-agent and Allow/Disallow rules to your robots.txt file."
    }
  ]
}
&lt;/script&gt;
</code></pre>
<p><strong>Organization and Person: E-E-A-T for AI</strong></p>
<p>E-E-A-T (Experience, Expertise, Authoritativeness, Trustworthiness) is no longer just a Google framework. Data shows that LLMs verify author credentials before citing: 96% of content in AI Overviews comes from sources with verified authors. Content with detailed author bios has a 73% higher selection probability.</p>
<p>The Person schema must go beyond the name. It needs to communicate credentials, affiliations, specific expertise:</p>
<pre><code class="language-html">&lt;script type="application/ld+json"&gt;
{
  "@context": "https://schema.org",
  "@type": "Person",
  "name": "Mario Rossi",
  "url": "https://example.com/team/mario-rossi",
  "image": "https://example.com/images/mario-rossi.jpg",
  "jobTitle": "Senior AI Engineer",
  "description": "10+ years of experience in ML/AI, specialized in enterprise RAG systems",
  "worksFor": {
    "@type": "Organization",
    "name": "TechCorp Italia",
    "url": "https://techcorp.it"
  },
  "alumniOf": {
    "@type": "CollegeOrUniversity",
    "name": "Politecnico di Milano"
  },
  "hasCredential": [
    {
      "@type": "EducationalOccupationalCredential",
      "credentialCategory": "certification",
      "name": "AWS Machine Learning Specialty"
    },
    {
      "@type": "EducationalOccupationalCredential",
      "credentialCategory": "certification",
      "name": "Google Cloud Professional ML Engineer"
    }
  ],
  "knowsAbout": [
    "Retrieval-Augmented Generation",
    "Large Language Models",
    "Vector Databases",
    "MLOps",
    "AI Engineering"
  ],
  "sameAs": [
    "https://linkedin.com/in/mariorossi",
    "https://github.com/mariorossi",
    "https://scholar.google.com/citations?user=xxx"
  ]
}
&lt;/script&gt;
</code></pre>
<p><strong>E-E-A-T checklist for Person schema:</strong></p>
<ul>
<li><code>description</code> with years of experience and specialization</li>
<li><code>hasCredential</code> for verifiable certifications</li>
<li><code>knowsAbout</code> with specific topics (not generic)</li>
<li><code>sameAs</code> with links to verified profiles (LinkedIn, GitHub, Google Scholar)</li>
<li><code>alumniOf</code> for academic affiliations</li>
<li><code>worksFor</code> with organization URL</li>
</ul>
<h3>Citation schema</h3>
<p>For content that cites external sources, the <code>citation</code> property adds context:</p>
<pre><code class="language-html">&lt;script type="application/ld+json"&gt;
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Analysis of GEO Paper Princeton",
  "citation": [
    {
      "@type": "ScholarlyArticle",
      "name": "GEO: Generative Engine Optimization",
      "author": ["Pranjal Aggarwal", "et al."],
      "datePublished": "2024",
      "publisher": {
        "@type": "Organization",
        "name": "Princeton University"
      },
      "url": "https://arxiv.org/abs/2311.09735"
    }
  ]
}
&lt;/script&gt;
</code></pre>
<h3>ImageObject and VideoObject for multimodal content</h3>
<p>Multimodal content has a 156% higher probability of being selected in AI Overviews than text-only content. Gemini and Perplexity invest heavily in multimodal search. Schema for media becomes relevant:</p>
<pre><code class="language-html">&lt;script type="application/ld+json"&gt;
{
  "@context": "https://schema.org",
  "@type": "ImageObject",
  "contentUrl": "https://example.com/images/architettura-rag.png",
  "name": "Enterprise RAG system architecture",
  "description": "Architectural diagram showing data flow between vector store, retriever and LLM in a production RAG system",
  "author": {
    "@type": "Person",
    "name": "Mario Rossi"
  },
  "datePublished": "2025-01-04",
  "encodingFormat": "image/png"
}
&lt;/script&gt;
</code></pre>
<p>For video with transcription:</p>
<pre><code class="language-html">&lt;script type="application/ld+json"&gt;
{
  "@context": "https://schema.org",
  "@type": "VideoObject",
  "name": "Deploy RAG in production: walkthrough",
  "description": "Video tutorial on deploying a RAG system on AWS with monitoring",
  "thumbnailUrl": "https://example.com/video/rag-deploy-thumb.jpg",
  "uploadDate": "2025-01-04",
  "duration": "PT12M30S",
  "transcript": "https://example.com/video/rag-deploy-transcript.txt",
  "author": {
    "@type": "Person",
    "name": "Mario Rossi"
  }
}
&lt;/script&gt;
</code></pre>
<p><strong>Best practices for AI-friendly media:</strong></p>
<ul>
<li>Descriptive and contextual alt text (not “image1.png”)</li>
<li>Captions that explain the content, not just describe it</li>
<li>Transcriptions for all videos</li>
<li>Captions that contextualize the figure in surrounding text</li>
</ul>
<h3>Real impact on AI search</h3>
<p>John Mueller of Google clarified in January 2025 that structured data is not a direct ranking factor. But the indirect impact is documented:</p>
<ul>
<li>Rich snippets from structured data increase CTR by 30% according to <a href="https://www.brightedge.com/">BrightEdge</a></li>
<li>72% of sites on Google’s first page use schema markup</li>
<li>Google’s AI Overviews process structured data to build responses</li>
</ul>
<p>Structured data doesn’t guarantee citations in generative search engines. But it provides the semantic context that facilitates correct interpretation of content.</p>
<hr />
<h2>Technical requirements for AI crawlers</h2>
<p>Beyond structured data, there are technical requirements that influence LLMs’ ability to process and cite content.</p>
<h3>Static HTML vs JavaScript rendering</h3>
<p>AI crawlers struggle with JavaScript-rendered content. Unlike Googlebot, which executes JS, many AI crawlers prefer or require static HTML.</p>
<p><strong>Operating rules:</strong></p>
<ul>
<li>Critical content must be present in static HTML, not generated dynamically</li>
<li>Avoid content hidden in tabs, accordions, or loaded on-scroll</li>
<li>If you use JS frameworks (React, Vue, Next.js), verify that SSR or SSG produces complete HTML</li>
<li>Test: view the page with JS disabled. What you see is what base AI crawlers see (a scripted version of this check is sketched after the list).</li>
</ul>
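<p>A minimal sketch of that scripted check: fetch the raw HTML without executing JavaScript, roughly what a non-rendering crawler receives, and verify that a key phrase is present (the URL and phrase are placeholders):</p>
<pre><code class="language-python">import urllib.request

# Fetch raw HTML with no JavaScript execution, approximating what a
# non-rendering AI crawler sees. URL and key phrase are placeholders.
URL = "https://example.com/blog/llm-latency"
MUST_CONTAIN = "p95 latency"

req = urllib.request.Request(URL, headers={"User-Agent": "static-html-check/1.0"})
with urllib.request.urlopen(req, timeout=10) as resp:
    html = resp.read().decode("utf-8", errors="replace")

if MUST_CONTAIN in html:
    print("OK: key content present in static HTML")
else:
    print("WARNING: key content missing; it is probably rendered client-side")
</code></pre>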
<h3>Content freshness signals</h3>
<p>23% of content selected in AI Overviews is less than 30 days old. Perplexity indexes daily. Freshness signals are prioritized over historical authority.</p>
<p><strong>Implementation:</strong></p>
<p><code>dateModified</code> in schema must reflect actual updates:</p>
<pre><code class="language-html">&lt;script type="application/ld+json"&gt;
{
  "@context": "https://schema.org",
  "@type": "TechArticle",
  "headline": "Production RAG Guide",
  "datePublished": "2024-06-15",
  "dateModified": "2025-01-04"
}
&lt;/script&gt;
</code></pre>
<p><strong>Freshness checklist:</strong></p>
<ul>
<li>Update <code>dateModified</code> only for substantial changes (not typo fixes)</li>
<li>Prominently signal updates in content (“Updated: January 2025”)</li>
<li>Quarterly review of evergreen content</li>
<li>Update statistics and data at least annually</li>
<li>Remove or mark obsolete content as archived</li>
</ul>
<h3>Citation verification and fact-checking</h3>
<p>AI systems cross-reference claims against authoritative sources in real time. Content with verifiable citations has an 89% higher selection probability than content with unsupported claims.</p>
<p><strong>Rules:</strong></p>
<ul>
<li>Every statistic must have a linked source (a naive automated check is sketched after this list)</li>
<li>“According to research” without a link = unverifiable claim = penalized</li>
<li>Prefer primary sources (papers, official documentation) over secondary sources</li>
<li>Citations from Wikipedia, Statista, Pew Research, arXiv papers carry more weight</li>
</ul>
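<p>That automated check, as a deliberately naive Python sketch: flag sentences that contain a percentage figure but no markdown link (both the sentence splitter and the “statistic” pattern are rough assumptions):</p>
<pre><code class="language-python">import re

# Flag sentences containing a % statistic but no markdown link:
# candidates for "unverifiable claim". Heuristics are deliberately rough.
def flag_unsourced_stats(markdown_text):
    sentences = re.split(r"(?&lt;=[.!?])\s+", markdown_text)
    for sentence in sentences:
        has_stat = re.search(r"\d+(?:\.\d+)?\s*%", sentence)
        has_link = "](http" in sentence
        if has_stat and not has_link:
            print("UNSOURCED:", sentence.strip()[:100])

with open("article.md", encoding="utf-8") as fh:
    flag_unsourced_stats(fh.read())
</code></pre>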
<hr />
<h2>GEO strategies: what the research says</h2>
<p>The <a href="https://arxiv.org/abs/2311.09735">“GEO: Generative Engine Optimization” paper</a> from Princeton, Georgia Tech, Allen Institute, and IIT Delhi is the most rigorous available study on optimization for generative search engines. It tested 9 techniques on 10,000 queries.</p>
<h3>The three most effective strategies</h3>
<p><strong>1. Cite Sources: +40% visibility</strong></p>
<p>Adding citations from authoritative sources is the strategy with the highest overall impact. For sites with low ranking in traditional SERPs, the effect is even more pronounced: +115% for sites in fifth position.</p>
<p>Simply citing is not enough. The citation must come from a recognized, relevant, verifiable source.</p>
<p><strong>2. Quotation Addition</strong></p>
<p>Incorporating direct quotes from industry experts increases authenticity and perceived depth. Works particularly well for opinion-based content.</p>
<p><strong>3. Statistics Addition</strong></p>
<p>Quantitative data beats qualitative discussion. “42% of AI projects fail” has more impact than “many AI projects fail”. Works particularly well for Legal and Government domains.</p>
<h3>Structuring content for extraction: Answer Blocks</h3>
<p>LLMs don’t cite entire pages. They extract specific blocks. Optimizing for this pattern is critical.</p>
<p>Optimal <em>passage length</em>: 134-167 words per citable block; for direct FAQ answers, 40-60 words. Content with a summary box at the top has a 28-40% higher citation probability. (A naive length-audit script is sketched at the end of this section.)</p>
<p><strong>Practical implementation:</strong></p>
<ol>
<li>
<p><strong>TL;DR at the beginning:</strong> Every article opens with a self-contained summary block. It’s not just for human readers: it’s the block that LLMs preferentially extract.</p>
</li>
<li>
<p><strong>Self-contained sections:</strong> Each H2/H3 should be citable independently from the rest. An LLM should be able to extract that section and have a complete answer.</p>
</li>
<li>
<p><strong>Headings as questions:</strong> “What is RAG?” performs better than “RAG Overview”: it matches conversational queries directly.</p>
</li>
<li>
<p><strong>Modular paragraphs:</strong> 75-300 words per section. No wall of text. Modular blocks are easier to extract and cite.</p>
</li>
<li>
<p><strong>Direct answers first, context after:</strong> The answer to the heading’s implicit question should appear in the first 2-3 sentences. Elaboration comes after.</p>
</li>
</ol>
<p><strong>Example of optimized structure:</strong></p>
<pre><code class="language-markdown">## What is the difference between SEO and GEO?

SEO optimizes for ranking in lists of results from traditional
search engines. GEO optimizes for being cited in synthesized
responses from generative search engines like ChatGPT, Perplexity
and Gemini. [40-60 words of direct answer]

The fundamental change concerns the objective: from ranking to
citation. In classical SEO, success is position 1 in the SERPs.
In GEO, success is being the source that the AI cites when responding.
[Elaboration and context]
</code></pre>
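<p>For auditing existing content against these targets, a deliberately naive sketch: split a markdown article on H2 headings and flag blocks outside the 134-167 word range cited above (the thresholds come from the research; treating one H2 section as one citable block is an assumption):</p>
<pre><code class="language-python">import re

# Naive answer-block audit: one H2 section = one citable block.
OPTIMAL = range(134, 168)  # 134-167 words, per the figures cited above

def audit_blocks(markdown_text):
    blocks = re.split(r"^## ", markdown_text, flags=re.MULTILINE)[1:]
    for block in blocks:
        heading, _, body = block.partition("\n")
        words = len(body.split())
        status = "ok" if words in OPTIMAL else "review"
        print(status.ljust(7), str(words).rjust(4), "words ", heading.strip())

with open("article.md", encoding="utf-8") as fh:
    audit_blocks(fh.read())
</code></pre>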
<h3>Domain-specific strategies</h3>
<p>The paper found that effectiveness varies by domain:</p>
<ul>
<li><strong>History</strong>: Authoritative and persuasive tone</li>
<li><strong>Facts</strong>: Citations from primary sources</li>
<li><strong>Law/Government</strong>: Statistics and quantitative data</li>
<li><strong>Science/Health</strong>: Technical terminology + authoritativeness</li>
</ul>
<h3>Platform-specific optimization</h3>
<p>Each LLM has different preferences. An effective GEO strategy considers these differences:</p>
<table>
  <caption>Optimization preferences for generative AI platforms</caption>
  <thead>
    <tr>
      <th>Platform</th>
      <th>Main preferences</th>
      <th>Optimization</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><strong>ChatGPT</strong></td>
      <td>Wikipedia, popular brands, established content</td>
      <td>Authority building, Wikipedia presence if applicable</td>
    </tr>
    <tr>
      <td><strong>Perplexity</strong></td>
      <td>Reddit, recent content, real-time</td>
      <td>Freshness priority, community engagement</td>
    </tr>
    <tr>
      <td><strong>Gemini</strong></td>
      <td>Multimodal, Google ecosystem, schema markup</td>
      <td>Video, optimized images, complete structured data</td>
    </tr>
    <tr>
      <td><strong>Claude</strong></td>
      <td>Accuracy, balanced content, attribution</td>
      <td>Proper attribution, neutral and evidence-based framing</td>
    </tr>
    <tr>
      <td><strong>Google AI Overview</strong></td>
      <td>Top 10 organic, strong E-E-A-T</td>
      <td>Traditional SEO + extended structured data</td>
    </tr>
  </tbody>
</table>
<p><strong>Operational implications:</strong></p>
<ul>
<li>ChatGPT cites Wikipedia in 48% of responses. For topics with a Wikipedia entry, presence there matters.</li>
<li>Perplexity prefers Reddit (46.7% of citations). Content discussed in relevant subreddits has an advantage.</li>
<li>Gemini integrates images and video into responses. Multimodal content performs better.</li>
<li>Claude verifies accuracy more rigorously. Unsupported claims are discarded.</li>
</ul>
<h3>What doesn’t work</h3>
<p><strong>Keyword stuffing</strong>: Adding keywords from the query to content worsens visibility by 10% compared to baseline. Generative search engines penalize over-optimization.</p>
<p><strong>Generic persuasive language</strong>: Persuasive tone without substance doesn’t improve ranking.</p>
<h3>Democratization of results</h3>
<p>An interesting aspect: GEO levels the playing field. Sites with low ranking in traditional SERPs benefit more from GEO optimizations than dominant sites. Cite Sources brings +115% to sites in fifth position and -30% to sites in first position.</p>
<p>For small publishers and independent businesses, it’s an opportunity to compete with corporate giants without comparable SEO budgets.</p>
<hr />
<h2>Implementation checklist</h2>
<h3>robots.txt</h3>
<ul>
<li>[ ] Map all AI crawlers relevant to your industry</li>
<li>[ ] Define strategy: full access, search-only, selective</li>
<li>[ ] Implement directives for each user-agent (see the sketch after this list)</li>
<li>[ ] Verify syntax with Google Search Console’s robots.txt report (the standalone testing tool has been retired)</li>
<li>[ ] Monitor server logs for crawler activity</li>
<li>[ ] Verify actual compliance (IP check for suspicious crawlers)</li>
<li>[ ] Quarterly review: new crawlers emerge regularly</li>
</ul>
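<p>A minimal sketch of the “selective” strategy: allow search and user-action crawlers, block training crawlers. The user-agent tokens are real at the time of writing but change over time; verify them against each vendor’s documentation before deploying.</p>
<pre><code class="language-text"># OpenAI: block training, allow search indexing and user-triggered fetches
User-agent: GPTBot
Disallow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /

# Anthropic
User-agent: ClaudeBot
Disallow: /

# Perplexity (search-oriented)
User-agent: PerplexityBot
Allow: /

# Opt out of Google AI training (control token, not a separate crawler;
# does not affect Google Search indexing)
User-agent: Google-Extended
Disallow: /

# Default for everything else
User-agent: *
Allow: /
</code></pre>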
<h3>llms.txt</h3>
<ul>
<li>[ ] Create markdown file at domain root (see the sketch after this list)</li>
<li>[ ] Include site description and content type</li>
<li>[ ] Organize URLs by category/priority</li>
<li>[ ] Add concise descriptions for each link</li>
<li>[ ] Verify that all URLs are indexable</li>
<li>[ ] Consider llms-full.txt for sites with extended documentation</li>
<li>[ ] Update when new priority content is published</li>
</ul>
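<p>A minimal llms.txt sketch following the llmstxt.org format (H1 title, blockquote summary, link sections with short descriptions; the URLs are placeholders):</p>
<pre><code class="language-markdown"># Example Site

> Independent analyses of search strategy and AI infrastructure.

## Guides

- [SEO vs GEO](https://example.com/blog/seo-vs-geo/): How generative engines change optimization
- [robots.txt for AI crawlers](https://example.com/blog/ai-crawlers/): Per-agent access strategies

## Optional

- [Archive](https://example.com/archive/): Older posts, lower priority
</code></pre>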
<h3>Structured Data / JSON-LD</h3>
<ul>
<li>[ ] Implement Organization schema for the site</li>
<li>[ ] Add Person schema for authors with complete E-E-A-T (see the JSON-LD sketch after this list):
<ul>
<li>[ ] <code>description</code> with years of experience and specialization</li>
<li>[ ] <code>hasCredential</code> for verifiable certifications</li>
<li>[ ] <code>knowsAbout</code> with specific topics</li>
<li>[ ] <code>sameAs</code> with LinkedIn, GitHub, Google Scholar</li>
</ul>
</li>
<li>[ ] Use Article/TechArticle for editorial content</li>
<li>[ ] Implement FAQPage for Q&amp;A sections</li>
<li>[ ] Add Citation schema for research-based content</li>
<li>[ ] Implement ImageObject/VideoObject for media</li>
<li>[ ] Validate with <a href="https://search.google.com/test/rich-results">Google Rich Results Test</a></li>
<li>[ ] Verify parity between markup and visible content</li>
</ul>
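<p>A minimal sketch of the Article + Person pairing, to be embedded in a <code>script type="application/ld+json"</code> tag. The name, credential, and profile URLs are placeholders:</p>
<pre><code class="language-json">{
  "@context": "https://schema.org",
  "@type": "TechArticle",
  "headline": "SEO vs GEO",
  "dateModified": "2026-01-06",
  "author": {
    "@type": "Person",
    "name": "Jane Doe",
    "description": "Search engineer with 10 years of experience in technical SEO and GEO",
    "knowsAbout": ["Generative Engine Optimization", "Structured data"],
    "hasCredential": {
      "@type": "EducationalOccupationalCredential",
      "name": "Example certification"
    },
    "sameAs": [
      "https://www.linkedin.com/in/janedoe",
      "https://github.com/janedoe"
    ]
  }
}
</code></pre>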
<h3>GEO-optimized content</h3>
<ul>
<li>[ ] TL;DR of 40-60 words at the beginning of each article</li>
<li>[ ] Self-contained sections (citable independently)</li>
<li>[ ] Headings formulated as questions where appropriate</li>
<li>[ ] Modular paragraphs: 75-300 words per section</li>
<li>[ ] <em>Passage length</em>: 134-167 words for key blocks</li>
<li>[ ] Include citations from authoritative sources in every article</li>
<li>[ ] Add statistics and quantitative data with source</li>
<li>[ ] Use expert quotations where relevant</li>
<li>[ ] Avoid keyword stuffing</li>
<li>[ ] Calibrate tone for domain</li>
</ul>
<h3>Technical requirements</h3>
<ul>
<li>[ ] Critical content in static HTML (not JS-only rendering)</li>
<li>[ ] No content hidden in tabs/accordions/lazy-load</li>
<li>[ ] Test page with JavaScript disabled (an automated check is sketched after this list)</li>
<li>[ ] <code>dateModified</code> updated for substantial changes</li>
<li>[ ] Signal updates in content (“Updated: Month Year”)</li>
<li>[ ] Quarterly review of evergreen content</li>
<li>[ ] Every statistic with linked source</li>
</ul>
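<p>A quick automated version of the JavaScript-disabled test: fetch the raw HTML and verify that key passages are present before any client-side rendering. URL and passages are placeholders.</p>
<pre><code class="language-python"># Minimal sketch: check that critical content exists in the static HTML,
# i.e., what a non-rendering crawler actually sees.
import urllib.request

URL = "https://example.com/blog/seo-vs-geo/"
KEY_PASSAGES = ["TL;DR", "What is the difference between SEO and GEO?"]

html = urllib.request.urlopen(URL, timeout=10).read().decode("utf-8", errors="replace")
for passage in KEY_PASSAGES:
    status = "present" if passage in html else "MISSING: likely JS-only rendering"
    print(f"{passage!r}: {status}")
</code></pre>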
<h3>Media and Multimodal</h3>
<ul>
<li>[ ] Descriptive and contextual alt text for images (see the sketch after this list)</li>
<li>[ ] Captions that explain and contextualize figures in the surrounding text</li>
<li>[ ] Transcriptions for all videos</li>
<li>[ ] ImageObject/VideoObject schema implemented</li>
</ul>
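<p>A sketch of what “descriptive and contextual” means in markup; the file name and description are illustrative:</p>
<pre><code class="language-html">&lt;figure&gt;
  &lt;img src="/images/geo-pipeline.svg"
       alt="Flowchart: the crawler fetches the page, extracts an answer
            block, and the generative engine cites it in its response" /&gt;
  &lt;figcaption&gt;From answer block to citation: the path GEO optimizes.&lt;/figcaption&gt;
&lt;/figure&gt;
</code></pre>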
<h3>Monitoring</h3>
<ul>
<li>[ ] Track AI crawler activity in server logs (a log-parsing sketch follows this list)</li>
<li>[ ] Monitor brand mentions in ChatGPT/Perplexity/Gemini responses</li>
<li>[ ] Analyze competitor citation share</li>
<li>[ ] Measure referral traffic from AI platforms</li>
<li>[ ] Monthly metrics review</li>
</ul>
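<p>A minimal sketch of the log check: count hits per AI crawler in a standard access log. The log path and user-agent substrings are assumptions; extend the list as new crawlers appear.</p>
<pre><code class="language-python"># Minimal sketch: tally AI-crawler requests from a combined-format access log.
from collections import Counter

AI_CRAWLERS = ["GPTBot", "OAI-SearchBot", "ChatGPT-User",
               "ClaudeBot", "PerplexityBot", "CCBot", "Bytespider"]

counts = Counter()
with open("/var/log/nginx/access.log", encoding="utf-8", errors="replace") as log:
    for line in log:
        for bot in AI_CRAWLERS:
            if bot in line:  # user-agent substring match
                counts[bot] += 1
                break  # count each request line once

for bot, hits in counts.most_common():
    print(f"{bot:15} {hits}")
</code></pre>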
<hr />
<h2>The window of opportunity</h2>
<p>Cloudflare data shows that crawling for AI training still dominates traffic, with volumes 8 times higher than search crawling and 32 times higher than user-action crawling. But the trend is clear: user-action traffic is growing faster than any other category.</p>
<p>Those who implement GEO infrastructure now build advantages that accumulate over time. Citations generate other citations. Authority recognized by models strengthens. First-mover advantage in this space isn’t just about technical positioning: it’s about building an established presence before competition intensifies.</p>
<p>Traditional SEO doesn’t disappear. It still serves the roughly 70% of search traffic that goes through classic SERPs. But the remaining 30%, and its growth trajectory, require new tools.</p>
<hr />
<h2>Sources</h2>
<p>Aggarwal, P., et al. (2024). <a href="https://arxiv.org/abs/2311.09735"><em>GEO: Generative Engine Optimization</em></a>. arXiv:2311.09735. Princeton University, Georgia Tech, Allen Institute for AI, IIT Delhi.</p>
<p>AI Mode Boost. (2025). <a href="https://aimodeboost.com/resources/research/ai-overview-ranking-factors-2025/"><em>AI Overview Ranking Factors: 2025 Comprehensive Study</em></a>.</p>
<p>Cloudflare. (2025, December). <a href="https://blog.cloudflare.com/from-googlebot-to-gptbot-whos-crawling-your-site-in-2025/"><em>From Googlebot to GPTBot: Who’s Crawling Your Site in 2025</em></a>. Cloudflare Blog.</p>
<p>Dataslayer. (2025). <a href="https://www.dataslayer.ai/blog/google-ai-overviews-the-end-of-traditional-ctr-and-how-to-adapt-in-2025"><em>Google AI Overviews Impact 2025: CTR Down 61%</em></a>.</p>
<p>Howard, J. (2024, September). <a href="https://llmstxt.org/"><em>llms.txt Proposal</em></a>. Answer AI.</p>
<p>Overdrive Interactive. (2025, July). <a href="https://www.ovrdrv.com/insights/llms-txt-the-new-standard-for-ai-crawling"><em>LLMs.txt: The New Standard for AI Crawling</em></a>.</p>
<p>Schema.org Community Group. (2024). <a href="https://schema.org/docs/documents.html"><em>Schema Vocabulary Documentation</em></a>.</p>
<p>SEO Sherpa. (2025, October). <a href="https://seosherpa.com/google-ai-search-guidelines/"><em>Google AI Search Guidelines 2025</em></a>.</p>
<p>Single Grain. (2025, October). <a href="https://www.singlegrain.com/search-everywhere-optimization/google-ai-overviews-the-ultimate-guide-to-ranking-in-2025/"><em>Google AI Overviews: The Ultimate Guide to Ranking in 2025</em></a>.</p>
<p>Yoast. (2025). <a href="https://yoast.com/structured-data-schema-ultimate-guide/"><em>Structured Data with Schema for Search and AI</em></a>.</p>
]]></content:encoded>
      <category>Engineering</category>
      <category>Business</category>
      <category>GEO</category>
      <category>SEO</category>
      <category>AI Search</category>
      <category>Schema.org</category>
      <category>robots.txt</category>
      <category>llms.txt</category>
      <atom:link rel="alternate" hreflang="it" href="https://ireneburresi.dev/blog/engineering/seo-vs-geo/"/>
    </item>
  </channel>
</rss>