Why Companies Are Quietly Leaving AI

The Hype Cycle Arrives at Its Trough

Every major technology goes through the same arc. There's the peak of inflated expectations — the moment where the technology is credited with solving everything from climate change to quarterly earnings calls. Then comes the trough of disillusionment: the phase where the gap between expectation and reality becomes impossible to paper over. For enterprise AI, 2025 was the trough.

According to S&P Global data, the share of companies that abandoned most of their AI projects jumped from 17% in 2024 to 42% in 2025, roughly a 2.5× increase in a single year. This isn't a niche trend. It spans financial services, retail, healthcare, logistics, and technology companies themselves. The retreat is broad, and the reasons are converging on a small set of structural problems that no amount of additional investment appears to be solving quickly.

Understanding why companies are pulling back requires going beyond the headline numbers. There are at least six distinct failure modes at work, and they stack on top of each other in ways that make AI projects particularly punishing compared to other enterprise software rollouts.

Reason 1 — The Hallucination Tax Nobody Calculated

🧠

Reliability in Production vs. Reliability in Demo

Critical Failure Mode

AI models that perform impressively in controlled evaluations routinely produce confident, plausible, and completely wrong outputs in production environments. The gap is not shrinking fast enough for enterprise use cases that demand accuracy over fluency.

When a company evaluates an AI tool, they typically test it on clean, well-formatted, representative data. The model performs beautifully. Then it goes live, encounters the full chaos of real-world inputs — ambiguous queries, edge cases, poorly structured data, unusual phrasing — and the cracks appear fast.

AI hallucinations cost businesses an estimated $67.4 billion globally in 2024. The figure for 2025 is expected to be higher. On a per-employee basis, enterprises are spending roughly $14,200 per year in hallucination-related mitigation: fact-checking outputs, correcting customer-facing errors, undoing decisions made on bad AI data, and paying the legal cost of advice or content that turned out to be fabricated.

Knowledge workers now spend an average of 4.3 hours per week verifying AI outputs. In many cases, this verification overhead exceeds the time that would have been spent doing the task manually. The efficiency gain evaporates completely — and in some workflows, the net result is negative productivity.
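The break-even point is simple arithmetic. As a rough sketch, the snippet below subtracts the 4.3 hours of weekly verification cited above from the hours a worker saves by using AI. The function name and the two example workers are hypothetical, chosen only to illustrate the calculation.

```python
def net_hours_saved(hours_saved_by_ai: float, verification_hours: float = 4.3) -> float:
    """Weekly hours actually gained once output verification is subtracted."""
    return hours_saved_by_ai - verification_hours


# Hypothetical workers: one saves 6 hours a week with AI, the other only 3.
for saved in (6.0, 3.0):
    net = net_hours_saved(saved)
    status = "net gain" if net > 0 else "net loss"
    print(f"saved {saved:.1f} h, verified 4.3 h, net {net:+.1f} h ({status})")
```

Any workflow where the time saved is below the verification overhead ends up in the second case, which is the negative-productivity outcome described above.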

⚠️ The Confidence Inversion Problem

Research shows that AI models use 34% more confident language when generating incorrect information compared to correct information. This is the opposite of what you want: the model sounds most authoritative precisely when it's most wrong. Human reviewers, trained to trust confident sources, systematically fail to catch these errors.

In legal contexts the problem becomes acute: large language models hallucinate between 69% and 88% of the time on legal-specific queries. AI chatbots in customer support produce hallucinated responses in 15–27% of live interactions. And in enterprise sales workflows, 23% of late-stage deal losses have been traced back to qualification elements that AI incorrectly identified as confirmed.

What makes this particularly damaging is a 2025 mathematical proof confirming that hallucinations cannot be fully eliminated under current large language model architectures. They are not a bug to be patched. They are an inherent characteristic of how these systems generate language. Enterprises that were planning their AI strategy around "hallucinations will be solved in 18 months" are now rewriting those plans.

Reason 2 — The 80% Nobody Talked About

🏗️

Data Engineering is the Actual Work

Technical Failure Mode

Getting an AI model to answer a question in a demo takes hours. Getting it to answer the right questions on real enterprise data, reliably, at scale, and integrated into existing systems takes 18 months minimum and often fails regardless.

When AI vendors sell to enterprises, they demonstrate the model layer: the impressive interface, the natural language understanding, the fluent output. What they don't demo is everything underneath — the data pipelines, governance frameworks, integration middleware, workflow redesign, and measurement infrastructure that actually make an AI system production-viable.

Engineering teams who have shipped AI in production consistently report the same ratio: roughly 80% of the work required to move from pilot to production is data engineering, not model work. Cleaning data. Building pipelines. Resolving schema conflicts between legacy systems. Establishing data governance so that the AI doesn't accidentally surface sensitive information. Designing the feedback loops that let the system improve. None of this shows up in a vendor demo. All of it consumes time and budget in deployment.

Most enterprises have data infrastructure that was never designed for AI workloads. They have siloed databases built over decades, inconsistent data standards across divisions, compliance constraints that restrict what data the AI can access, and IT teams already stretched thin. Plugging a modern AI layer on top of this creates integration complexity that regularly kills projects long before they reach production.

📊 Engineering Reality Check

Projects with quantified success metrics defined upfront achieve a 54% success rate. Projects without them: just 12%. Yet 73% of failed AI projects had no agreed definition of success before the project started, and 61% of enterprise AI projects were approved on projected ROI that was never measured after launch. The investment was real. The measurement never happened.

Reason 3 — The Hidden Infrastructure Bill

💸

Compute, Storage, and Vendor Costs at Scale

Financial Failure Mode

What costs $500/month in proof-of-concept can cost $50,000/month in production. API pricing at scale, GPU infrastructure, vector database storage, monitoring, and redundancy all arrive as surprises to organisations that approved AI budgets based on pilot costs.

Enterprise AI has a cost structure that is fundamentally different from traditional software. Traditional software has high upfront licensing costs and low marginal costs — once you've paid for the licence, adding another user is cheap. AI has the opposite profile: low or zero upfront cost in pilot phase, but costs that scale aggressively with usage in production.

A company that runs 10,000 API calls per day in a proof-of-concept might run 10 million per day in production — a 1,000× increase that maps directly into a 1,000× increase in API spend. Add vector database hosting for retrieval-augmented generation systems, GPU compute for fine-tuned models, observability tooling, and the engineering time required to maintain and retrain models as business data changes — and the total cost of ownership frequently exceeds what was budgeted by a factor of 3 to 5×.
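The scaling arithmetic can be sketched in a few lines. The call volumes come from the paragraph above. The per-call price and the fixed monthly overhead are hypothetical placeholders, not real vendor pricing, so treat the dollar amounts as illustrative only.

```python
from dataclasses import dataclass


@dataclass
class UsageProfile:
    calls_per_day: int
    cost_per_call_usd: float        # hypothetical blended API price per call
    fixed_monthly_usd: float = 0.0  # hypothetical: vector DB hosting, monitoring, redundancy

    def monthly_cost(self, days: int = 30) -> float:
        """Usage-based spend plus fixed infrastructure for one month."""
        return self.calls_per_day * self.cost_per_call_usd * days + self.fixed_monthly_usd


pilot = UsageProfile(calls_per_day=10_000, cost_per_call_usd=0.002)
production = UsageProfile(
    calls_per_day=10_000_000,
    cost_per_call_usd=0.002,
    fixed_monthly_usd=15_000,
)

multiplier = production.calls_per_day / pilot.calls_per_day
print(f"call volume multiplier: {multiplier:,.0f}x")
print(f"pilot:      ${pilot.monthly_cost():,.0f} per month")
print(f"production: ${production.monthly_cost():,.0f} per month")
```

Because the usage term is linear in call volume, a budget approved on the pilot figure understates production spend by the full volume multiplier before any fixed infrastructure is added.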

CFOs who signed off on AI investments based on pilot spend are now receiving production invoices that bear no resemblance to approved budgets. In many cases, finance teams are pulling the plug not because the AI isn't working, but because the cost of running it at scale makes the business case impossible to close.

Reason 4 — The Klarna Lesson

No company illustrated the AI retreat more visibly than Klarna. In 2023 and early 2024, the Swedish fintech became the poster child for aggressive AI adoption, publicly reporting that its AI customer service system was doing the work of 700 human agents and had allowed the company to cut its headcount by over a thousand. CEO Sebastian Siemiatkowski toured media circuits talking about AI productivity. Klarna's AI-first positioning was cited in dozens of investor presentations.

"We went too far. We need humans in the loop." — Klarna CEO Sebastian Siemiatkowski, 2025

By mid-2025, Klarna reversed course. Siemiatkowski told Bloomberg the company was actively recruiting humans again because the AI approach had led to measurably lower service quality. Customer satisfaction scores had fallen. Resolution times for complex cases had worsened. The AI was handling volume, but it was handling it badly — and the customers were noticing.

What makes Klarna instructive is not the reversal itself, but what drove it. The company had optimised for cost reduction and headcount metrics — inputs — rather than customer satisfaction and resolution quality — outputs. The AI scored well on the metrics it was told to optimise for and performed poorly on the metrics that actually determined business outcomes. This is a structural trap that many enterprises fell into: measuring AI adoption instead of measuring AI impact.

Reason 5 — The Skills Gap Is Real and Deep

👩‍💻

Knowing What to Build vs. Knowing How to Build It

Technical Failure Mode

The organisations with the clearest AI use cases frequently lack the engineering talent to build them. The organisations with the engineering talent frequently lack the domain knowledge to identify valuable use cases. Bridging this gap is harder than buying software.

Deploying AI in production requires a specific combination of capabilities that is rare in most organisations. You need machine learning engineers who understand model behaviour, data engineers who can build reliable pipelines, product managers who can translate business requirements into AI system specifications, and domain experts who can evaluate output quality. Finding or training all four in the same project team is genuinely difficult.

Many enterprises tried to bridge this gap with AI vendors and consulting firms. The results have been mixed. External consultants can build a system, but they often cannot maintain it. They don't have the institutional knowledge to understand why a particular AI output is wrong in context, or how to retrain a model when the underlying business process changes. When the engagement ends, the knowledge walks out the door.

Companies that have successfully scaled AI internally tend to share one characteristic: they started building internal AI literacy years before the current wave, not months into the hype cycle. The organisations that moved fastest in 2023 and 2024 were often the ones that had the least internal capability to evaluate what they were building — which explains much of the abandonment rate in 2025.

Reason 6 — Regulation Arrived

⚖️

Compliance Complexity Hit Before Strategy Was Ready

Critical Failure Mode

The EU AI Act's phased enforcement began in 2025. GDPR implications for AI-generated decisions are being litigated. US sectoral regulations are tightening. Compliance teams are blocking AI projects that legal cannot adequately risk-assess — and legal cannot risk-assess them because the law is moving faster than the case history.

The regulatory environment for enterprise AI has shifted dramatically since the initial wave of deployment. The EU AI Act, which began enforcement in 2025, created a tiered risk classification system that placed many enterprise AI applications — hiring tools, credit scoring, customer risk assessment, medical triage — into categories requiring human oversight, audit trails, and explainability documentation that current models cannot easily provide.

In parallel, GDPR enforcement authorities in Germany, France, and Ireland began investigating AI systems that make automated decisions affecting individuals, questioning whether these systems meet the standard of meaningful human review the regulation requires. US regulators in financial services and healthcare have issued similar guidance. Legal teams that once rubber-stamped AI pilots are now red-flagging deployments that cannot demonstrate compliance lineage.

For many enterprises, this created a straightforward calculation: the AI project was already underperforming on ROI, and now it carried regulatory risk on top. The combined weight made abandonment the rational choice.

The Companies That Didn't Retreat — And What They Did Differently

The AI retreat is real, but it is not universal. A subset of enterprises — consistently estimated at around 25% by IBM's research — are seeing genuine ROI from AI deployment. The gap between this group and the majority tells a clear story about what separates successful AI adoption from expensive pilots.

Narrowly Scoped Use Cases

What Works

Successful companies picked problems where AI has a clear structural advantage: repetitive classification tasks, high-volume document processing, pattern detection in large datasets. They didn't try to automate judgment — they automated specific, well-defined sub-tasks within larger human workflows.

ROI Defined Before Build

What Works

Projects with quantified success metrics defined before development began achieve a 54% success rate versus 12% for those without. The metric must be a business outcome — cost per resolution, error rate in document processing, time-to-decision — not a technology metric like "number of AI queries processed."
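One way to make "defined before build" concrete is to write the metric down as data before any model work starts. The sketch below is a minimal illustration: the `SuccessMetric` class, its field names, and the baseline and target values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SuccessMetric:
    name: str
    baseline: float               # value measured before the AI system exists
    target: float                 # value agreed with the business before development
    lower_is_better: bool = True

    def met(self, observed: float) -> bool:
        """True when the observed post-launch value reaches the agreed target."""
        if self.lower_is_better:
            return observed <= self.target
        return observed >= self.target


# Hypothetical business-outcome metric, fixed before the build begins.
cost_per_resolution = SuccessMetric("cost per resolution (USD)", baseline=8.50, target=6.00)

print(cost_per_resolution.met(5.40))
print(cost_per_resolution.met(7.10))
```

Freezing the dataclass is a small design choice that mirrors the point of the section: the target is agreed upfront and is not adjusted after launch to fit the results.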

Human-in-the-Loop Architecture

What Works

Rather than replacing human judgment, sustainable deployments position AI as a tool that surfaces options, flags anomalies, or drafts outputs — and routes to human review for anything below a confidence threshold. This catches hallucinations before they become errors, and satisfies most regulatory requirements for human oversight.
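The routing rule described above can be sketched as a single function. This is an illustration under stated assumptions, not a reference design: `ModelOutput`, `route`, and the 0.85 threshold are hypothetical, and the confidence value is assumed to be a calibrated score from a separate scoring step rather than something inferred from how confident the generated text sounds.

```python
from dataclasses import dataclass


@dataclass
class ModelOutput:
    text: str
    confidence: float  # assumed: calibrated score in [0, 1], not the tone of the text


def route(output: ModelOutput, threshold: float = 0.85) -> str:
    """Release high-confidence outputs; queue everything else for a person."""
    return "auto" if output.confidence >= threshold else "human_review"


drafts = [
    ModelOutput("Invoice total matches purchase order.", confidence=0.97),
    ModelOutput("Contract clause 4 permits early termination.", confidence=0.52),
]

for draft in drafts:
    print(f"{route(draft):>12}: {draft.text}")
```

In practice the threshold would be tuned per use case against the business metric, since setting it too high sends everything to review and setting it too low lets errors through.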

Data Infrastructure First

What Works

Companies that invested in clean, governed, well-documented data before evaluating AI tools built on a foundation that models could actually use. Those that tried to bolt AI onto legacy data chaos discovered that garbage in means garbage out — at enterprise scale and enterprise speed.

The Retreat Is Not the End — It's the Correction

It would be a mistake to read the enterprise AI retreat as evidence that AI is not genuinely transformative. The technology is real. The capabilities are real. The pace of model improvement is, by most measures, accelerating. What 2025 exposed is not a fundamental flaw in AI — it is a fundamental flaw in how enterprises adopted it.

The hype cycle compressed a multi-year adoption journey into 18 months. Companies made vendor decisions without adequate evaluation criteria. They set success metrics in technology terms rather than business terms. They underestimated integration complexity and overestimated the quality of their own data. They hired fast and built slowly, or built fast and hired slowly. All of these are patterns that show up in every major enterprise technology adoption wave, from ERP to the cloud to mobile-first.

The companies retreating from AI aren't abandoning the technology. They're abandoning the hype. The two things feel the same right now — but they won't by 2027.

The trough of disillusionment is temporary by design. What comes after is the slope of enlightenment: the period where practitioners — rather than vendors — set the terms of adoption. Where use cases are selected for genuine fit rather than headline value. Where the organisations that figured out what works share those patterns with the broader market.

If you're running an enterprise AI program and wondering whether the retreat applies to you, the honest diagnostic is straightforward: Can you name a specific business outcome that your AI deployment has measurably improved? Can you quantify the improvement? If the answer is yes, stay the course. If the answer involves gesturing at "strategic positioning" or "building capability for the future," you have more in common with the 42% than you might want to admit.

The Timeline of the AI Enterprise Boom and Retreat

Late 2022

ChatGPT releases — the enterprise scramble begins

OpenAI's consumer launch triggers immediate board-level mandates. "AI strategy" becomes a requirement at every major organisation within 90 days.

2023

Enterprise AI spending accelerates — pilots proliferate

Microsoft Copilot, Google Workspace AI, and hundreds of vertical SaaS AI tools launch. Enterprises sign contracts faster than they can deploy. POC culture explodes.

Early 2024

ROI questions start appearing in earnings calls

Institutional investors begin asking for specifics. "We're investing in AI" stops being a complete answer. The gap between AI spend and AI results becomes visible at the CFO level.

Mid 2024

Klarna's AI-first narrative peaks — then fractures

Klarna's AI-replaces-700-agents story becomes the template. Within months, quality metrics reveal the real cost of removing human judgment from customer-facing workflows.

2025

The Great AI Retreat — 42% of companies abandon most projects

S&P Global data confirms the scale of the pullback. $547B of enterprise AI spend produces no measurable results. EU AI Act enforcement begins. Legal teams start blocking deployments.

2026

The correction period — narrow use cases, real metrics, human-in-the-loop

A smaller subset of enterprises with disciplined deployment practices post genuine ROI. The gap between AI winners and AI losers widens. The next wave of adoption begins on different terms.

What This Means If You're Building on AI Right Now

If you're a developer, product manager, or founder building AI-powered products, the enterprise retreat is not bad news for you — it is a signal. The buyers who got burned in 2023 and 2024 are now the most sophisticated buyers in the market. They know what went wrong. They're not afraid of AI — they're afraid of AI deployed badly. Your job is to show them you know the difference.

Concretely, that means several things. Narrow your scope aggressively. The narrower and more specific your AI capability, the easier it is to demonstrate and measure. Be honest about hallucination risks upfront — buyers who have already been burned by overconfident vendors will trust you more for acknowledging the limitation. Design for human-in-the-loop from the start rather than treating it as a fallback. Build observability into your system so customers can see what the AI is doing and when it fails. And price transparently at scale — no surprises between pilot invoice and production invoice.
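As a minimal sketch of the observability point, the snippet below records what happened to each AI output and reports the share that a human had to correct. The `AuditLog` class and the outcome labels are hypothetical. A production system would more likely send these events to an existing logging or metrics stack.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class AuditLog:
    outcomes: Counter = field(default_factory=Counter)

    def record(self, outcome: str) -> None:
        """Count one AI output by what happened to it (hypothetical labels)."""
        self.outcomes[outcome] += 1

    def rate(self, outcome: str) -> float:
        """Share of all recorded outputs that ended with the given outcome."""
        total = sum(self.outcomes.values())
        return self.outcomes[outcome] / total if total else 0.0


log = AuditLog()
for outcome in ("accepted", "accepted", "corrected_by_human", "accepted"):
    log.record(outcome)

print(f"correction rate: {log.rate('corrected_by_human'):.0%}")
```

Exposing a number like this to the customer is one simple way to show when the system fails instead of leaving them to discover it.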

The companies leaving AI are not leaving AI forever. They are leaving the version of AI that was sold to them in 2023. The version that survives into 2027 and beyond will be narrower in scope, more honest about its limitations, better integrated with human workflows, and — crucially — actually measurable. That version will be worth building toward.

🔧 Engineering Perspective

From 13 years of building systems across XR, AI, and enterprise software: every technology adoption wave has a correction phase. The correction is healthy. It separates genuine utility from category enthusiasm. The companies that come out of this correction with working AI systems will have a meaningful advantage — not because they moved first, but because they moved carefully. The first-mover advantage in enterprise AI went to the vendors. The second-mover advantage is going to the deployers who learned from watching the first wave crash.

Frequently Asked Questions

Are companies actually abandoning AI, or just slowing down?

It is a strategic correction, not a full abandonment. While 42% of companies have scrapped most of their AI projects, they are shifting focus from high-hype, low-utility pilots to narrow, measurable use cases that demonstrate actual business impact. The technology is not being rejected — the deployment approach is being reset. Companies that are "leaving AI" are largely leaving the version of AI that was sold to them in 2023, not the technology itself.

What is the "Hallucination Tax" in enterprise AI?

The Hallucination Tax refers to the hidden operational cost of managing AI errors in production. Enterprises are currently spending roughly $14,200 per employee annually on hallucination mitigation — fact-checking outputs, correcting customer-facing errors, and undoing decisions made on fabricated AI data. Knowledge workers lose an average of 4.3 hours per week just verifying what the AI told them. When AI models are also 34% more confident when wrong, human reviewers systematically miss the errors.

Why is enterprise AI integration so expensive at scale?

Unlike traditional software, AI costs scale directly with usage. A proof-of-concept running 10,000 API calls per day can require 10 million calls per day in production — a 1,000× jump that maps directly to API spend. Add vector database hosting for retrieval-augmented systems, GPU compute for fine-tuned models, observability tooling, and ongoing retraining costs, and total cost of ownership frequently exceeds approved budgets by 3 to 5 times. CFOs signing off on pilot budgets are receiving production invoices that bear no resemblance to what was approved.

What did the Klarna AI rollback actually reveal?

Klarna optimised for cost reduction and headcount metrics — inputs — rather than customer satisfaction and resolution quality — the outputs that actually determine business outcomes. The AI scored well on the metrics it was told to optimise for, and performed poorly on the metrics that mattered to customers. This is the structural trap most enterprises fell into: measuring AI adoption instead of AI impact. When Klarna's CEO said they went "too far," what he was really saying is that they measured the wrong things.

Which AI projects actually deliver ROI in 2025–2026?

The 25% of enterprises reporting genuine AI ROI share four traits: narrowly scoped use cases where AI has a structural advantage (classification, document processing, pattern detection in large datasets); quantified success metrics defined before development began (54% success rate vs. 12% without); human-in-the-loop architecture that routes low-confidence outputs to human review; and clean, governed data infrastructure built before AI was introduced. The projects that work tend to automate specific sub-tasks inside human workflows — not entire human roles.

What should companies look for in an enterprise AI vendor?

Avoid vendors that only demo the model interface on clean test data. Look for: transparent pricing at production scale (not just pilot pricing), built-in observability so you can see what the AI is doing and when it fails, human-in-the-loop design that augments existing workflows rather than replacing human judgment, clear data governance documentation, and compliance lineage that satisfies EU AI Act and GDPR requirements. Most importantly, ask for case studies where the system failed and how it was caught. Any vendor who can't answer that question hasn't deployed at scale.
