How large is the AI economy according to Exponential View?

Exponential View estimates $110 billion of deduplicated generative AI sales over the past 12 months and a $175 billion annualized revenue run rate based on the most recent monthly data.

Why does deduplication matter in AI revenue estimates?

Deduplication matters because the same dollar can move through several AI suppliers. A customer may pay an AI application provider, which pays a model provider, which pays a cloud provider. Counting only the end-customer dollar avoids inflating market size.

Can falling token prices increase AI revenue?

Yes, if demand is elastic. Exponential View estimates that a 10% token price cut leads to 12% to 18% more token use across providers. At those rates, usage grows faster than the price falls, so a 10% cut leaves total spend roughly 1% to 6% higher.

What should companies measure before scaling AI agents?

Companies should measure cost per accepted task, cycle time, throughput, review burden, retry rate, quality defects, and the business value of each completed workflow. Token spend alone is not enough.

What is the main AI economy test for mid-market operators?

The test is whether AI spend can be tied to operating outcomes: faster support resolution, cleaner CRM updates, shorter finance close cycles, better sales follow-up, or lower manual review time.

The AI Economy Is Real. Now It Has to Become Workflow ROI

The argument about AI economics has been stuck between two loud positions.

One side says the whole thing is a capex bubble. Too many data centers, too much debt, too many GPUs, too little paying demand. The other side says the demand is obvious because every serious company is buying AI tools, models, agents, and compute.

Both miss the operator's question.

The question is not "is AI real?" The question is "which part of the AI economy is paid demand, which part is infrastructure pass-through, and which part becomes measurable workflow ROI for customers?"

That is why Exponential View's new State of the AI Economy work matters. Their headline estimate is large: $110 billion of generative AI sales over the past 12 months, measured after removing double-counting, with a $175 billion annualized run rate based on the most recent month. They say the work reconstructs demand from the bottom up, company by company and unit by unit, rather than only reading the supply side through chips, power, and cloud capex.

The number is useful. The method is more useful.

Why Deduplication Changes The Debate

AI revenue is easy to overstate if you count every handoff in the supply chain.

A company pays an AI application vendor. That vendor pays a model provider. The model provider pays a cloud provider. The cloud provider pays for chips, data centers, power, memory, and networking. If each layer is added as if it were separate final demand, the same customer dollar gets counted several times.

Exponential View's method tries to avoid that by reporting the dollar spent by the end customer. Internally, they still track the cloud and model costs behind the sale, but the reported market size is deduplicated. That is the right frame for an operating buyer because it answers a cleaner question: how much are customers actually willing to pay for AI output?
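To make the double-counting concrete, here is a minimal sketch. The `Payment` record, the `from_end_customer` flag, and every dollar figure are hypothetical illustrations, not Exponential View's data or method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Payment:
    payer: str
    payee: str
    usd: float
    from_end_customer: bool  # True only for money entering the stack from a buyer


def gross_revenue(payments: list[Payment]) -> float:
    """Sum every supplier's revenue, counting each handoff separately."""
    return sum(p.usd for p in payments)


def deduplicated_revenue(payments: list[Payment]) -> float:
    """Count only what end customers paid into the stack."""
    return sum(p.usd for p in payments if p.from_end_customer)


# Hypothetical chain: one customer purchase passed down the stack.
chain = [
    Payment("customer", "app_vendor", 100.0, True),
    Payment("app_vendor", "model_provider", 40.0, False),
    Payment("model_provider", "cloud_provider", 25.0, False),
]

print(gross_revenue(chain))         # 165.0
print(deduplicated_revenue(chain))  # 100.0
```

In this toy chain, adding up supplier revenue overstates demand by 65%. The deduplicated figure is the only one that answers what customers were willing to pay.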

This distinction matters for boards and CFOs. Supplier revenue tells you who captures value inside the stack. End-customer spend tells you whether the market has a demand floor. When those two numbers get mixed together, every strategic discussion becomes foggy.

For a mid-market company, the same discipline should apply internally. If the support team pays for an AI tool, the engineering team pays for model calls, and the data team pays for vector storage, finance should not treat those as three independent bets. They are pieces of one workflow cost. The economic unit is the completed task, not the vendor invoice.

The Market Is Bigger Than Pilots

The $110 billion estimate does not prove every AI company is healthy. It does prove something narrower and more important: paid demand is no longer just a demo cycle.

Exponential View argues that AI revenues are growing much faster than earlier IT waves, including mobile and the internet. Their article also points to enterprise conversations that have moved beyond casual pilots, while noting that many firms remain early in scaling. That matches what we see in implementation work. Teams have stopped asking "can the model answer this?" and started asking "can it sit inside the workflow without breaking permissions, approvals, or reporting?"

McKinsey's State of AI research has pointed in the same direction: the gap is not model interest but operating maturity. Exponential View also cites CEO-level pressure to move from AI experimentation to business transformation. That is not a toy-market signal.

Still, demand at the market level does not guarantee ROI at the company level.

A company can spend real money on AI and still fail to convert it into operating improvement. The most common failure pattern is simple: teams buy tools before they define the workflow metric. The tool gets adoption for a few weeks, usage rises, token spend rises, and nobody knows whether tickets closed faster, sales follow-up improved, claims were processed more cleanly, or finance reconciliations took fewer human hours.

The AI economy can be real while your AI program is still uneconomic.

The GPU Bill Is Not One Bill

The bear case often asks whether AI revenues can pay for the GPUs. That is a good question, but it is too compressed.

There are at least five bills hiding inside it:

Cost layer	What has to be covered	Why it matters
Compute depreciation	GPUs, accelerators, servers, and networking over their useful life	Accounting life may differ from economic life if newer systems change price-performance quickly
Facility depreciation	Buildings, electrical gear, cooling, and site infrastructure	Longer-lived assets can still be stranded if power or customers arrive late
Power	Electricity, demand charges, onsite generation, and grid commitments	A low marginal electricity price does not capture fixed capacity commitments
Financing	Debt, leases, private credit, vendor financing, and customer prepayments	The cost of capital can turn a good gross margin into weak equity returns
Utilization	The share of capacity doing paid work at acceptable prices	Idle compute is a tax on every successful customer workload

Exponential View says its model separates AI-oriented capex from ordinary hyperscaler capex and depreciates compute over six years while using longer lives for other infrastructure. It also argues that hyperscaler AI-attributable revenue roughly covers depreciation. That is an important counterweight to the "no revenue exists" argument.

But depreciation is not the whole economic test. Operators still need to ask whether utilization, power timing, financing cost, customer concentration, and model price declines leave enough profit after the accounting charge. A cloud provider can clear depreciation and still disappoint investors if the return on capital is thin.

For enterprise buyers, the practical takeaway is not to become a data center analyst. It is to understand that AI prices are tied to physical scarcity. Token prices point back to memory bandwidth, GPUs, power, cooling, and the cost of keeping high-end infrastructure busy. The International Energy Agency's Energy and AI report frames electricity for data centers as a prerequisite for AI deployment, not a side issue. When a vendor changes pricing, rate limits, context windows, or batch discounts, those are not random packaging choices. They are ways to ration scarce capacity and shape demand.

Falling Token Prices Can Grow The Market

One of the strongest points in the Exponential View piece is about token-price elasticity. They estimate that across providers, every 10% price cut leads to 12% to 18% more token use. In plain terms: lower prices can increase total spend if usage expands faster than unit prices fall.
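The arithmetic behind that claim is short enough to check by hand. The sketch below uses only the 10% cut and the 12% to 18% usage response quoted above; the function names are illustrative.

```python
def spend_change(price_cut: float, usage_growth: float) -> float:
    """Fractional change in total spend after a price cut and a usage response."""
    return (1.0 - price_cut) * (1.0 + usage_growth) - 1.0


def breakeven_usage_growth(price_cut: float) -> float:
    """Usage growth needed for total spend to stay flat after a price cut."""
    return 1.0 / (1.0 - price_cut) - 1.0


for growth in (0.12, 0.18):
    change = spend_change(0.10, growth)
    print(f"{growth:.0%} more usage -> spend change {change:+.1%}")
print(f"break-even usage growth: {breakeven_usage_growth(0.10):.1%}")
```

A 10% cut needs about 11.1% more usage just to hold spend flat. At 12% more usage, spend rises about 0.8%; at 18%, about 6.2%.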

That sounds odd only if you think of AI as a fixed bundle of tasks. It is not.

At $10 per task, only high-value work gets automated. At $1, more internal analysis, support drafting, sales research, and code maintenance becomes viable. At $0.10, background agents, daily data checks, document cleanup, and low-value administrative flows start to make sense. The cheaper the unit, the more teams discover work that was not worth automating before.
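One way to picture this is a value threshold: a task is worth automating only when its value exceeds the cost of one automated run. The task names and dollar values below are hypothetical, chosen only to show the shape of the effect.

```python
# Hypothetical value of one completed task, in dollars.
task_value_usd = {
    "contract risk review": 40.00,
    "sales research brief": 4.00,
    "support reply draft": 1.50,
    "daily data check": 0.30,
    "document cleanup": 0.15,
}


def viable_tasks(cost_per_task: float) -> list[str]:
    """Tasks whose value exceeds the cost of automating one unit."""
    return [name for name, value in task_value_usd.items() if value > cost_per_task]


for cost in (10.00, 1.00, 0.10):
    print(f"${cost:.2f}/task -> {len(viable_tasks(cost))} viable task types")
```

In this toy list, one task type clears the bar at $10, three at $1, and all five at $0.10.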

This is why "tokens are getting cheaper" is not a complete bear case. It is also why "usage is exploding" is not a complete bull case. Both can be true. The missing variable is useful output per dollar.

Exponential View proposes quality-adjusted output tokens as one way to measure the economic value of intelligence moving through the market. For operators, the equivalent metric is cost per accepted task.

Accepted task beats token spend because it captures what the business actually wanted:

A support reply approved and sent
A CRM record updated correctly
A contract risk found and reviewed
A claim triaged to the right queue
A finance variance explained with evidence
A code change merged after tests and review

Tokens are input cost. Accepted work is output.
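A minimal sketch of the metric, assuming it combines model spend and human review time across every run, accepted or not, and divides by accepted outputs only. The parameter names and example figures are hypothetical; a real version would add tooling, integration, and rework costs.

```python
def cost_per_accepted_task(
    runs: int,
    accepted: int,
    model_cost_per_run: float,
    review_minutes_per_run: float,
    reviewer_cost_per_hour: float,
) -> float:
    """Total cost of all runs divided by the outputs the business accepted."""
    if accepted <= 0:
        raise ValueError("no accepted tasks, so cost per accepted task is undefined")
    model_cost = runs * model_cost_per_run
    review_cost = runs * review_minutes_per_run / 60.0 * reviewer_cost_per_hour
    return (model_cost + review_cost) / accepted


# Hypothetical month: 1,000 runs, 800 accepted, $0.05 of model spend per run,
# 1.5 reviewer minutes per run at $40 per hour.
cost = cost_per_accepted_task(1000, 800, 0.05, 1.5, 40.0)
print(f"${cost:.2f} per accepted task")
```

Here the model bill is $50, but review time adds $1,000, so each accepted task costs about $1.31. Token spend alone would have understated the cost by more than twenty times.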

What Boards Should Ask About AI Spend

Most AI dashboards are too vendor-centric. They show seats, messages, tokens, model mix, and monthly spend. Those are useful controls, but they do not tell a board whether AI is working.

The better dashboard starts with workflow economics:

Question	Metric
Did the workflow get faster?	Cycle time before and after AI
Did the team complete more work?	Throughput per person or per queue
Did quality hold?	Defect rate, rework rate, escalation rate
Did review burden fall?	Human minutes per accepted output
Did the model waste spend?	Retry rate, abandoned run rate, cost per failed run
Did the business outcome move?	Conversion, retention, margin, collections, close time, or SLA attainment
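Several of these metrics can be computed from a plain log of workflow runs. The sketch below assumes a hypothetical `Run` record with the fields shown; the field names and sample values are illustrative, and business-outcome metrics such as conversion or close time would come from other systems.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class Run:
    cycle_minutes: float       # request to final decision
    attempts: int              # model calls needed; 1 means no retry
    review_minutes: float      # human time spent checking the output
    accepted: bool
    defect_found_later: bool


def workflow_metrics(runs: list[Run]) -> dict[str, float]:
    accepted = [r for r in runs if r.accepted]
    if not accepted:
        raise ValueError("no accepted outputs to measure against")
    return {
        "median_cycle_minutes": median(r.cycle_minutes for r in runs),
        "retry_rate": sum(r.attempts > 1 for r in runs) / len(runs),
        # Review time on rejected runs still counts against accepted output.
        "review_minutes_per_accepted": sum(r.review_minutes for r in runs) / len(accepted),
        "defect_rate": sum(r.defect_found_later for r in accepted) / len(accepted),
    }


# Hypothetical sample of four runs.
sample = [
    Run(12.0, 1, 2.0, True, False),
    Run(30.0, 3, 6.0, True, True),
    Run(18.0, 1, 3.0, True, False),
    Run(45.0, 2, 9.0, False, False),
]
metrics = workflow_metrics(sample)
print(metrics)
```

Running the same function on a pre-AI baseline and a post-AI period gives the before-and-after comparison the table asks for.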

This is where mid-market companies can have an advantage over larger firms. They have fewer systems, shorter approval chains, and clearer operating metrics. A private equity-backed services business can often pick one measurable workflow, instrument it end to end, and know within weeks whether AI is creating value.

The wrong move is to roll out a horizontal assistant and hope usage becomes ROI. Usage is not the goal. A team can use an AI assistant heavily because the workflow is broken, the prompts are too long, the model retries too often, or the output needs too much cleanup.

The right move is to choose a narrow workflow with a visible queue and a clear acceptance test. Support triage. RFP response drafting. Claims intake. Sales call summary and CRM writeback. Invoice exception handling. Compliance evidence collection. Then measure the full cost per accepted task before expanding.

The AI Economy Will Split Into Three Layers

The next phase of AI economics is likely to split into three layers.

The first layer is frontier intelligence. These are expensive models and systems used where the value of a better answer is high: complex coding, diligence, legal analysis, scientific work, incident response, and strategic research. Buyers will pay for success-rate gains when the task value justifies it.

The second layer is routed workflow intelligence. This is where most enterprise ROI will sit. A workflow uses cheaper models by default, calls frontier models only on hard cases, caches stable context, batches low-urgency work, and uses deterministic software for validation and writes. The system matters more than any single model.
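A sketch of that routing pattern, under stated assumptions: the two models are stand-in functions rather than real APIs, the cache is a plain dictionary, and validation is a deterministic string check. Batching of low-urgency work is left out to keep the example short.

```python
from typing import Callable, Optional

Model = Callable[[str], str]
Validator = Callable[[str], bool]


def route(
    task: str,
    cheap_model: Model,
    frontier_model: Model,
    is_valid: Validator,
    cache: dict[str, str],
) -> tuple[str, Optional[str]]:
    """Return (tier, output); output is None when a human should take over."""
    if task in cache:
        return "cache", cache[task]
    for tier, model in (("cheap", cheap_model), ("frontier", frontier_model)):
        output = model(task)
        if is_valid(output):  # deterministic check, not another model call
            cache[task] = output
            return tier, output
    return "human", None


# Stand-in models and validator, for illustration only.
def cheap(task: str) -> str:
    return "queue=billing" if "invoice" in task else "unsure"


def frontier(task: str) -> str:
    return "queue=legal" if "contract" in task else "unsure"


def valid(output: str) -> bool:
    return output.startswith("queue=")


cache: dict[str, str] = {}
for task in ("invoice mismatch", "contract dispute", "invoice mismatch", "garbled request"):
    print(task, "->", route(task, cheap, frontier, valid, cache))
```

The four calls resolve at the cheap tier, the frontier tier, the cache, and a human handoff in turn. Logging the tier per task is what makes cost per accepted task measurable by route.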

The third layer is embedded background intelligence. These are low-cost checks and assists that run constantly: classify records, detect missing fields, summarize changes, compare invoices, extract obligations, update internal knowledge, and flag exceptions. This layer grows as token prices fall.

The companies that win will not be the ones that buy the most AI. They will be the ones that route work across those layers with evidence. Cheap models where they are good enough. Frontier models where their premium pays. Humans where judgment, accountability, or trust require it. Deterministic code where consistency matters more than creativity.

How OpenNash Can Help

The Exponential View numbers are a useful market signal: AI demand is real, revenue is scaling, and falling token prices may expand usage rather than shrink the market. The operating question is what your company does with that signal.

OpenNash helps teams turn AI spend into workflow ROI. We start by mapping the work, not the tool list: which queues exist, where the bottlenecks sit, what a correct output looks like, which systems the agent can read, and which writes need approval. Then we build the instrumentation around cost per accepted task, cycle time, review burden, and failure modes.

That lets you answer the question boards actually care about: did AI make this business run better?

For some workflows, the answer will be yes quickly. For others, the model may be good enough but the data, permissions, or approval process may not be ready. That is still progress because it stops AI spend from becoming a mystery line item.

Book a call to choose one workflow, define the acceptance metric, and test whether your AI spend is ready to become operating gain.