Token bill deadline looms: Industry battle to control runaway costs of AI

Across industries, companies are beginning to balk at the cost of AI. Uber exhausted its 2026 AI coding budget by April. Microsoft revoked developers’ Claude Code licenses months after activating them. A Priceline employee told TechCrunch that the company’s regular Cursor renewal came in four to five times higher than before.

Although the price per token has fallen, wider adoption of AI and the rise of autonomous agents are driving token consumption ever higher. Companies that feasted on all-you-can-eat subscriptions in early 2025 are now scrambling to figure out where their money is going, cut spending, and see whether their remaining budgets can still deliver a return.

Meanwhile, a market is forming to serve them. Startups, established vendors, and new standards bodies are all competing to give businesses the tools and vocabulary to track spending.

“Six months ago, I was having conversations with customers and it was all about, ‘What can we do? Is it good enough?'” Alexander Embrikos, head of enterprise at OpenAI, told TechCrunch at an event in New York City this week. “Now our conversations are never about things like that. Now the conversations are, ‘Hey, we’re spending a lot of money. What visibility do we have? What auditability do we have? What token management do we have? What does the efficiency of the model look like?'”

Against this backdrop, the Linux Foundation this week announced plans for the Tokenomics Foundation, a new standards body aimed at instilling the same cost discipline around AI tokens that FinOps has for cloud spending.

“In April and May, we started hearing from companies saying, ‘Wow, we’ve tripled our entire 2026 token budget and it’s only April,'” JR Storment, executive director of the FinOps Foundation, a project under the Linux Foundation, told TechCrunch. “We started hearing about an existential crisis, and the whole conversation changed from token maxing and ‘go fast’ to ‘we need guardrails, how do we control this?'”

The outcry followed an impassioned push from CEOs to ignore costs, have their teams use the best models, and move quickly. New models released in November, such as Anthropic’s Claude Opus 4.5, OpenAI’s GPT-5.1, and Google’s Gemini 3 Pro, brought significant improvements to agent tools and doubled consumption. That is how one company reportedly ended up with a $500 million Claude bill after it forgot to set usage limits for its employees.

“This is like the crack cocaine epidemic,” said Chris Reed, senior director of IT finance at Priceline, noting that the company has started placing token restrictions on certain groups. “They let you try it to get you hooked, and now you’re hooked.”

Vitaly Gordon, CEO of engineering operations platform Faros AI, said he recently spoke with a CTO who told him, “One of my engineers spent $40,000 on tokens last month, and I don’t really know if I should stop him or go tell everyone else to be like him.”

A March study by Faros found that output among 20,000 developers is increasing, but so are bugs and rewrites. Engineering management platform Jellyfish similarly found that the engineers who used the most tokens were about twice as productive as those who used less AI, but consumed 10 times as many tokens to get there.

Nicholas Arcolano, head of research at Jellyfish, told TechCrunch in an email that AI spending is exploding, primarily driven by agent capabilities, with consumption per developer increasing approximately 18.6 times over nine months. Overall, these statistics make the productivity case more opaque than the spending would suggest.

“Whether extreme spending pays off depends on the ultimate business value (such as revenue) of shipped code, which most companies still cannot measure,” Arcolano said.

At least part of the measurement problem lies in the sheer scale at which AI is being used today.

“Tracking cloud costs is a problem with hundreds of millions of rows of data per month,” Storment said. “Tracking token costs is a problem with trillions of rows of data per month. You can’t just plug it into a spreadsheet or basic tool. To do that, you have to fundamentally rethink your tools, specifications, and accounting systems.”

At Priceline, Reed already sees a discrepancy. He pointed to issues between vendor-reported usage and Priceline’s internal data.

“I started my career in telecom expense management, and I see the same similarities in everything from telecom to cloud to AI,” he said. “Anytime you introduce something new, there are opportunities for billing errors, audits, and optimization.”
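The kind of audit Reed describes can be pictured as a simple reconciliation between two usage ledgers. The sketch below is illustrative only: the team names, token counts, and the 2% tolerance are assumptions made for the example, not figures from Priceline or any vendor.

```python
def reconcile(vendor_tokens, internal_tokens, tolerance=0.02):
    """Flag teams whose vendor-reported and internally logged token counts diverge.

    Both arguments map a team name to a token count for the same billing period.
    Returns {team: (vendor, internal, relative_gap)} for gaps above `tolerance`.
    The 2% default tolerance is an arbitrary, hypothetical threshold.
    """
    flagged = {}
    for team in sorted(set(vendor_tokens) | set(internal_tokens)):
        vendor = vendor_tokens.get(team, 0)
        internal = internal_tokens.get(team, 0)
        baseline = max(vendor, internal)
        if baseline == 0:
            continue  # nothing recorded on either side
        gap = abs(vendor - internal) / baseline
        if gap > tolerance:
            flagged[team] = (vendor, internal, gap)
    return flagged


# Hypothetical ledgers for one month.
vendor = {"search": 1_000_000, "ads": 500_000}
internal = {"search": 990_000, "ads": 400_000, "ops": 10_000}
flagged = reconcile(vendor, internal)
```

In this made-up example, "search" falls within tolerance, while "ads" (a 20% gap) and "ops" (usage logged internally but missing from the vendor report) would be flagged for review.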

A market is starting to form around this problem. Pure-play companies like Pay-i track, measure, and optimize the cost and performance of GenAI investments. Paid, meanwhile, lets developers track costs, measure usage, and charge users based on actual value rather than subscription fees.

Additionally, there are companies like Jellyfish, Waydev, and Faros AI, all of which offer monitoring of AI agents to prove the ROI of developer tools. Most of the 180 vendors within the FinOps Foundation are leaning toward this space, Storment said.

Companies with existing distribution are also adding features to capture this new market. Ramp recently moved into AI spend management. Datadog and New Relic have worked on services such as cloud cost management, token-level observability, and GPU monitoring. At next week’s FinOps X conference, AWS will introduce new financial management capabilities for enterprise AI spending.

NEA partner Tiffany Luck believes token efficiency and observability will likely be added to the “harness layer or app layer.” She pointed to Factory, a startup that develops AI agents for enterprises. This week, the company launched a model router that automatically selects the right model for any task.

Gordon expects frontier labs and other model providers to employ OpenRouter-style optimization to direct queries to the cheapest models. This trend is already visible in corporate Claude invoices.

“The financial report on how much we spent on Anthropic, even if we call it the Opus model, some of the spending is going to go to Sonnet or Haiku because they’re smart enough to do that,” Gordon said. “I think this is going to become increasingly important.”
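The routing idea Gordon describes can be sketched in a few lines: send each request to the cheapest model that is still capable enough for the task. This is a minimal illustration, not how Factory, OpenRouter, or any model provider actually implements routing. The model names, prices, and capability tiers are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    usd_per_million_tokens: float  # illustrative price, not a real rate card
    capability: int                # hypothetical tier: 1 (basic) to 3 (strongest)


# Hypothetical catalogue; names and prices are placeholders.
CATALOGUE = [
    Model("small", 1.0, 1),
    Model("medium", 5.0, 2),
    Model("large", 25.0, 3),
]


def route(required_capability, catalogue=CATALOGUE):
    """Return the cheapest model whose capability tier meets the requirement."""
    eligible = [m for m in catalogue if m.capability >= required_capability]
    if not eligible:
        raise ValueError("no model meets the required capability")
    return min(eligible, key=lambda m: m.usd_per_million_tokens)


def estimate_cost(model, tokens):
    """Estimated spend in USD for a given number of tokens."""
    return model.usd_per_million_tokens * tokens / 1_000_000


chosen = route(required_capability=2)
cost = estimate_cost(chosen, 2_000_000)
```

Under these assumed prices, a task that needs tier 2 goes to "medium" rather than "large", so two million tokens cost $10 instead of $50. The hard part in practice, which this sketch skips, is deciding how much capability a task really needs.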

However, all these tools are built without a common language or shared definition of the cost of a token, what it generates, and how to compare spending across vendors. That’s where the Tokenomics Foundation hopes to help.

The foundation is building a standard definition and framework for “tokenomics”: open standards, specifications, and metrics for using and claiming AI tokens. That includes new metrics for the AI economy, such as cost per intelligence and tokens per watt. It also plans to define indicators for the effectiveness and consumption efficiency of the token factory. The group is scheduled to launch officially in July and will announce more members at next week’s FinOps X conference.

“Token economics are fundamentally more abstract and opaque than we have traditionally managed at this scale,” Nishant Gupta, chief availability officer at Salesforce, said in a statement. “It requires a different operational capability than what the industry has built for the cloud.”

That said, Goldman Sachs predicts that global token usage will increase 24 times by 2030. Companies that are already over budget need a solution now, and the foundation’s first deliverables are still months away.

“They may have built a steam engine, but the assembly line is still unknown,” Gordon said.

The smart move, Arcolano said, is to adopt broadly and moderately.

“The best ROI comes from moving a broad middle class from low to moderate usage, rather than pushing heavy users to higher usage levels,” he said.

Russell Brandom and Tim Fernholz contributed to this report.
