AI model routing is a problem for OpenAI and Anthropic

The fix for AI overspending is a problem for OpenAI and Anthropic

A new spending discipline is taking hold in corporate America as chief financial officers and boards of directors begin to crack down on inefficient spending on artificial intelligence. The shift has the potential to reshape the AI trade.

For the past two years, the default playbook has been to choose the most powerful AI model and send every query through it, regardless of complexity. Now that AI bills are running far over budget, companies are starting to question whether they actually need a top-of-the-line, or frontier, model for every task. Two leaders at the center of building AI told CNBC this week that a solution is emerging: model routing.

What is model routing?

Routing matches tasks to models, sending difficult problems to expensive frontier models and easy problems to cheaper, faster alternatives.
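As a rough illustration of the idea, the sketch below routes a prompt to one of two model tiers based on a crude difficulty score. Everything in it is an assumption made for illustration: the model names, the prices, the keyword list, and the threshold are hypothetical, and a production router would more likely rely on a trained classifier or evaluation data than on keywords.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str                  # hypothetical label, not a real product
    cost_per_1k_tokens: float  # illustrative price in dollars


# Placeholder tiers; the names and prices are assumptions for illustration.
CHEAP = Model("small-fast-model", 0.001)
FRONTIER = Model("frontier-model", 0.010)

# Crude stand-in for a difficulty classifier: keywords that suggest hard work.
HARD_HINTS = ("prove", "refactor", "debug", "architecture", "multi-step")


def estimate_difficulty(prompt: str) -> float:
    """Return a rough score in [0, 1]; higher means harder."""
    text = prompt.lower()
    hint_score = sum(hint in text for hint in HARD_HINTS) / len(HARD_HINTS)
    length_score = min(len(text.split()) / 200, 1.0)
    return max(hint_score, length_score)


def route(prompt: str, threshold: float = 0.2) -> Model:
    """Send hard prompts to the frontier model and everything else to the cheap one."""
    return FRONTIER if estimate_difficulty(prompt) >= threshold else CHEAP


if __name__ == "__main__":
    for prompt in (
        "Who was the third US president?",
        "Debug and refactor this multi-step payment workflow",
    ):
        print(f"{route(prompt).name}: {prompt}")
```

With these assumed settings, the simple factual question goes to the cheap tier and the debugging request goes to the frontier tier. The threshold is the main dial: raising it shifts more traffic to the cheaper model at the risk of weaker answers on borderline tasks.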

Scott Wu, CEO of Cognition, which develops the coding agent Devin, said the benefits of routing are huge. For many routine tasks, he said, companies can be five to 10 times more cost-effective by using models that are good enough for the job.

Today, most companies don’t do any routing at all. Glean CEO Arvind Jain estimates that approximately 95% of enterprise AI usage still runs on the most expensive frontier models, even for tasks that cheaper alternatives could easily handle. Wu gave the example of asking a model to name the third US president. Every model, no matter how expensive, will know the answer is Thomas Jefferson.

Photo: Arvind Jain, CEO of Glean, on the SaaS Monster stage during day one of Web Summit 2022 at Altice Arena in Lisbon, Portugal, on November 2, 2022. (Harry Murphy | Sportsfile | Getty Images)

The pressure behind this change is a cost curve that has taken even the biggest technology companies by surprise. Jeetu Patel, chief product officer at Cisco, laid out the calculations. At approximately $200 in token usage per employee per week, the bill comes to approximately $10,000 per employee per year. A company with 90,000 employees would be looking at roughly $900 million a year. A token is a unit of data that a model processes to generate output, and usage is charged according to the number of tokens processed.
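The arithmetic behind those figures can be checked in a few lines. This is a minimal sketch that assumes a 52-week year; the variable names are illustrative, and the $10,000 figure is the rounded value quoted in the article rather than the exact product.

```python
WEEKS_PER_YEAR = 52  # assumption: usage continues every week of the year

weekly_spend_per_employee = 200  # dollars of token usage, as cited
employees = 90_000               # headcount used in the example

exact_per_employee = weekly_spend_per_employee * WEEKS_PER_YEAR
rounded_per_employee = 10_000    # the approximate figure quoted

exact_total = exact_per_employee * employees
rounded_total = rounded_per_employee * employees

print(f"Per employee per year: ${exact_per_employee:,} (rounded: ${rounded_per_employee:,})")
print(f"Company-wide per year: ${exact_total:,} (rounded: ${rounded_total:,})")
# Per employee per year: $10,400 (rounded: $10,000)
# Company-wide per year: $936,000,000 (rounded: $900,000,000)
```

The unrounded total is slightly higher than the quoted $900 million, which comes from multiplying the rounded per-employee figure by headcount.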

Patel said the adjustment was necessary because Cisco is well over its budget and now has 30,000 engineers developing products written primarily with AI. Cisco has reallocated resources and prioritized tokens over other spending.

Vendors under pressure

AI companies are aware of the concerns.

Cognition has announced what it calls the AI Productivity Guarantee. If the engineering value provided by Devin is less than the price paid by the customer, Cognition will fund up to $10 million in usage until the value reaches par. Wu framed this as a way to cut through the noisy metrics that plague the industry and focus on one: return on investment.
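One way to read the guarantee is as a capped make-up credit. The sketch below is an interpretation, not Cognition's published terms: the function name and the idea of expressing value and price as single dollar amounts are assumptions, and only the $10 million cap comes from the announcement as described here.

```python
GUARANTEE_CAP = 10_000_000  # dollars, the cap described in the announcement


def usage_credit(price_paid: float, value_delivered: float) -> float:
    """Hypothetical credit needed to close the gap between price and value, capped."""
    shortfall = max(price_paid - value_delivered, 0.0)
    return min(shortfall, GUARANTEE_CAP)


if __name__ == "__main__":
    print(usage_credit(2_000_000, 1_500_000))    # value falls short by $500,000
    print(usage_credit(1_000_000, 3_000_000))    # value exceeds price, no credit
    print(usage_credit(50_000_000, 10_000_000))  # shortfall exceeds the cap
```

Under this reading, a customer who paid $2 million and received $1.5 million in estimated value would be owed $500,000 in usage, while a shortfall larger than the cap would be limited to $10 million.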

Wu said that rather than measuring activity, such as tokens consumed or lines of code written, Cognition estimates the number of human engineering hours its agents actually save and backs that estimate with a refund. He said a company can spend billions of tokens and accomplish nothing with them. Companies should aim to produce results, not activity.

If companies start steering easy, high-volume work to cheaper open-source models, including those from places like China, OpenAI and Anthropic will no longer get paid for every task. They will get only the more complex ones. Both companies have built their businesses, and the IPO expectations around them, on the premise of huge demand at premium prices.

Patel doesn’t think that will sink the frontier labs, and says their cutting-edge technology will continue to be valuable. But he sees the pricing model changing. Rather than simply charging more, labs will need to make their models more efficient to use, which Patel predicts will become a concerted industry effort.

The question had been whether companies would keep spending as AI-related costs skyrocketed. Now, many appear to be finding ways to spend more wisely. Pricing power is shifting from the companies selling premium AI to those buying it.

The frontier labs will still charge a premium for the most difficult tasks. But how much of the market does everything else account for? The answer could go a long way toward determining the valuations of the leading AI companies.
