Zayn Asghar, an adjunct professor at Stanford University and successful founder, has raised $80 million in Series A for a startup that cleverly solves the bottleneck problem of AI inference. The round was led by Menlo Ventures.
His startup, Gimlet Labs, has built what it describes as the first "multi-silicon inference cloud": software that lets AI workloads run simultaneously across different types of hardware, splitting the work of AI apps across traditional CPUs, AI-tuned GPUs, and high-memory systems.
"We basically run on whatever hardware is available," Asghar told TechCrunch.
A single agent may chain multiple steps, each of which “requires different hardware: inference is compute-dependent, decoding is memory-dependent, and tool invocation is network-dependent,” Menlo lead investor Tim Tully wrote in a blog post about the funding.
There isn't a single chip that can do it all yet, but as new hardware rolls out and aging GPUs are redeployed, "a multi-silicon fleet is ready. We're just missing the software layer to make it work." That software layer is what Tully believes Gimlet Labs will deliver.
If current trends in compute build-out continue, McKinsey estimates that data center spending will reach nearly $7 trillion by 2030. Yet Asghar said hardware that is already deployed is actually in use only "between 15 and 30 percent" of the time.
"There's another way to think about this: You're wasting hundreds of billions of dollars because you're just leaving resources idle," he said. "Our goal was essentially to figure out how to make AI workloads 10x more efficient than they have ever been before."
So he and co-founders Michelle Nguyen, Omid Azizi, and Natalie Serrino set out to build orchestration software that could split agent workloads and distribute them across all types of hardware simultaneously.
Gimlet Labs claims that it can reliably speed up AI inference by 3x to 10x for the same cost and power. Gimlet says the underlying model can also be sliced to run across different architectures, using the best chip for each part of the model.
The company already has partnerships with chipmakers including Nvidia, AMD, Intel, Arm, Cerebras, and d-Matrix.
Gimlet's products, delivered as software or through APIs to its proprietary Gimlet Cloud, are not aimed at general AI app developers but at the largest AI model labs and data centers.
The company emerged from stealth in October and says it has generated eight-figure revenue (at least $10 million) since its inception. Asghar said its customer base has more than doubled in the past four months and now includes major model makers and very large cloud computing companies, though he declined to name them.
The co-founders previously worked together at Pixie, a startup that developed open-source observability tools for Kubernetes. Pixie was acquired by New Relic in 2020, just two months after launching with a $9 million Series A led by Benchmark. (Pixie's technology is now part of the open-source organization that oversees Kubernetes.)
After Asghar met Tully by chance about a year ago, and after angel investments from Stanford professors, venture capitalists started calling. Soon after, a term sheet landed on Asghar's desk. When other VCs heard that Asghar was considering an offer, the round quickly maxed out because "we had quite a lot of money," he said.
Including its earlier seed round, the startup has now raised a total of $92 million, with backing from a number of angels, including Sequoia's Bill Coughran, Stanford professor Nick McKeown, former VMware CEO Raghu Raghuram, and Intel CEO Lip-Bu Tan. The company currently employs 30 people.
Other investors include Factory, which led the seed, Eclipse Ventures, Prosperity7, and Triatomic.
