
Another startup has entered production with a claim to an edge over Nvidia, the world’s most valuable company, in the increasingly competitive AI chip market.
D-Matrix, located three miles from Nvidia’s Silicon Valley headquarters, says its chips can run inference workloads 10 times faster and use one-fifth the power of the market leader’s standalone graphics processing units, as long as the workloads are small.
Its new inference chip, Corsair, takes a novel approach to memory, similar to that of Cerebras and Groq. As technology giants demand every available computing resource, it’s becoming clear that smaller companies have a big opportunity to carve out a niche.
Cerebras, founded in 2015, held a massive IPO last month, raising more than $5.5 billion, and is now valued at more than $50 billion. Nvidia acquired Groq’s assets in December for $20 billion, the AI giant’s biggest acquisition to date. Then, at GTC in March, Nvidia released a new Groq chip called the Language Processing Unit.
“This is a $1 trillion market emerging,” D-Matrix co-founder and CEO Sid Sheth said in an interview with CNBC, adding that he has no intention of selling the company. “Could the market support another public company? Absolutely.”
Founded in 2019, D-Matrix has raised approximately $500 million to date and is valued at approximately $2 billion. Microsoft invested through its M12 venture arm, which is notable given Microsoft’s own chip ambitions, including the Maia 200 chip for AI inference, a new PC processor built with Nvidia, and an in-house quantum computing chip announced last week.
Sheth hasn’t named any of Corsair’s customers yet, but said there are commitments from high-profile hyperscalers, neoclouds and frontier AI labs eager to get their hands on as much compute as possible. D-Matrix will begin shipping to these customers this month. About 90% of its customers are in the United States, Sheth said, with international customers in the Middle East and Southeast Asia.
“They very often sell this product to customers to use in conjunction with Nvidia,” said Stacy Rasgon, a semiconductor analyst at Bernstein Research, adding that different chips are better at different tasks. “He seems to have quite a lot of real customer engagement.”
D-Matrix’s Corsair chips enable low-power, low-latency inference by tightly integrating memory and compute on a single chip.
Like Groq and Cerebras, D-Matrix relies on SRAM, a type of memory that can be manufactured in logic fabs such as those of Taiwan Semiconductor Manufacturing Company and integrated on the same chip as the compute. GPUs rely heavily on another type of memory called DRAM, which is packaged into stacks of high-bandwidth memory placed around the logic chips.
That DRAM is also in short supply from Micron, Samsung and SK Hynix.
“We haven’t encountered any issues with DRAM in our products because our products don’t really rely on DRAM to be successful,” Sheth said.
According to Rick Bahr, adjunct professor of electrical engineering at Stanford University, a major drawback of the D-Matrix approach is that SRAM cannot hold the largest AI models.
On-chip SRAM enables “incredible inference speeds” because data travels very short distances, but it cannot handle the trillions of parameters that make up the large models of leaders like OpenAI and Anthropic.
“It is impossible to incorporate this many parameters into an SRAM-based design,” Bahr said. “That’s the big challenge.”
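A rough back-of-the-envelope sketch shows the scale of the gap Bahr describes. It compares the memory needed to hold a model's weights against the figure of up to 128 gigabytes of SRAM per server that Sheth cites for Corsair. The parameter counts and bytes per parameter below are illustrative assumptions, not figures from D-Matrix, OpenAI or Anthropic, and the sketch ignores everything except the weights themselves.

```python
import math

# Per-server SRAM capacity cited by Sheth in this article (decimal gigabytes).
SRAM_PER_SERVER_GB = 128


def weights_gigabytes(num_parameters: int, bytes_per_parameter: int) -> float:
    """Memory needed to hold the model weights alone, in decimal gigabytes."""
    return num_parameters * bytes_per_parameter / 1e9


def servers_needed(num_parameters: int, bytes_per_parameter: int) -> int:
    """Minimum number of 128 GB SRAM servers needed just to store the weights."""
    size_gb = weights_gigabytes(num_parameters, bytes_per_parameter)
    return math.ceil(size_gb / SRAM_PER_SERVER_GB)


# Hypothetical model sizes chosen for illustration only.
TRILLION = 10**12
EIGHT_BILLION = 8 * 10**9

for label, params, width in [
    ("1T parameters, 16-bit weights", TRILLION, 2),
    ("1T parameters, 8-bit weights", TRILLION, 1),
    ("8B parameters, 8-bit weights", EIGHT_BILLION, 1),
]:
    print(label, weights_gigabytes(params, width), "GB ->",
          servers_needed(params, width), "server(s)")
```

Under these assumptions, a one-trillion-parameter model needs 2,000 GB at 16 bits per weight, or roughly 16 such servers for the weights alone, while a hypothetical 8-billion-parameter model at 8 bits fits in a single server with room to spare.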
Sheth said Corsair is designed for AI inference and is “optimized for interactivity and speed” rather than model size. Think chatbots, voice agents, and agent tools like Claude Code and OpenClaw.
Citing research from Gimlet Labs, D-Matrix says that when combined with Nvidia Blackwell GPUs, Corsair can run inference 10x faster, 3x cheaper, and up to 5x more energy efficient compared to standalone GPUs.
Nvidia CEO Jensen Huang said last week that the company remains a leader in low-cost inference with its cutting-edge Vera Rubin system because it’s not just about speed.
“That’s because we integrate everything, design everything from scratch, simulate the entire system, and take extreme co-design,” Huang said at Computex in Taiwan.
D-Matrix sells Corsair as a card holding four chips that plugs into a slot in a data center’s server rack, priced in the tens of thousands of dollars, Sheth said.
It’s the plug-and-play approach that distinguishes D-Matrix from Cerebras and Groq, Sheth said, and with up to 128 gigabytes of SRAM memory in a single server, he called Corsair “the densest SRAM solution on the market today.”
D-Matrix is working with Arista, Broadcom and Super Micro to build a full rack-scale system called SquadRack for deploying its chips in AI data centers.
The chip is manufactured in Taiwan on TSMC’s 6-nanometer node. D-Matrix’s next chip, Raptor, will use TSMC’s 4-nanometer node and is scheduled to be released next year, but Sheth said there may be a shortage of the chip from the Taiwanese company’s factory in Arizona.
“The grand prize will be building a computing solution for AI inference,” said Sheth.

