As AI begins to interact with the physical world, a new class of labs is building world models that can be used to control physical robots and model objects in physical space. Unlike large language models, these models lack easy data sources, and many labs struggle to assemble the necessary training sets.
Now, a startup is emerging with an unlikely data source: the video game industry.
That’s the premise of Origin Lab, which just announced an $8 million seed funding round led by Lightspeed Ventures. SV Angel, Eniac, Seven Stars, and FPV also participated, with angel funding from Twitch co-founder Kevin Lin and Cruise founder Kyle Vogt.
“AI systems being built today need to understand how the physical world works and how things move,” co-CEO and co-founder Anne-Margot Rodde told TechCrunch. “That data essentially resides within the video game.” The company’s other co-founders (pictured above) are Antoine Gargot and Colin Carrier.
Simply put, Origin Lab serves as a marketplace where labs focused on world models, such as Yann LeCun’s AMI Labs and Fei-Fei Li’s World Labs, can purchase high-quality licensed data. On the other side, video game companies can earn additional revenue from digital assets they have already created. Along the way, Origin Lab converts the video game assets into a format usable as training data. This can be as simple as running a render or as complex as automating hours of walkthrough footage.
“It became clear that the video game industry was storing incredibly valuable data, but there was basically no real way or infrastructure to connect the AI lab and the video game industry,” Rodde said. “In short, we built that bridge.”
Labs have long been interested in video game footage as a data source, but licensing and data quality issues have often gotten in the way. In December 2024, OpenAI caused a small scandal when the first version of its Sora video generation model appeared to be regurgitating footage from popular video games and streamers, likely because it was trained on Twitch streams. Amazon has been open about its interest in using Twitch footage to train models.
Origin’s successful funding is a sign of a growing market not only for training data but also for startups that serve as key suppliers to major AI labs. Faraz Fatemi, a partner at Lightspeed who led the investment in Origin, says the success of companies like Scale AI makes this opportunity impossible to ignore.
“We’ve seen how rapidly revenue for data vendors servicing major laboratories grows,” Fatemi told TechCrunch. “These businesses are very well-capitalized, and the bottleneck in all of them is data.”
