Sean Shen believes that for AI to succeed in the physical world, it needs to remember what it sees. Shen’s company, Memories.ai, uses Nvidia AI tools to build infrastructure that lets wearables and robots store and recall visual memories.
Memories.ai announced a partnership with semiconductor giant Nvidia at Monday’s GTC conference. Through the partnership, Memories.ai will continue developing its visual memory technology using Nvidia’s Cosmos-Reason 2, a reasoning vision language model, and Nvidia Metropolis, a video search and summarization application.
Shen (pictured above, left) told TechCrunch that he and his co-founder and CTO Ben Zhou (pictured above, right) got the idea for the company while building the AI system behind Meta’s Ray-Ban glasses. Working on the AI glasses led them to think about how the technology could actually be useful in the real world if users couldn’t recall the video data they were recording.
They looked around to see if anyone was already building that kind of visual memory solution for AI. When they couldn’t find one, they decided to spin out of Meta and build it themselves.
“AI already works very well in the digital world. What about in the physical world?” Shen said. “AI wearables and robots also need memory. …Ultimately, AI needs visual memory. We believe in that future.”
In general, the ability of AI systems to remember is relatively new. OpenAI updated ChatGPT to start remembering past chats in 2024 and refined the feature in 2025. Elon Musk’s xAI and Google’s Gemini have also launched their own memory tools in the past two years.
But these advances mainly focus on text-based memory, Shen said. Text-based memory is more structured and easier to index, but it is less useful for physical AI applications that interact with the world primarily through vision.
Memories.ai was founded in 2024 and has raised $16 million to date through an $8 million seed round in July 2025 and an $8 million extension. The seed round was led by Susa Ventures with participation from Seedcamp, Fusion Fund, Crane Venture Partners and others.
Shen said two things are necessary to build this visual memory layer: infrastructure to embed and index video into a data format that can be stored and recalled, and the data needed to train a model to do so.
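As a rough sketch of that idea (and not Memories.ai’s actual system), a visual memory layer can be thought of as embedding video clips into vectors, indexing them, and recalling them by similarity search. Everything below is a hypothetical illustration; the `embed_clip` mean-pooling is a stand-in for a learned video encoder:

```python
import numpy as np

# Hypothetical stand-in for a learned video encoder: maps a clip
# (a T x H x W x C array of frames) to a fixed-size unit vector.
def embed_clip(frames: np.ndarray) -> np.ndarray:
    vec = frames.mean(axis=(0, 1, 2))            # crude pooled feature per channel
    return vec / (np.linalg.norm(vec) + 1e-9)    # normalize for cosine similarity

class VisualMemory:
    """Toy memory store: index clip embeddings, recall the closest ones."""
    def __init__(self):
        self.vectors = []
        self.notes = []

    def remember(self, frames: np.ndarray, note: str) -> None:
        self.vectors.append(embed_clip(frames))
        self.notes.append(note)

    def recall(self, query_frames: np.ndarray, k: int = 3):
        q = embed_clip(query_frames)
        sims = np.stack(self.vectors) @ q        # cosine similarity to every memory
        top = np.argsort(sims)[::-1][:k]         # highest-similarity clips first
        return [(self.notes[i], float(sims[i])) for i in top]

# Store a few random "clips", then query with a new one.
memory = VisualMemory()
rng = np.random.default_rng(0)
for note in ("kitchen", "office", "street"):
    memory.remember(rng.random((8, 32, 32, 3)), note)
print(memory.recall(rng.random((8, 32, 32, 3)), k=2))
```

A production system would replace the mean-pooling with a trained encoder and the brute-force search with an approximate nearest-neighbor index, but the store-embed-recall loop is the same shape.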
The company announced its Large Visual Memory Model (LVMM) in July 2025. Shen said it can be compared to a smaller version of Gemini Embedding 2, a multimodal indexing and retrieval model released earlier this month.
For data collection, the company created LUCI, a hardware device worn by its “data collectors” to record the videos used to train its models. Shen said the company had no intention of becoming a hardware maker or selling the devices; it built LUCI in-house because it was dissatisfied with off-the-shelf video recorders, which prioritized high-resolution, battery-draining video formats.
The company has released the second generation of LVMM and has entered into a partnership with Qualcomm to run the model on Qualcomm’s processors starting later this year.
Shen said Memories.ai has already partnered with some major wearable companies, but declined to say which ones. While there is some demand now, Shen sees even bigger opportunities ahead in wearables and robotics.
“When it comes to commercialization, we are more focused on the model and infrastructure because we think that eventually there will be a market for wearables and robots, but maybe not now,” Shen said.
