Microsoft AI, the tech giant’s research lab, on Thursday announced the release of three basic AI models that can generate text, audio, and images.
The release marks Microsoft’s continued push to build its own stack of multimodal AI models and compete with rival AI labs, even as it maintains ties with OpenAI.
According to a company press release, MAI-Transcribe-1 transcribes audio to text in 25 different languages and is 2.5 times faster than Microsoft’s Azure Fast service. MAI-Voice-1 is a voice generation model. This voice model allows users to generate 60 seconds of voice per second and allows them to create custom voices. MAI-Image-2 is a video generation model.
MAI-Image-2 was originally released on March 19th in MAI Playground, a new large-scale language model testing software. All three models have now been released on Microsoft Foundry, and the transcription and audio models are also now available on MAI Playground.
These models were developed by Microsoft’s MAI Superintelligence team, an AI research team led by Microsoft AI CEO Mustafa Suleiman, founded and announced in November 2025.
“At Microsoft AI, we are building Humanist AI, with a clear vision of putting humans at the center of our AI models, optimizing how people actually communicate, and training them for practical use,” Suleyman said in a blog post. “You’ll soon be able to see our other models at Foundry and directly in Microsoft products and experiences.”
In the increasingly crowded LLM market, MAI hopes the selling point for these models will be that they are cheaper than those from Google and OpenAI, the company wrote in a blog post.
tech crunch event
San Francisco, California
|
October 13-15, 2026
MAI-Transcribe-1 starts at $0.36 per hour. MAI-Voice-1 starts at $22 per million characters, MAI-Image-2 starts at $5 per million tokens for text input, and $33 per million tokens for image output.
Despite releasing its own model, Suleiman reaffirmed Microsoft’s commitment to its partnership with OpenAI in an interview with VentureBeat. However, recent partnership renegotiations allow Microsoft to truly pursue this superintelligence research, Suleiman told The Verge.
Microsoft has invested more than $13 billion in its AI research lab and hosts its models in a variety of products through multi-year partnerships. Microsoft takes the same stance when it comes to chips. You can not only produce it yourself, but also buy it from external players.
