BWE News – USA, World, Tech, AI, Finance, Sports & Entertainment Updates

Google’s Gemini Omni turns images, audio, and text into video – and that’s just the beginning

By admin, May 20, 2026

When Google launched Gemini three years ago, the goal was to build a multimodal large language model: a single neural network that can be trained on text, images, audio, and video and can generate content in any of those formats.

Today, at its Google I/O developer conference, the company took a concrete step toward that goal with Gemini Omni, a new family of multimodal models that, according to Google CEO Sundar Pichai, “can create anything from any input.”

Omni starts with video. Users can now combine images, audio, video, and text as inputs, and instead of simply splicing them together, Omni integrates them to produce a coherent output. The result is high-quality video that reflects an understanding of physics, culture, history, and science.

Omni also lets you edit photos using plain text commands rather than complex editing software, much as Google’s Nano Banana does.

Google already has a dedicated video model, Veo, which allows users to convert text and images into videos, as well as direct and customize avatars. But Nicole Brichtova, director of product management at Google DeepMind, says today’s release is more than an update to Veo: “It’s the next step in our advancement of combining the intelligence of Gemini with the rendering capabilities of our media models.”

One example, Koray Kavukcuoglu, chief engineer at DeepMind, told reporters at Monday’s media briefing, is that when Omni was given a simple prompt like “claymation explainer of protein folding,” it rendered a stop-motion explainer video with a voiceover saying, “Proteins start as chains of amino acids. They fold into alpha-helix-like patterns and flat sections called beta sheets, forming a perfect three-dimensional structure.”

The long-term vision for Omni is broader and includes models that generate images from audio and audio from video.

“When we first announced Gemini, it was the first AI model that was natively multimodal,” Pichai said during the briefing. “We knew that by training a combination of text, code, audio, images, and video, we could gain a deeper understanding of the world. World models are moving AI from predicting text to simulating reality. Gemini Omni is the next step in that direction.”

As part of the release, users will also be able to create videos using their own digital avatars, something OpenAI popularized with Cameos in its now-defunct Sora app. According to Brichtova, to prevent deepfakes, users must go through a dedicated onboarding process in which they record themselves reading out a series of numbers. The avatar is then saved for future use.

Additionally, all videos created with Omni include Google’s SynthID watermark, so users can tell if a video was generated via a Gemini product.

The first model in this family is Gemini Omni Flash, which is rolling out today to the Gemini app, YouTube Shorts, and the AI creative studio Flow. Flash can render 10-second videos; Brichtova said the cap is not a model limitation but a decision based both on the desire to reach more users and on the expectation that most users don’t yet want to create longer videos. Longer video lengths are planned for the near future.

Google seems to be marketing Omni Flash more as a consumer tool. The examples of uses for digital avatars that Brichtova and Gabe Barth-Maron, a research engineer at DeepMind, cited in a conference call with TechCrunch were all personal, from creating videos of winning awards and going to the moon to removing passersby from the background of videos taken while on vacation.

Put more simply, Barth-Maron says, “They’re like personalized memes.”

“We definitely focused on making this easy for consumers to use,” Brichtova said. “There aren’t many video models that have been able to break through the gap with consumers, so this is what we’re doing.”

There are some caveats regarding ease of use. Brichtova and Barth-Maron noted that editing prompts need to be very specific. Otherwise, Omni may over-edit or unintentionally change elements that you want to keep. This is a problem that Nano Banana users may also have encountered.

Image credit: Google

Despite the near-term focus on consumers, it’s clear that Omni will have an impact on enterprise and creative work, and Google plans to make Omni available via API in the coming weeks. The avatar generator is currently available in Shorts, where Google hopes content creators will adopt it. More broadly, end-to-end multimodal workflows could be transformative for advertisers and filmmakers.

Startup Luma AI is building something similar: an agent tool that leverages a unique “integration” model and can generate entire advertising campaigns from short summaries and product images.

“We are actually very proud of the text rendering capabilities of this model, which is very useful for things like advertising,” Brichtova says. “If you want it to be a product somewhere, or even just a slogan, it needs to be accurate. We definitely expect filmmakers and other types of creators to use this model as well.”

More professional use cases may be better served by the Omni Pro model, which should provide better performance across all Omni tasks. Google hasn’t yet said when it will release a Pro version, but Brichtova said it will happen at a time when “we feel like we’re at a point where it’s going to be even more gradual than Flash.”
