Google’s Gemini Omni Wants to ‘Create Anything’ From AI Video Prompts

May 21, 2026 - 16:15

Google has introduced Gemini Omni, a new family of multimodal AI models announced at Google I/O 2026, positioning it as a major step toward building systems that can “create anything from any input.”

The first release in the lineup is Gemini Omni Flash, which is already rolling out across Google’s consumer platforms, including the Gemini app, Google Flow, and YouTube Shorts.

While Google has previously explored AI-generated video through tools like Veo and image-generation systems such as Nano Banana, Omni takes the idea further by merging reasoning, creation, and editing into a single system.

Google CEO Sundar Pichai told a media briefing that Gemini Omni represents a paradigm shift. “With world models, AI is moving from predicting text to simulating reality. Gemini Omni is the next step in that direction,” Pichai said.

To simulate reality in this way, the model relies on an intuitive understanding of physical principles such as gravity, kinetic energy, and fluid dynamics, while drawing on Gemini’s broad knowledge of history, science, and cultural context.

Conversational video editing

Beyond creating videos from scratch, Gemini Omni introduces video editing driven entirely by natural language. Instead of working in traditional, complex editing software, users can revise a video over multiple conversational turns while the AI maintains character consistency, scene continuity, and physical plausibility.

According to Google, users can completely reimagine a scene’s action, swap out backgrounds, modify camera angles, or alter specific details without losing the thread of the original footage. However, Google warned that editing prompts currently need to be highly specific. If a prompt is too vague, the model could over-edit or unintentionally alter parts of the video the user wanted to preserve.

Digital avatars and guardrails

The rollout includes a personalized digital avatar feature that allows users to generate video likenesses of themselves that look and sound like them.

To address deepfake concerns and protect users from harm, users must complete a dedicated product onboarding process, said Google DeepMind director of product management Nicole Brichtova. The process requires individuals to record themselves speaking a series of numbers before an avatar can be created and stored.

Because Omni’s ability to seamlessly alter reality raises potential risks, Google is embedding its imperceptible SynthID digital watermark into every video generated by the model. These watermarks can be verified across the Gemini app, Google Search, and Chrome to ensure content transparency.

The company is also taking a cautious approach to audio editing. While users can build avatars using their own voice, more advanced features that change speech and audio elements are being withheld from public release as Google continues testing and safety evaluations.

Pricing and availability

Gemini Omni Flash, the first model in the family to debut, focuses on rendering 10-second video clips.

Google explained that the 10-second cap is not a technical limitation of the model but a deliberate decision to make the tool easier to deploy to a broad consumer base.

Paid subscribers: Gemini Omni Flash is rolling out immediately to Google AI Plus, Pro, and Ultra subscribers within the Gemini app and Google Flow.
Free users: The model launches at no cost on YouTube Shorts and the YouTube Create app.
Developers and enterprise: Access via APIs will open up in the coming weeks.

The next test for Gemini Omni will be whether users treat it as a creative shortcut, a video editor, or something closer to a synthetic media platform. Google is pitching the model as a step toward AI that understands the physical world, but its broader impact may depend just as much on whether its guardrails can keep pace with what users create.

The post Google’s Gemini Omni Wants to ‘Create Anything’ From AI Video Prompts appeared first on eWEEK.