AInspiro

Google Ships Gemini Omni Flash and Nano Banana 2 Lite: Multimodal AI Hits Production

🤖 This article was generated by AI. Content is for informational purposes only.

On June 30, Google Dropped Two Models at Once

Honestly, Google's been moving fast.

On June 30, they shipped Nano Banana 2 Lite and Gemini Omni Flash: one for images, one for video. Both are available through the API, not just as demo-page toys.

Nano Banana 2 Lite: Images in 4 Seconds

Two words sum up the pitch: fast and cheap.

  • Text-to-image in 4 seconds
  • $0.034 per image (at 1K resolution)
  • Reliable prompt adherence and legible in-image text rendering

This is built for high-throughput pipelines. If you're generating thousands of images a day for A/B testing or asset drafting, the bill stays small: 1,000 images at 1K resolution cost $34.
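
To make the budget math concrete, here is a minimal sketch that uses only the $0.034 per-image price quoted above. The function name and the 5,000-images-a-day volume are illustrative assumptions, not anything Google publishes.

```python
from decimal import Decimal

# Price quoted in this article for Nano Banana 2 Lite at 1K resolution.
PRICE_PER_IMAGE_USD = Decimal("0.034")


def image_batch_cost(num_images: int,
                     price_per_image: Decimal = PRICE_PER_IMAGE_USD) -> Decimal:
    """Return the total cost in USD of generating num_images images."""
    if num_images < 0:
        raise ValueError("num_images must be non-negative")
    return price_per_image * num_images


# Hypothetical workload: 5,000 draft images per day.
daily = image_batch_cost(5_000)
thirty_days = daily * 30
print(f"daily: ${daily:.2f}, 30 days: ${thirty_days:.2f}")
# prints "daily: $170.00, 30 days: $5100.00"
```

Decimal is used instead of float so the totals don't pick up binary rounding noise when you multiply across large batches.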

It's the "speed tier" of the Nano Banana family. Above it sit Nano Banana 2 (the generalist) and Nano Banana Pro (built for precision). Pick based on what you need.

Gemini Omni Flash: Conversational Video Editing

This one's more interesting. You describe the edit in plain language ("swap the background to a beach," "pan the camera left to right") and the model applies it.

Pricing is $0.10 per second of video output, same as Veo 3.1 Fast.
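
Per-second pricing is easy to turn into a per-clip number. The sketch below assumes only the $0.10 rate quoted above; the function name and the batch of 200 clips are made up for illustration.

```python
from decimal import Decimal

# Per-second output price quoted in this article for Gemini Omni Flash.
PRICE_PER_SECOND_USD = Decimal("0.10")


def video_cost(seconds_per_clip: int, num_clips: int = 1) -> Decimal:
    """Return the total cost in USD for num_clips clips of equal length."""
    if seconds_per_clip <= 0:
        raise ValueError("seconds_per_clip must be positive")
    if num_clips < 0:
        raise ValueError("num_clips must be non-negative")
    return PRICE_PER_SECOND_USD * seconds_per_clip * num_clips


print(f"one 10-second clip: ${video_cost(10):.2f}")
# prints "one 10-second clip: $1.00"
print(f"200 such clips: ${video_cost(10, 200):.2f}")
# prints "200 such clips: $200.00"
```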

It takes multimodal inputs (images, text, and video), which can be combined to keep a scene consistent. It also taps into Gemini's knowledge base: it knows enough biology to generate a plausible animal, and enough narrative logic to build a coherent scene.

Current Limitations

  • 10-second video generations only; longer durations are "coming soon"
  • Audio references and scene extension not yet supported
  • Character consistency across scene changes is still spotty
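
If you're wiring this into a pipeline, it can help to reject unsupported requests before spending money on them. Here's a rough sketch of such a pre-flight check. The VideoRequest class and its field names are hypothetical (they are not part of any Google SDK), and it treats the 10-second figure as a maximum, which is an assumption.

```python
from dataclasses import dataclass

# Assumption: the 10-second figure from the list above is an upper bound.
MAX_CLIP_SECONDS = 10


@dataclass(frozen=True)
class VideoRequest:
    """Hypothetical description of a video job; not a real API type."""
    prompt: str
    duration_seconds: float
    uses_audio_reference: bool = False
    extends_existing_scene: bool = False


def preflight(request: VideoRequest) -> list[str]:
    """Return a list of problems; an empty list means the request looks OK."""
    problems = []
    if not request.prompt.strip():
        problems.append("prompt is empty")
    if request.duration_seconds <= 0:
        problems.append("duration must be positive")
    elif request.duration_seconds > MAX_CLIP_SECONDS:
        problems.append(f"duration exceeds {MAX_CLIP_SECONDS} seconds")
    if request.uses_audio_reference:
        problems.append("audio references are not yet supported")
    if request.extends_existing_scene:
        problems.append("scene extension is not yet supported")
    return problems


print(preflight(VideoRequest("pan the camera left to right", 8)))
print(preflight(VideoRequest("beach background", 15, uses_audio_reference=True)))
```

The first call returns an empty list; the second reports both the duration and the audio reference. Character consistency can't be checked up front, so that one still needs a human look at the output.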

The Two Models Chain Together

That's the real story. The official demo flow:

Generate an image fast with Nano Banana 2 Lite, then feed that image to Omni Flash to animate it into a video clip. Image to video, one pipeline.

Google shipped a few demo apps. In one, you upload a room photo, restyle it instantly, then watch a cinematic showcase. In another, a selfie gets a landmark background and then becomes an animated clip. All of them are built on the same "generate image, then animate" pattern.
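
The "generate image, then animate" pattern is simple enough to sketch in a few lines. The article doesn't show the actual API calls, so the two steps below are passed in as plain functions and replaced with stand-in stubs. Every name here is hypothetical; in a real pipeline you would swap the stubs for calls to whichever SDK you use.

```python
from typing import Callable

# Hypothetical step signatures standing in for the real model calls.
ImageGenerator = Callable[[str], bytes]        # prompt -> image bytes
VideoAnimator = Callable[[bytes, str], bytes]  # image bytes, motion prompt -> video bytes


def image_to_video(image_prompt: str,
                   motion_prompt: str,
                   generate_image: ImageGenerator,
                   animate_image: VideoAnimator) -> bytes:
    """Run the two-step chain: make a still image, then animate it."""
    image = generate_image(image_prompt)
    if not image:
        raise RuntimeError("image step returned no data")
    return animate_image(image, motion_prompt)


# Stand-in stubs so the sketch runs without network access or credentials.
def fake_generate_image(prompt: str) -> bytes:
    return f"IMG[{prompt}]".encode()


def fake_animate_image(image: bytes, motion_prompt: str) -> bytes:
    return b"VID[" + image + b"|" + motion_prompt.encode() + b"]"


clip = image_to_video("a sunlit living room", "slow pan left to right",
                      fake_generate_image, fake_animate_image)
print(clip.decode())
# prints "VID[IMG[a sunlit living room]|slow pan left to right]"
```

Passing the two steps in as arguments keeps the chain testable and lets you swap either model tier without touching the orchestration code.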

Watermarking and Provenance

Both models ship with SynthID watermarking built in. You can verify whether content is AI-generated through the Gemini app, Gemini in Chrome, or Search. That matters for commercial use: you don't want your assets getting flagged without knowing why.


What This Actually Means

Multimodal AI just crossed from "cool demo" to "usable in production."

The old problem: these tools were too slow, too expensive, or too inconsistent for real pipelines. Nano Banana 2 Lite pushes image cost down to about three cents apiece. Omni Flash lets you edit video by talking to it. Chain the two together and workflows like ad-asset generation, e-commerce product videos, and content localization can actually run now.

Sure, the 10-second limit is still there. Character consistency isn't fully solved. But the direction is clear.