Behind the scenes of Google's state-of-the-art "nano-banana" image model

Join host Logan Kilpatrick in discussion with some of the minds behind Google's new state-of-the-art image model, Gemini 2.5 Flash. Product and research leads from the Gemini team break down the technology behind its key capabilities, including interleaved generation for complex edits and new approaches to achieving character consistency and pixel-perfect control. With Nicole Brichtova, Kaushik Shivakumar, Mostafa Dehghani and Robert Riachi.

Listen to this podcast:
Apple Podcasts → https://goo.gle/3Bm7QzQ
Spotify → https://goo.gle/3ZL3ADl

Chapters:
0:37 - New model introduction
1:21 - Demo: Image editing
3:44 - Text rendering capabilities
4:44 - Beyond human preference evals
6:44 - Text rendering as a proxy for quality
8:38 - Positive transfer between modalities
11:25 - Demo: multi-turn, context aware image generation
13:54 - Pixel-perfect editing and character consistency
15:51 - Interleaved image generation
17:59 - Specialized vs. native models
19:52 - Understanding nuanced prompts
20:59 - User feedback shaping model development
22:37 - Improvements in character consistency
24:17 - More natural looking images from team collaboration
26:41 - What’s next for image generation models

Watch more Release Notes → https://goo.gle/4njokfg
Subscribe to Google for Developers → https://goo.gle/developers

Speakers: Logan Kilpatrick, Nicole Brichtova, Kaushik Shivakumar, Mostafa Dehghani, Robert Riachi
Products Mentioned: Google AI, Gemini

Channel: Google for Developers
Generated by: anonymous
Duration: 30m
Published: Aug 27, 2025

Video Chapters

Original Output

0:00 Unveiling a revolutionary new image generation model
1:04 Experience a giant leap in image generation and editing quality
1:26 Watch as our product manager transforms into a giant banana!
4:00 Can AI truly write? Testing the model's text generation skills
6:55 Discover the secret sauce behind crafting superior AI models
13:50 Real-world magic: AI redesigns gardens, no banana costumes required
16:20 Mastering precision: Crafting super-complex prompts for stunning images
18:19 The ultimate creative partner: Building a multimodal AI for all your visions
19:53 The power of feedback: How "hot tags" continuously sculpt better AI models
22:38 The future is now: AI crafting personalized slide decks with perfect visuals

Timestamps by StampBot 🤖

Unprocessed Timestamp Content

0:00 Introduction to the team and the exciting new native image generation model
0:37 Logan introduces the brilliant minds behind Gemini's latest image features
1:04 A giant quality leap for image generation and editing capabilities
1:26 Our brave product manager transformed into a giant banana
2:44 Making our banana-suited hero nano-sized for peak cuteness
4:00 Can this model write text? Let's try "Gemini nano"
5:27 The challenges of evaluating multi-modal AI; it's quite subjective
6:55 The secret sauce: how focused feedback led to better models
9:07 Image understanding and generation are like sisters, always evolving together
11:25 Behold! Our product manager gets 5 glorious 80s mall makeovers
13:50 Real-world applications: redesigning gardens with AI, no banana costumes needed
16:20 Breaking down complex prompts for super precise and consistent image generation
18:19 One model to rule them all: building a multimodal creative partner
19:53 How user feedback and "hot tags" fuel continuous model improvement
21:05 Why this model is "smarter" than you (and sometimes even better)
22:38 The future of AI: making personalized slide decks with perfect visuals
