Behind the scenes of Google's state-of-the-art "nano-banana" image model
Join host Logan Kilpatrick in discussion with some of the minds behind Google's new state-of-the-art image model, Gemini 2.5 Flash. Product and research leads from the Gemini team break down the technology behind its key capabilities, including interleaved generation for complex edits and new approaches to achieving character consistency and pixel-perfect control. With Nicole Brichtova, Kaushik Shivakumar, Mostafa Dehghani and Robert Riachi.

Listen to this podcast:
Apple Podcasts → https://goo.gle/3Bm7QzQ
Spotify → https://goo.gle/3ZL3ADl

Chapters:
0:37 - New model introduction
1:21 - Demo: Image editing
3:44 - Text rendering capabilities
4:44 - Beyond human preference evals
6:44 - Text rendering as a proxy for quality
8:38 - Positive transfer between modalities
11:25 - Demo: multi-turn, context aware image generation
13:54 - Pixel-perfect editing and character consistency
15:51 - Interleaved image generation
17:59 - Specialized vs. native models
19:52 - Understanding nuanced prompts
20:59 - User feedback shaping model development
22:37 - Improvements in character consistency
24:17 - More natural looking images from team collaboration
26:41 - What’s next for image generation models

Watch more Release Notes → https://goo.gle/4njokfg
Subscribe to Google for Developers → https://goo.gle/developers

Speakers: Logan Kilpatrick, Nicole Brichtova, Kaushik Shivakumar, Mostafa Dehghani, Robert Riachi
Products Mentioned: Google AI, Gemini
Video Chapters
- 0:00 Unveiling a revolutionary new image generation model
- 1:04 Experience a giant leap in image generation and editing quality
- 1:26 Watch as our product manager transforms into a giant banana!
- 4:00 Can AI truly write? Testing the model's text generation skills
- 6:55 Discover the secret sauce behind crafting superior AI models
- 13:50 Real-world magic: AI redesigns gardens, no banana costumes required
- 16:20 Mastering precision: Crafting super-complex prompts for stunning images
- 18:19 The ultimate creative partner: Building a multimodal AI for all your visions
- 19:53 The power of feedback: How "hot tags" continuously sculpt better AI models
- 22:38 The future is now: AI crafting personalized slide decks with perfect visuals
Timestamps by StampBot 🤖