Introducing EmbeddingGemma: The Best-in-Class Open Model for On-Device Embeddings

Discover EmbeddingGemma, a state-of-the-art 308 million parameter text embedding model designed to power generative AI experiences directly on your hardware. Ideal for mobile-first AI, EmbeddingGemma brings powerful capabilities to your applications, enabling features like semantic search, information retrieval, and custom classification, all while running efficiently on-device.

In this video, Alice Lisak and Lucas Gonzalez from the Gemma team introduce EmbeddingGemma and explain how it works. Learn how you can run this model on less than 200MB of RAM with quantization, customize its output dimensions with Matryoshka Representation Learning (MRL), and build powerful offline AI features.

Resources:
Learn about EmbeddingGemma → https://developers.googleblog.com/en/introducing-embeddinggemma
EmbeddingGemma documentation → https://ai.google.dev/gemma/docs/embeddinggemma
Gemma Cookbook → https://github.com/google-gemini/gemma-cookbook
Quickstart RAG notebook → https://github.com/google-gemini/gemma-cookbook/blob/main/Gemma/%5BGemma_3%5DRAG_with_EmbeddingGemma.ipynb
Discover Gemma models → https://deepmind.google/models/gemma

Chapters
0:00 - Intro
0:26 - Model overview
1:18 - Model features
2:29 - RAG
2:54 - Website embedding demo
3:23 - Tools and platforms
3:41 - Conclusion

Subscribe to Google for Developers → https://goo.gle/developers

Speakers: Alice Lisak, Lucas Gonzalez
Products Mentioned: Google AI, Gemma, Generative AI
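The description mentions customizing output dimensions with MRL and using embeddings for semantic search. As a rough illustration only, the sketch below shows the general idea behind MRL-style truncation: keep the first few components of an embedding, re-normalize, and rank documents by cosine similarity. It does not load EmbeddingGemma; the vectors are made-up toy values, and the helper names (`truncate_embedding`, `cosine`, `rank`) are hypothetical. The dimension sizes the model actually supports are not stated in the description, so see the linked documentation for those.

```python
import math

def truncate_embedding(vec, dim):
    """Keep the first `dim` components and re-normalize to unit length."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in head]

def cosine(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def rank(query, docs, dim):
    """Return document keys ordered from most to least similar at `dim` dims."""
    q = truncate_embedding(query, dim)
    scored = [(cosine(q, truncate_embedding(v, dim)), k) for k, v in docs.items()]
    return [k for _, k in sorted(scored, reverse=True)]

# Toy 8-dimensional vectors standing in for real model output (assumed values).
docs = {
    "battery tips": [0.9, 0.1, 0.0, 0.2, 0.1, 0.0, 0.1, 0.0],
    "pasta recipe": [0.0, 0.8, 0.5, 0.1, 0.0, 0.2, 0.0, 0.1],
}
query = [0.8, 0.2, 0.1, 0.1, 0.0, 0.1, 0.0, 0.1]

print(rank(query, docs, 8))  # ranking with the full toy vectors
print(rank(query, docs, 4))  # ranking with only the first 4 components
```

With a model trained using MRL, the shorter vectors are meant to preserve most of the ranking quality while using less memory and storage; with arbitrary vectors, as in this toy data, that is not guaranteed.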

Channel: Google for Developers • Generated by anonymous • Duration: 4m • Published Sep 04, 2025

Video Chapters

Original Output

0:00 Discover EmbeddingGemma: A new era of mobile AI
0:25 Unveiling the power: 308 million parameters strong
0:53 Tiny yet mighty: Under 200MB of RAM with quantization
1:19 Practical magic: Semantic search & information retrieval
1:40 Top-tier performance: Best-in-class MTEB scores
2:03 Your data, your rules: On-device privacy & offline mode
2:29 Next-gen AI: Empowering RAG on mobile
2:54 See it in action: Personalized search demo
3:23 Seamless integration: Customize across platforms
3:40 Get started now: Open, fast, and ready to deploy

Timestamps by StampBot 🤖

Unprocessed Timestamp Content

0:00 Welcome to the future: Introducing EmbeddingGemma, the mobile-first AI model
0:10 Meet the team from Google DeepMind, ready to talk about embeddings
0:25 EmbeddingGemma is a powerful 308 million parameter text embedding model
0:53 Small, fast, efficient: Runs on less than 200MB of RAM with quantization
1:19 Discover key use cases like semantic search and information retrieval with ease
1:40 EmbeddingGemma achieves best-in-class MTEB scores for its model size
2:03 Enjoy on-device privacy and offline capabilities; your data stays local
2:29 Build powerful mobile-first generative AI experiences with Retrieval Augmented Generation
2:54 See EmbeddingGemma in action, personalizing search with local web pages
3:23 Customize and integrate EmbeddingGemma across popular tools and platforms
3:40 Download EmbeddingGemma now: it's open, small, fast, and efficient
