Podcast Google AI: Release Notes by Google AI Episodes

Explore every episode of the podcast Google AI: Release Notes

Dive into the complete episode list for Google AI: Release Notes. Each episode is cataloged with detailed descriptions, making it easy to find and explore specific topics. Keep track of all episodes from your favorite podcast and never miss a moment of insightful content.

	Title	Pub. Date	Duration
	Launching Gemini 2.5	28 Mar 2025	00:27:55
Tulsee Doshi, Head of Product for Gemini Models, joins host Logan Kilpatrick for an in-depth discussion on the latest Gemini 2.5 Pro experimental launch. Gemini 2.5 is a well-rounded, multimodal thinking model, designed to tackle increasingly complex problems. From enhanced reasoning to advanced coding, Gemini 2.5 can create impressive web applications and agentic code applications. Learn about the process of building Gemini 2.5 Pro experimental, the improvements made across the stack, and what’s next for Gemini 2.5. Chapters: 0:00 - Introduction 1:05 - Gemini 2.5 launch overview 3:19 - Academic evals vs. vibe checks 6:19 - The jump to 2.5 7:51 - Coordinating cross-stack improvements 11:48 - Role of pre/post-training vs. test-time compute 13:21 - Shipping Gemini 2.5 15:29 - Embedded safety process 17:28 - Multimodal reasoning with Gemini 2.5 18:55 - Benchmark deep dive 22:07 - What’s next for Gemini 24:49 - Dynamic thinking in Gemini 2.5 25:37 - The team effort behind the launch Resources: Gemini → https://goo.gle/41Yf72b Gemini 2.5 blog post → https://goo.gle/441SHiV Example of Gemini 2.5 Pro’s game design skills → https://goo.gle/43vxkq1 Demo: Gemini 2.5 Pro Experimental in Google AI Studio → https://goo.gle/4c5RbhE
	Gemini app: Canvas, Deep Research and Personalization	20 Mar 2025	00:36:53
Dave Citron, Senior Director Product Management, joins host Logan Kilpatrick for an in-depth discussion on the latest Gemini updates and demos. Learn more about Canvas for collaborative content creation, enhanced Deep Research with Thinking Models and Audio Overview, and a new personalization feature. 0:00 - Introduction 0:59 - Recent Gemini app launches 2:00 - Introducing Canvas 5:12 - Canvas in action 8:46 - More Canvas examples 12:02 - Enhanced capabilities with Thinking Models 15:12 - Deep Research in action 20:27 - The future of agentic experiences 22:12 - Deep Research and Audio Overviews 24:11 - Personalization in Gemini app 27:50 - Personalization in action 29:58 - How personalization works: user data and privacy 32:30 - The future of personalization
	Developing Google DeepMind's Thinking Models	24 Feb 2025	01:03:32
Jack Rae, Principal Scientist at Google DeepMind, joins host Logan Kilpatrick for an in-depth discussion on the development of Google’s thinking models. Learn more about practical applications of thinking models, the impact of increased 'thinking time' on model performance and the key role of long context. 01:14 - Defining Thinking Models 03:40 - Use Cases for Thinking Models 07:52 - Thinking Time Improves Answers 09:57 - Rapid Thinking Progress 20:11 - Long Context Is Key 27:41 - Tools for Thinking Models 29:44 - Incorporating Developer Feedback 35:11 - The Strawberry Counting Problem 39:15 - Thinking Model Development Timeline 42:30 - Towards a GA Thinking Model 49:24 - Thinking Models Powering AI Agents 54:14 - The Future of AI Model Evals
	Behind the Scenes of Gemini 2.0	11 Dec 2024	00:35:18
Tulsee Doshi, Gemini model product lead, joins host Logan Kilpatrick to go behind the scenes of Gemini 2.0, taking a deep dive into the model's multimodal capabilities and native tool use, and Google's approach to shipping experimental models. Watch on YouTube: https://www.youtube.com/watch?v=L7dw799vu5o Chapters: Meet Tulsee Doshi Gemini's Progress Over the Past Year Introducing Gemini 2.0 Shipping Experimental Models Gemini 2.0’s Native Tool Use Function Calling Multimodal Agents Rapid Fire Questions
	Smaller, Faster, Cheaper & The Story of Flash 8B	05 Dec 2024	00:43:20
Logan Kilpatrick sits down with Emanuel Taropa, a key figure in the development of Gemini, to delve into the cutting edge of AI. Taropa provides insights into the technical challenges and triumphs of building and deploying large language models, focusing on the recent release of the Flash 8B Gemini model. Their conversation covers everything from the intricacies of model architecture and training to the practical challenges of shipping AI models at scale, and even speculates on the future of AI.
	Deep Dive into Long Context	02 May 2025	00:59:32
Explore the synergy between long context models and Retrieval Augmented Generation (RAG) in this episode of Release Notes. Join Google DeepMind's Nikolay Savinov as he discusses the importance of large context windows, how they enable AI agents, and what's next in the field. Chapters: 0:52 Introduction & defining tokens 5:27 Context window importance 9:53 RAG vs. Long Context 14:19 Scaling beyond 2 million tokens 18:41 Long context improvements since 1.5 Pro release 23:26 Difficulty of attending to the whole context 28:37 Evaluating long context: beyond needle-in-a-haystack 33:41 Integrating long context research 34:57 Reasoning and long outputs 40:54 Tips for using long context 48:51 The future of long context: near-perfect recall and cost reduction 54:42 The role of infrastructure 56:15 Long-context and agents
	Google I/O 2025 Recap with Josh Woodward and Tulsee Doshi	22 May 2025	00:40:15
Learn more AI Studio: https://aistudio.google.com/ Gemini Canvas: https://gemini.google.com/canvas Mariner: https://labs.google.com/mariner/ Gemini Ultra: https://one.google.com/about/google-a... Jules: https://jules.google/ Gemini Diffusion: https://deepmind.google/models/gemini... Flow: https://labs.google/flow/about Notebook LM: https://notebooklm.google.com/ Stitch: https://stitch.withgoogle.com/ Chapters 0:59 - I/O Day 1 Recap 02:48 - Envisioning I/O 2030 08:11 - AI for Scientific Breakthroughs 09:20 - Veo 3 & Flow 7:35 - Gemini Live & the Future of Proactive Assistants 20:30 - Gemini in Chrome & Future Apps 22:28 - New Gemini Models: DeepThink, Diffusion & 2.5 Flash/Pro Updates 27:19 - Developer Momentum & Feedback Loop 31:50 - New Developer Products: Jules, Stitch & CodeGen in AI Studio 37:44 - Evolving Product Development Process with AI 39:23 - Closing
	Building Gemini's Coding Capabilities	16 Jun 2025	01:00:27
Connie Fan, Product Lead for Gemini's coding capabilities, and Danny Tarlow, Research Lead for Gemini's coding capabilities, join host Logan Kilpatrick for an in-depth discussion on how the team built one of the world's leading AI coding models. Learn more about the early goals that shaped Gemini's approach to code, the rise of 'vibe coding' and its impact on development, strategies for tackling large codebases with long context and agents, and the future of programming languages in the age of AI. Watch on YouTube: https://www.youtube.com/watch?v=jwbG_m-X-gE Chapters: 0:00 - Intro 1:10 - Defining Early Coding Goals 6:23 - Ingredients of a Great Coding Model 9:28 - Adapting to Developer Workflows 11:40 - The Rise of Vibe Coding 14:43 - Code as a Reasoning Tool 17:20 - Code as a Universal Solver 20:47 - Evaluating Coding Models 24:30 - Leveraging Internal Googler Feedback 26:52 - Winning Over AI Skeptics 28:04 - Performance Across Programming Languages 33:05 - The Future of Programming Languages 36:16 - Strategies for Large Codebases 41:06 - Hill Climbing New Benchmarks 42:46 - Short-Term Improvements 44:42 - Model Style and Taste 47:43 - 2.5 Pro’s Breakthrough 51:06 - Early AI Coding Experiences 56:19 - Specialist vs. Generalist Models
	Sergey Brin on the Future of AI & Gemini	16 Jun 2025	00:27:19
A conversation with Sergey Brin, co-founder of Google and computer scientist working on Gemini, in reaction to a year of progress with Gemini. Watch on YouTube: https://www.youtube.com/watch?v=o7U4DV9Fkc0 Chapters 0:20 - Initial reactions to I/O 2:00 - Focus on Gemini’s core text model 4:29 - Native audio in Gemini and Veo 3 8:34 - Insights from model training runs 10:07 - Surprises in current AI developments vs. past expectations 14:20 - Evolution of model training 16:40 - The future of reasoning and Deep Think 20:19 - Google’s startup culture and accelerating AI innovation 24:51 - Closing
	Gemini's Multimodality	02 Jul 2025	00:44:17
Ani Baddepudi, Gemini Model Behavior Product Lead, joins host Logan Kilpatrick for a deep dive into Gemini's multimodal capabilities. Their conversation explores why Gemini was built as a natively multimodal model from day one, the future of proactive AI assistants, and how we are moving towards a world where "everything is vision." Learn about the differences between video and image understanding and token representations, higher FPS video sampling, and more. Chapters: 0:00 - Intro 1:12 - Why Gemini is natively multimodal 2:23 - The technology behind multimodal models 5:15 - Video understanding with Gemini 2.5 9:25 - Deciding what to build next 13:23 - Building new product experiences with multimodal AI 17:15 - The vision for proactive assistants 24:13 - Improving video usability with variable FPS and frame tokenization 27:35 - What’s next for Gemini’s multimodal development 31:47 - Deep dive on Gemini’s document understanding capabilities 37:56 - The teamwork and collaboration behind Gemini 40:56 - What’s next with model behavior Watch on YouTube: https://www.youtube.com/watch?v=K4vXvaRV0dw
	Demis Hassabis on shipping momentum, better evals and world models	11 Aug 2025	00:31:09
Demis Hassabis, CEO of Google DeepMind, sits down with host Logan Kilpatrick. In this episode, learn about the evolution from game-playing AI to today's thinking models, how projects like Genie 3 are building world models to help AI understand reality and why new testing grounds like Kaggle’s Game Arena are needed to evaluate progress on the path to AGI. Watch on YouTube: https://www.youtube.com/watch?v=njDochQ2zHs Chapters: 00:00 - Intro 01:16 - Recent GDM momentum 02:07 - Deep Think and agent systems 04:11 - Jagged intelligence 07:02 - Genie 3 and world models 10:21 - Future applications of Genie 3 13:01 - The need for better benchmarks and Kaggle Game Arena 19:03 - Evals beyond games 21:47 - Tool use for expanding AI capabilities 24:52 - Shift from models to systems 27:38 - Roadmap for Genie 3 and the omni model 29:25 - The quadrillion token club
	Building real-time voice applications with Live API	06 Aug 2025	00:40:14
Shrestha Basu Mallick, one of the product leads for the Gemini API, joins host Logan Kilpatrick for a deep dive into the Gemini Live API, Google’s real-time, multimodal interface for developers. Learn how native audio, alongside new capabilities like proactive audio and async function calling, unlocks the unique power of audio as an interface. Watch on YouTube: https://www.youtube.com/watch?v=4xlwlU6h-wM 0:00 - Intro 1:18 - Live API Overview 3:36 - Why audio is a special modality 5:07 - Speed vs. precision in audio 6:17 - Controllable and promptable TTS 8:31 - What developers are building with the Live API 11:14 - URL context and async calling features 15:02 - Proactive audio and affective dialog 16:55 - Addressing developer feedback 21:54 - Live API roadmap 23:49 - The role of long context 24:57 - What’s next for the Live API 26:41 - State of the AI audio market 30:10 - Advice for developers getting started with the Live API 31:16 - Live API demo 38:10 - Demo wrap up and closing
	Building a frontier AI search experience	23 Jul 2025	00:43:16
Robby Stein, VP of Product for Google Search, joins host Logan Kilpatrick to explore how Search is evolving into a frontier AI product. Their conversation covers the shift from simple keywords to complex, conversational queries, the rise of agentic capabilities that can take action on your behalf, and the vision to help billions of users truly "ask anything." Learn more about the technology behind AI Overviews, AI Mode, Deep Search, and the future of multimodal interaction. Watch on YouTube: https://youtu.be/zUB5A_ezIOU Chapters 01:07 Search as a Frontier AI Product 02:38 Reaching 1.5 Billion Users 03:37 What Is AI Mode? 04:17 Understanding Query Fan-Out 05:18 Balancing Latency and performance with Gemini 2.5 Pro 06:51 How Deep Search works 09:08 Fine-tuning models for product experience 11:24 Shifting user behaviors 14:07 The rise of visual search 16:52 Speech and conversational AI in Search 18:36 Comparing Gemini and Search 20:04 Real-time tool use in Search 22:52 Evolving the Search interface 26:03 Making Search more personal 29:15 The agentic future of Search 31:15 Agents beyond booking tickets 37:11 On-the-fly software creation 38:06 Google DeepMind and Search collaboration 40:08 What's next for Search
	Sundar Pichai: Gemini 3, Vibe Coding and Google's Full Stack Strategy	26 Nov 2025	00:27:34
Logan Kilpatrick from Google DeepMind sits down with Sundar Pichai, CEO of Google and Alphabet to discuss the launch of Gemini 3, Nano Banana Pro and Google's overall AI momentum. They talk about Google’s long-term bets on infrastructure, what it’s actually like to ship SOTA models, and the rise of vibe coding. Sundar also shares his personal launch day rituals and thoughts on future moonshots like putting data centers in space. Watch on YouTube: https://www.youtube.com/watch?v=iFqDyWFuw1c Chapters: 0:00 - Intro 0:51 - Shipping Gemini 3 2:44 - Google's decade-long investment in AI 4:27 - The full stack advantage 5:43 - Scaling up compute and capacity 7:32 - Sim-shipping Gemini across products 9:35 - Nano Banana Pro 12:13 - Monitoring launch day 14:13 - Future model roadmap 16:05 - Launch day rituals 18:02 - The Blue Micro Kitchen 21:57 - Future moonshots 23:26 - The rise of vibe coding 26:50 - What’s next
	Nano Banana Pro: Hands-on with the World’s Most Powerful Image Model	26 Nov 2025	00:36:24
Introducing Nano Banana Pro, a powerful model built on Gemini 3 Pro, designed to enhance text rendering, infographics, and structured content generation. Tune in to learn about Nano Banana Pro’s advanced visual reasoning and multi-turn generation capabilities, and how this next-gen tool enables complex image edits and real-world applications. In this episode, we discuss how user feedback and continuous benchmarking drive model improvements, ensuring a superior experience for developers. Watch on YouTube: https://www.youtube.com/watch?v=hk6gwiZmSWA Chapters: 00:00 - Introducing Nano Banana Pro 02:00 - Enhanced world understanding 04:59 - Advanced text rendering 05:49 - Gemini 3 Pro's influence 09:30 - Multi-turn & infographics 14:04 - Text rendering comparison 16:26 - Multilingual text support 18:22 - Infographics for learning 24:00 - Multi-image input 26:38 - Resolution & fidelity 30:07 - Advanced editing & style 32:09 - Practical use cases 35:26 - Future outlook & thanks
	Koray Kavukcuoglu: “This Is How We Are Going to Build AGI”	25 Nov 2025	00:48:44
Join Logan Kilpatrick and Koray Kavukcuoglu, CTO of Google DeepMind and Chief AI Architect of Google, as they discuss Gemini 3 and the state of AI! Their conversation includes the reception of Gemini 3, the ongoing advancements in AI research, and the role of benchmarks in pushing new frontiers. They explore critical areas for Gemini's focus, emphasizing instruction following, tool calls, and internationalization, alongside Google's collaborative approach to AI development. Watch on YouTube: https://www.youtube.com/watch?v=fXtna7UrL44 Chapters: 0:00 - Intro 2:00 - Gemini 3 launch reception 4:16 - Continuous progress and innovation 6:47 - Key areas for Gemini improvement 11:45 - Product scaffolding for model improvement 13:56 - Chief AI architect role 17:04 - Engineering mindset and collaboration 18:37 - Future growth areas for Gemini 20:33 - From research to engineering mindset 23:22 - The rise of generative media 27:22 - Nano Banana Pro capabilities 29:31 - Towards unified model checkpoints 36:26 - Organizing for AI success 38:26 - Balancing exploration and scaling 41:40 - DeepMind's collaborative culture 45:21 - Innovating at Google 48:37 - Closing
	Google Antigravity: Hands on with our new agentic development platform	25 Nov 2025	00:44:49
Explore Antigravity, Google DeepMind’s innovative new AI developer coding product, with Varun Mohan on Release Notes. This episode dives into Antigravity as a powerful agent development platform, integrating a familiar IDE experience with browser verification and Gemini 3.0 capabilities. Discover how developers can orchestrate complex agentic workflows, leverage artifacts for task communication, and balance AI automation with human collaboration. Learn about the philosophy behind building next-gen agentic experiences, the platform's multimodal strengths, and its role in accelerating software development at scale. Watch on YouTube: https://www.youtube.com/watch?v=uzFOhkORVfk Chapters 00:00 - Introducing Google Antigravity 04:02 - Evolution of AI in coding 04:53 - Beyond writing code 06:21 - Ideal Google Antigravity user 09:48 - Evolving user personas 11:46 - Agents versus the IDE 14:46 - Human-agent collaboration 16:43 - Local versus server-side 18:50 - Self-improvement and knowledge 21:29 - Generalizing agent capabilities 24:20 - Naming Google Antigravity 27:04 - Integrating Google's AI models 27:59 - Demo: Airbnb for dogs 28:48 - Understanding artifacts 29:51 - Asynchronous user feedback 32:16 - Agent manager workflow 33:17 - Browser actuation demo 34:36 - Browser for research and testing 36:45 - Parallel agent conversations 41:04 - Agent task best practices 42:51 - Future of Google Antigravity
	Gemini 3: Launch day reactions	25 Nov 2025	00:42:16
Join us for a special episode of Release Notes as we unpack Gemini 3, Google’s latest AI model with key team members. Learn how Gemini 3 empowers developers with enhanced multimodal understanding, agentic capabilities for complex tasks, and generative interfaces that transform prompts into interactive applications. We discuss real-world use cases, the iterative development process driven by user feedback, and the strategic balance between model performance and broad accessibility across various Google platforms. Watch on YouTube: https://www.youtube.com/watch?v=mci0f2dy7G0 Chapters: 00:00 - Introducing Gemini 3 03:08 - Gemini 3 everywhere 04:13 - The product-model partnership 08:20 - Balancing speed and quality 11:40 - Gemini 3 'wow' moments 27:47 - Generative interfaces and UI 31:44 - Gemini's agentic capabilities 33:55 - Proactive AI and future 34:55 - Managing compute demand 39:32 - The Gemini 3 family 41:45 - Conclusion
	How a Moonshot Led to Google DeepMind's Veo 3	16 Oct 2025	00:48:10
Dumi Erhan, co-lead of the Veo project at Google DeepMind, joins host Logan Kilpatrick for a deep dive into the evolution of generative video models. They discuss the journey from early research in 2018 to the launch of the state-of-the-art Veo 3 model with native audio generation. Learn about the technical hurdles in evaluating and scaling video models, the challenges of long-duration video coherence, and how user feedback is shaping the future of AI-powered video creation. Chapters: 0:00 - Intro 0:47 - Veo project's beginnings 3:02 - Veo's origins in Google Brain 5:07 - Video prediction and robotics applications 7:45 - Early progress and evaluation challenges 10:30 - Physics-based evaluations and their limitations 12:18 - The launch of the original Veo model 14:06 - Scaling challenges for video models 16:02 - The leap from Veo1 to Veo2 19:40 - Veo 3’s viral audio moment 21:17 - User trends shaping Veo's roadmap 23:49 - Image-to-video vs. text-to-video complexity 26:00 - New prompting methods and user control 27:55 - Coherence in long video generation 31:03 - Genie 3 and world models 35:54 - The steerability challenge 41:59 - Capability transfer and image data's role 47:25 - Closing
	GDM’s Pushmeet Kohli on solving science's biggest challenges with AI	15 Sep 2025	00:37:28
Pushmeet Kohli, Head of Science and Strategic Initiatives at Google DeepMind, joins host Logan Kilpatrick to explore the intersection of AI and scientific discovery. Learn how the team's unique problem-solving framework led to innovations like AlphaFold and AlphaEvolve, and how new tools like AI Co-scientist aim to democratize these types of breakthroughs for everyone. Watch on YouTube: https://www.youtube.com/watch?v=o7mdsL6BHsk Chapters: 0:00 - Intro 1:04 - Recent Alpha launches 02:15 - Framework for selecting research domains 06:21 - Scientific, commercial and social impact 15:00 - Wielding AGI for breakthroughs 16:48 - Tech transfer and team collaboration 19:46 - IMO Gold Medal 21:42 - Evaluating math proofs 22:55 - From specialized models to Deep Think 24:22 - Do math skills generalize? 25:53 - Generalizing the IMO model 27:43 - Democratizing AI science tools 30:09 - AI Co-scientist 35:17 - An API for science?
	Behind the scenes of Google's state-of-the-art "nano-banana" image model	27 Aug 2025	00:30:32
Join host Logan Kilpatrick in discussion with some of the minds behind Google's new state-of-the-art image model, Gemini 2.5 Flash Image. Product and research leads from the Gemini team break down the technology behind its key capabilities, including interleaved generation for complex edits and new approaches to achieving character consistency and pixel-perfect control. With Nicole Brichtova, Kaushik Shivakumar, Mostafa Dehghani and Robert Riachi. Chapters: 0:37 - New model introduction 1:21 - Demo - Image Editing 3:44 - Text rendering capabilities 4:44 - Beyond human preference evals 6:44 - Text rendering as a proxy for quality 8:38 - Positive transfer between modalities 11:25 - Demo - Multi-turn, context aware image generation 13:54 - Pixel-perfect editing and character consistency 15:51 - Interleaved image generation 17:59 - Specialized vs. native models 19:52 - Understanding nuanced prompts 20:59 - User feedback shaping model development 22:37 - Improvements in character consistency 24:17 - More natural looking images from team collaboration 26:41 - What’s next for image generation models
