
Explore every episode of the podcast Google AI: Release Notes

Dive into the complete episode list for Google AI: Release Notes. Each episode is cataloged with detailed descriptions, making it easy to find and explore specific topics. Keep track of all episodes from your favorite podcast and never miss a moment of insightful content.


Title | Pub. Date | Duration
Launching Gemini 2.5 | 28 Mar 2025 | 00:27:55

Tulsee Doshi, Head of Product for Gemini Models, joins host Logan Kilpatrick for an in-depth discussion on the latest Gemini 2.5 Pro experimental launch. Gemini 2.5 is a well-rounded, multimodal thinking model, designed to tackle increasingly complex problems. From enhanced reasoning to advanced coding, Gemini 2.5 can create impressive web applications and agentic code applications. Learn about the process of building Gemini 2.5 Pro experimental, the improvements made across the stack, and what’s next for Gemini 2.5.

 

Chapters:

0:00 - Introduction
1:05 - Gemini 2.5 launch overview
3:19 - Academic evals vs. vibe checks
6:19 - The jump to 2.5
7:51 - Coordinating cross-stack improvements
11:48 - Role of pre/post-training vs. test-time compute
13:21 - Shipping Gemini 2.5
15:29 - Embedded safety process
17:28 - Multimodal reasoning with Gemini 2.5
18:55 - Benchmark deep dive
22:07 - What’s next for Gemini
24:49 - Dynamic thinking in Gemini 2.5
25:37 - The team effort behind the launch

 

Resources:

  • Gemini → https://goo.gle/41Yf72b
  • Gemini 2.5 blog post → https://goo.gle/441SHiV
  • Example of Gemini 2.5 Pro’s game design skills → https://goo.gle/43vxkq1
  • Demo: Gemini 2.5 Pro Experimental in Google AI Studio → https://goo.gle/4c5RbhE
Gemini app: Canvas, Deep Research and Personalization | 20 Mar 2025 | 00:36:53

Dave Citron, Senior Director Product Management, joins host Logan Kilpatrick for an in-depth discussion on the latest Gemini updates and demos. Learn more about Canvas for collaborative content creation, enhanced Deep Research with Thinking Models and Audio Overview, and a new personalization feature.

0:00 - Introduction
0:59 - Recent Gemini app launches
2:00 - Introducing Canvas
5:12 - Canvas in action
8:46 - More Canvas examples
12:02 - Enhanced capabilities with Thinking Models
15:12 - Deep Research in action
20:27 - The future of agentic experiences
22:12 - Deep Research and Audio Overviews
24:11 - Personalization in Gemini app
27:50 - Personalization in action
29:58 - How personalization works: user data and privacy
32:30 - The future of personalization

Developing Google DeepMind's Thinking Models | 24 Feb 2025 | 01:03:32

Jack Rae, Principal Scientist at Google DeepMind, joins host Logan Kilpatrick for an in-depth discussion on the development of Google’s thinking models. Learn more about practical applications of thinking models, the impact of increased 'thinking time' on model performance and the key role of long context.

01:14 - Defining Thinking Models
03:40 - Use Cases for Thinking Models
07:52 - Thinking Time Improves Answers
09:57 - Rapid Thinking Progress
20:11 - Long Context Is Key
27:41 - Tools for Thinking Models
29:44 - Incorporating Developer Feedback
35:11 - The Strawberry Counting Problem
39:15 - Thinking Model Development Timeline
42:30 - Towards a GA Thinking Model
49:24 - Thinking Models Powering AI Agents
54:14 - The Future of AI Model Evals

Behind the Scenes of Gemini 2.0 | 11 Dec 2024 | 00:35:18

Tulsee Doshi, Gemini model product lead, joins host Logan Kilpatrick to go behind the scenes of Gemini 2.0, taking a deep dive into the model's multimodal capabilities and native tool use, and Google's approach to shipping experimental models.

Watch on YouTube: https://www.youtube.com/watch?v=L7dw799vu5o

Chapters:
Meet Tulsee Doshi
Gemini's Progress Over the Past Year
Introducing Gemini 2.0
Shipping Experimental Models
Gemini 2.0’s Native Tool Use
Function Calling
Multimodal Agents
Rapid Fire Questions

Smaller, Faster, Cheaper & The Story of Flash 8B | 05 Dec 2024 | 00:43:20
Logan Kilpatrick sits down with Emanuel Taropa, a key figure in the development of Gemini, to delve into the cutting edge of AI. Taropa provides insights into the technical challenges and triumphs of building and deploying large language models, focusing on the recent release of the Flash 8B Gemini model. Their conversation covers everything from the intricacies of model architecture and training to the practical challenges of shipping AI models at scale, and even speculates on the future of AI.
Deep Dive into Long Context | 02 May 2025 | 00:59:32

Explore the synergy between long context models and Retrieval Augmented Generation (RAG) in this episode of Release Notes. Join Google DeepMind's Nikolay Savinov as he discusses the importance of large context windows, how they enable AI agents, and what's next in the field.

Chapters:
0:52 Introduction & defining tokens
5:27 Context window importance
9:53 RAG vs. Long Context
14:19 Scaling beyond 2 million tokens
18:41 Long context improvements since 1.5 Pro release
23:26 Difficulty of attending to the whole context
28:37 Evaluating long context: beyond needle-in-a-haystack
33:41 Integrating long context research
34:57 Reasoning and long outputs
40:54 Tips for using long context
48:51 The future of long context: near-perfect recall and cost reduction
54:42 The role of infrastructure
56:15 Long-context and agents

Google I/O 2025 Recap with Josh Woodward and Tulsee Doshi | 22 May 2025 | 00:40:15

Learn more

  • AI Studio: https://aistudio.google.com/
  • Gemini Canvas: https://gemini.google.com/canvas
  • Mariner: https://labs.google.com/mariner/
  • Gemini Ultra: https://one.google.com/about/google-a...
  • Jules: https://jules.google/
  • Gemini Diffusion: https://deepmind.google/models/gemini...
  • Flow: https://labs.google/flow/about
  • Notebook LM: https://notebooklm.google.com/
  • Stitch: https://stitch.withgoogle.com/

Chapters

  • 0:59 - I/O Day 1 Recap
  • 02:48 - Envisioning I/O 2030
  • 08:11 - AI for Scientific Breakthroughs
  • 09:20 - Veo 3 & Flow
  • 17:35 - Gemini Live & the Future of Proactive Assistants
  • 20:30 - Gemini in Chrome & Future Apps
  • 22:28 - New Gemini Models: DeepThink, Diffusion & 2.5 Flash/Pro Updates
  • 27:19 - Developer Momentum & Feedback Loop
  • 31:50 - New Developer Products: Jules, Stitch & CodeGen in AI Studio
  • 37:44 - Evolving Product Development Process with AI
  • 39:23 - Closing

Building Gemini's Coding Capabilities | 16 Jun 2025 | 01:00:27

Connie Fan, Product Lead for Gemini's coding capabilities, and Danny Tarlow, Research Lead for Gemini's coding capabilities, join host Logan Kilpatrick for an in-depth discussion on how the team built one of the world's leading AI coding models. Learn more about the early goals that shaped Gemini's approach to code, the rise of 'vibe coding' and its impact on development, strategies for tackling large codebases with long context and agents, and the future of programming languages in the age of AI.

Watch on YouTube: https://www.youtube.com/watch?v=jwbG_m-X-gE

Chapters:

0:00 - Intro
1:10 - Defining Early Coding Goals
6:23 - Ingredients of a Great Coding Model
9:28 - Adapting to Developer Workflows
11:40 - The Rise of Vibe Coding
14:43 - Code as a Reasoning Tool
17:20 - Code as a Universal Solver
20:47 - Evaluating Coding Models
24:30 - Leveraging Internal Googler Feedback
26:52 - Winning Over AI Skeptics
28:04 - Performance Across Programming Languages
33:05 - The Future of Programming Languages
36:16 - Strategies for Large Codebases
41:06 - Hill Climbing New Benchmarks
42:46 - Short-Term Improvements
44:42 - Model Style and Taste
47:43 - 2.5 Pro’s Breakthrough
51:06 - Early AI Coding Experiences
56:19 - Specialist vs. Generalist Models

 

Sergey Brin on the Future of AI & Gemini | 16 Jun 2025 | 00:27:19

A conversation with Sergey Brin, co-founder of Google and computer scientist working on Gemini, in reaction to a year of progress with Gemini.

Watch on YouTube: https://www.youtube.com/watch?v=o7U4DV9Fkc0

Chapters

0:20 - Initial reactions to I/O
2:00 - Focus on Gemini’s core text model
4:29 - Native audio in Gemini and Veo 3
8:34 - Insights from model training runs
10:07 - Surprises in current AI developments vs. past expectations
14:20 - Evolution of model training
16:40 - The future of reasoning and Deep Think
20:19 - Google’s startup culture and accelerating AI innovation
24:51 - Closing

 

Gemini's Multimodality | 02 Jul 2025 | 00:44:17

Ani Baddepudi, Gemini Model Behavior Product Lead, joins host Logan Kilpatrick for a deep dive into Gemini's multimodal capabilities. Their conversation explores why Gemini was built as a natively multimodal model from day one, the future of proactive AI assistants, and how we are moving towards a world where "everything is vision." Learn about the differences between video and image understanding and token representations, higher FPS video sampling, and more.

 

Chapters:

0:00 - Intro
1:12 - Why Gemini is natively multimodal
2:23 - The technology behind multimodal models
5:15 - Video understanding with Gemini 2.5
9:25 - Deciding what to build next
13:23 - Building new product experiences with multimodal AI
17:15 - The vision for proactive assistants
24:13 - Improving video usability with variable FPS and frame tokenization
27:35 - What’s next for Gemini’s multimodal development
31:47 - Deep dive on Gemini’s document understanding capabilities
37:56 - The teamwork and collaboration behind Gemini
40:56 - What’s next with model behavior


Watch on YouTube: https://www.youtube.com/watch?v=K4vXvaRV0dw

Demis Hassabis on shipping momentum, better evals and world models | 11 Aug 2025 | 00:31:09

Demis Hassabis, CEO of Google DeepMind, sits down with host Logan Kilpatrick. In this episode, learn about the evolution from game-playing AI to today's thinking models, how projects like Genie 3 are building world models to help AI understand reality and why new testing grounds like Kaggle’s Game Arena are needed to evaluate progress on the path to AGI.

Watch on YouTube: https://www.youtube.com/watch?v=njDochQ2zHs

Chapters:
00:00 - Intro
01:16 - Recent GDM momentum
02:07 - Deep Think and agent systems
04:11 - Jagged intelligence
07:02 - Genie 3 and world models
10:21 - Future applications of Genie 3
13:01 - The need for better benchmarks and Kaggle Game Arena
19:03 - Evals beyond games
21:47 - Tool use for expanding AI capabilities
24:52 - Shift from models to systems
27:38 - Roadmap for Genie 3 and the omni model
29:25 - The quadrillion token club

 

Building real-time voice applications with Live API | 06 Aug 2025 | 00:40:14

Shrestha Basu Mallick, one of the product leads for the Gemini API, joins host Logan Kilpatrick for a deep dive into the Gemini Live API, Google’s real-time, multimodal interface for developers. Learn how native audio, alongside new capabilities like proactive audio and async function calling, unlocks the unique power of audio as an interface.

Watch on YouTube: https://www.youtube.com/watch?v=4xlwlU6h-wM

0:00 - Intro
1:18 - Live API Overview
3:36 - Why audio is a special modality
5:07 - Speed vs. precision in audio
6:17 - Controllable and promptable TTS
8:31 - What developers are building with the Live API
11:14 - URL context and async calling features
15:02 - Proactive audio and affective dialog
16:55 - Addressing developer feedback
21:54 - Live API roadmap
23:49 - The role of long context
24:57 - What’s next for the Live API
26:41 - State of the AI audio market
30:10 - Advice for developers getting started with the Live API
31:16 - Live API demo
38:10 - Demo wrap up and closing

 

Building a frontier AI search experience | 23 Jul 2025 | 00:43:16

Robby Stein, VP of Product for Google Search, joins host Logan Kilpatrick to explore how Search is evolving into a frontier AI product. Their conversation covers the shift from simple keywords to complex, conversational queries, the rise of agentic capabilities that can take action on your behalf, and the vision to help billions of users truly "ask anything." Learn more about the technology behind AI Overviews, AI Mode, Deep Search, and the future of multimodal interaction.

Watch on YouTube: https://youtu.be/zUB5A_ezIOU

Chapters
01:07 Search as a Frontier AI Product
02:38 Reaching 1.5 Billion Users
03:37 What Is AI Mode?
04:17 Understanding Query Fan-Out
05:18 Balancing Latency and performance with Gemini 2.5 Pro
06:51 How Deep Search works
09:08 Fine-tuning models for product experience
11:24 Shifting user behaviors
14:07 The rise of visual search
16:52 Speech and conversational AI in Search
18:36 Comparing Gemini and Search
20:04 Real-time tool use in Search
22:52 Evolving the Search interface
26:03 Making Search more personal
29:15 The agentic future of Search
31:15 Agents beyond booking tickets
37:11 On-the-fly software creation
38:06 Google DeepMind and Search collaboration
40:08 What's next for Search


 

Sundar Pichai: Gemini 3, Vibe Coding and Google's Full Stack Strategy | 26 Nov 2025 | 00:27:34

Logan Kilpatrick from Google DeepMind sits down with Sundar Pichai, CEO of Google and Alphabet, to discuss the launch of Gemini 3, Nano Banana Pro and Google's overall AI momentum. They talk about Google’s long-term bets on infrastructure, what it’s actually like to ship SOTA models, and the rise of vibe coding. Sundar also shares his personal launch day rituals and thoughts on future moonshots like putting data centers in space.

Watch on YouTube: https://www.youtube.com/watch?v=iFqDyWFuw1c

Chapters:
0:00 - Intro
0:51 - Shipping Gemini 3
2:44 - Google's decade-long investment in AI
4:27 - The full stack advantage
5:43 - Scaling up compute and capacity
7:32 - Sim-shipping Gemini across products
9:35 - Nano Banana Pro
12:13 - Monitoring launch day
14:13 - Future model roadmap
16:05 - Launch day rituals
18:02 - The Blue Micro Kitchen
21:57 - Future moonshots
23:26 - The rise of vibe coding
26:50 - What’s next

Nano Banana Pro: Hands-on with the World’s Most Powerful Image Model | 26 Nov 2025 | 00:36:24

Introducing Nano Banana Pro, a powerful model built on Gemini 3 Pro, designed to enhance text rendering, infographics, and structured content generation. Tune in to learn about Nano Banana Pro’s advanced visual reasoning and multi-turn generation capabilities, and how this next-gen tool enables complex image edits and real-world applications. In this episode, we discuss how user feedback and continuous benchmarking drive model improvements, ensuring a superior experience for developers.

Watch on YouTube: https://www.youtube.com/watch?v=hk6gwiZmSWA

Chapters:
00:00 - Introducing Nano Banana Pro
02:00 - Enhanced world understanding
04:59 - Advanced text rendering
05:49 - Gemini 3 Pro's influence
09:30 - Multi-turn & infographics
14:04 - Text rendering comparison
16:26 - Multilingual text support
18:22 - Infographics for learning
24:00 - Multi-image input
26:38 - Resolution & fidelity
30:07 - Advanced editing & style
32:09 - Practical use cases
35:26 - Future outlook & thanks

Koray Kavukcuoglu: “This Is How We Are Going to Build AGI” | 25 Nov 2025 | 00:48:44

Join Logan Kilpatrick and Koray Kavukcuoglu, CTO of Google DeepMind and Chief AI Architect of Google, as they discuss Gemini 3 and the state of AI!

Their conversation includes the reception of Gemini 3, the ongoing advancements in AI research, and the role of benchmarks in pushing new frontiers. They explore critical areas for Gemini's focus, emphasizing instruction following, tool calls, and internationalization, alongside Google's collaborative approach to AI development.

Watch on YouTube: https://www.youtube.com/watch?v=fXtna7UrL44

Chapters:
0:00 - Intro
2:00 - Gemini 3 launch reception
4:16 - Continuous progress and innovation
6:47 - Key areas for Gemini improvement
11:45 - Product scaffolding for model improvement
13:56 - Chief AI architect role
17:04 - Engineering mindset and collaboration
18:37 - Future growth areas for Gemini
20:33 - From research to engineering mindset
23:22 - The rise of generative media
27:22 - Nano Banana Pro capabilities
29:31 - Towards unified model checkpoints
36:26 - Organizing for AI success
38:26 - Balancing exploration and scaling
41:40 - DeepMind's collaborative culture
45:21 - Innovating at Google
48:37 - Closing

Google Antigravity: Hands on with our new agentic development platform | 25 Nov 2025 | 00:44:49

Explore Antigravity, Google DeepMind’s innovative new AI developer coding product, with Varun Mohan on Release Notes. This episode dives into Antigravity as a powerful agent development platform, integrating a familiar IDE experience with browser verification and Gemini 3.0 capabilities. Discover how developers can orchestrate complex agentic workflows, leverage artifacts for task communication, and balance AI automation with human collaboration. Learn about the philosophy behind building next-gen agentic experiences, the platform's multimodal strengths, and its role in accelerating software development at scale.

Watch on YouTube: https://www.youtube.com/watch?v=uzFOhkORVfk

Chapters
00:00 - Introducing Google Antigravity
04:02 - Evolution of AI in coding
04:53 - Beyond writing code
06:21 - Ideal Google Antigravity user
09:48 - Evolving user personas
11:46 - Agents versus the IDE
14:46 - Human-agent collaboration
16:43 - Local versus server-side
18:50 - Self-improvement and knowledge
21:29 - Generalizing agent capabilities
24:20 - Naming Google Antigravity
27:04 - Integrating Google's AI models
27:59 - Demo: Airbnb for dogs
28:48 - Understanding artifacts
29:51 - Asynchronous user feedback
32:16 - Agent manager workflow
33:17 - Browser actuation demo
34:36 - Browser for research and testing
36:45 - Parallel agent conversations
41:04 - Agent task best practices
42:51 - Future of Google Antigravity

 

Gemini 3: Launch day reactions | 25 Nov 2025 | 00:42:16

Join us for a special episode of Release Notes as we unpack Gemini 3, Google’s latest AI model, with key team members. Learn how Gemini 3 empowers developers with enhanced multimodal understanding, agentic capabilities for complex tasks, and generative interfaces that transform prompts into interactive applications. We discuss real-world use cases, the iterative development process driven by user feedback, and the strategic balance between model performance and broad accessibility across various Google platforms.

Watch on YouTube: https://www.youtube.com/watch?v=mci0f2dy7G0

Chapters:
00:00 - Introducing Gemini 3
03:08 - Gemini 3 everywhere
04:13 - The product-model partnership
08:20 - Balancing speed and quality
11:40 - Gemini 3 'wow' moments
27:47 - Generative interfaces and UI
31:44 - Gemini's agentic capabilities
33:55 - Proactive AI and future
34:55 - Managing compute demand
39:32 - The Gemini 3 family
41:45 - Conclusion

How a Moonshot Led to Google DeepMind's Veo 3 | 16 Oct 2025 | 00:48:10

Dumi Erhan, co-lead of the Veo project at Google DeepMind, joins host Logan Kilpatrick for a deep dive into the evolution of generative video models. They discuss the journey from early research in 2018 to the launch of the state-of-the-art Veo 3 model with native audio generation. Learn about the technical hurdles in evaluating and scaling video models, the challenges of long-duration video coherence and how user feedback is shaping the future of AI-powered video creation.

Chapters:
0:00 - Intro
0:47 - Veo project's beginnings
3:02 - Veo's origins in Google Brain
5:07 - Video prediction and robotics applications
7:45 - Early progress and evaluation challenges
10:30 - Physics-based evaluations and their limitations
12:18 - The launch of the original Veo model
14:06 - Scaling challenges for video models
16:02 - The leap from Veo 1 to Veo 2
19:40 - Veo 3’s viral audio moment
21:17 - User trends shaping Veo's roadmap
23:49 - Image-to-video vs. text-to-video complexity
26:00 - New prompting methods and user control
27:55 - Coherence in long video generation
31:03 - Genie 3 and world models
35:54 - The steerability challenge
41:59 - Capability transfer and image data's role
47:25 - Closing

 

GDM’s Pushmeet Kohli on solving science's biggest challenges with AI | 15 Sep 2025 | 00:37:28

Pushmeet Kohli, Head of Science and Strategic Initiatives at Google DeepMind, joins host Logan Kilpatrick to explore the intersection of AI and scientific discovery. Learn how the team's unique problem-solving framework led to innovations like AlphaFold and AlphaEvolve, and how new tools like AI Co-scientist aim to democratize these types of breakthroughs for everyone. 

Watch on YouTube: https://www.youtube.com/watch?v=o7mdsL6BHsk

Chapters: 
0:00 - Intro
1:04 - Recent Alpha launches
02:15 - Framework for selecting research domains
06:21 - Scientific, commercial and social impact
15:00 - Wielding AGI for breakthroughs
16:48 - Tech transfer and team collaboration
19:46 - IMO Gold Medal
21:42 - Evaluating math proofs
22:55 - From specialized models to Deep Think
24:22 - Do math skills generalize?
25:53 - Generalizing the IMO model
27:43 - Democratizing AI science tools
30:09 - AI Co-scientist
35:17 - An API for science?

Behind the scenes of Google's state-of-the-art "nano-banana" image model | 27 Aug 2025 | 00:30:32

Join host Logan Kilpatrick in discussion with some of the minds behind Google's new state-of-the-art image model, Gemini 2.5 Flash Image. Product and research leads from the Gemini team break down the technology behind its key capabilities, including interleaved generation for complex edits and new approaches to achieving character consistency and pixel-perfect control. With Nicole Brichtova, Kaushik Shivakumar, Mostafa Dehghani and Robert Riachi.

Watch on YouTube: 

Chapters:
0:37 - New model introduction
1:21 - Demo - Image Editing
3:44 - Text rendering capabilities
4:44 - Beyond human preference evals
6:44 - Text rendering as a proxy for quality
8:38 - Positive transfer between modalities
11:25 - Demo - Multi-turn, context aware image generation
13:54 - Pixel-perfect editing and character consistency
15:51 - Interleaved image generation
17:59 - Specialized vs. native models
19:52 - Understanding nuanced prompts
20:59 - User feedback shaping model development
22:37 - Improvements in character consistency
24:17 - More natural looking images from team collaboration
26:41 - What’s next for image generation models

© My Podcast Data