Podcast Platform Engineering Podcast by Cory O'Daniel, CEO of Massdriver Episodes

Explore every episode of the podcast Platform Engineering Podcast


	Title	Pub. Date	Duration
	What CVEs Did for Security, CREs Are Doing for Reliability	02 Jul 2025	00:47:36
Did you know that software engineers often "learn things the hard way" because they lack a standardized system to share knowledge about reliability issues? While security professionals have CVEs to catalog vulnerabilities, reliability engineers have been left to reinvent the wheel with each new bug or outage. Tony Meehan, co-founder and CTO of Prequel, introduces us to Common Reliability Enumerations (CREs) - an open-source approach that's doing for reliability what CVEs did for security. After spending a decade at the NSA hunting vulnerabilities, Tony recognized that the same community-driven approach could revolutionize how we handle reliability issues. This conversation covers: How CREs help developers detect and mitigate reliability issues before they cause outages The open-source tools Preq and CRE that allow teams to leverage community knowledge Practical ways to implement these tools in your development workflow (locally, in CI/CD, and production) How this approach can reduce cloud costs by identifying issues rather than over-provisioning Tips for debugging mysterious production issues when no CRE exists yet Guest: Tony Meehan, CTO at Prequel Tony is an engineering leader obsessed with bugs. He dedicated a decade to vulnerability and exploit development at the National Security Agency (NSA) before leading Engineering at Endgame and Elastic. In 2023, Tony co-founded Prequel to change the way application failure is detected and resolved. Tony Meehan, X prequel.dev github.com/prequel-dev Prequel, X Links to interesting things from this episode: Blog post about the partial outage at Endgame Common Reliability Enumeration (CRE) Preq XKCD: Standards Episode on security with Danny Allan from Snyk Brendan Gregg's blog
	From DevOps to 'Vibe Coding': Gene Kim on AI-Assisted Development and Platform Engineering	28 May 2025	00:56:32
What if you could turn a five-year software project into a one-month endeavor? Gene Kim, co-founder of IT Revolution and author of The Phoenix Project, reveals how AI-powered Vibe Coding is transforming the way developers work. Kim shares insights from his upcoming book about how developers are achieving unprecedented productivity, including how his co-author produces 12,000 lines of production-ready code daily using AI assistance. But it's not just about speed - learn how this approach enables developers to tackle previously impossible projects and explore larger design spaces. From DevOps evolution to practical AI implementation, Kim discusses: What Vibe Coding really means and how it differs from traditional development Real examples of AI accelerating development without sacrificing quality Common pitfalls to avoid when implementing AI in your development workflow How AI is making developers more ambitious rather than replacing them The critical role of testing and feedback loops in successful AI implementation Whether you're a seasoned developer or a tech leader wondering about AI's place in your development workflow, this conversation provides practical insights into the future of software development. Guest: Gene Kim, Author, Researcher, Vibe Coder, DevOps Enthusiast, Founder of IT Revolution Gene Kim has been studying high-performing technology organizations since 1999. He was the founder and CTO of Tripwire, Inc., an enterprise security software company, where he served for 13 years. His books have sold over 1 million copies—he is the WSJ bestselling author of Wiring the Winning Organization, The Unicorn Project, and co-author of The Phoenix Project, The DevOps Handbook, and the Shingo Publication Award-winning Accelerate. Since 2014, he has been the organizer of DevOps Enterprise Summit (now Enterprise Technology Leadership Summit), studying the technology transformations of large, complex organizations. 
Gene Kim, X IT Revolution The Phoenix Project The Unicorn Project Vibe Coding Links to interesting things from this episode: "DevOps is Bullshit" Accelerate “Decoding the DNA of the Toyota Production System” Wiring the Winning Organization “Organizational Learning And Competitiveness: Revisiting The “Allspaw/Hammond 10 Deploys Per Day At Flickr” Story” The DevOps Handbook Wyvern Sourcegraph Amp Pacific Rim
	Cloud Migration Strategies with Alex Voorhees of 66 Degrees	11 Dec 2024	00:52:52
Navigating cloud migrations and building modern platforms is challenging in the best of circumstances. Alex Voorhees, VP of Cloud Engineering at 66 Degrees, shares valuable lessons from helping organizations as they take on these challenges. Don’t miss his insights on: How to tackle the human and organizational challenges that come with cloud transformation Practical strategies for upskilling teams transitioning from traditional ops to cloud operations Key considerations when implementing platform engineering solutions across different organizational maturity levels Integrating AI capabilities into cloud architecture Common pitfalls to avoid when moving legacy applications to the cloud Approaches for balancing innovation with practical business needs during cloud migration Whether you're leading a cloud migration, building a platform team, or interested in the future of cloud operations, this episode offers concrete takeaways for navigating the technical and organizational challenges of modern infrastructure. Guest: Alex Voorhees, VP of Cloud Engineering at 66 Degrees Alex Voorhees is the Vice President of Cloud Engineering at 66degrees. Prior to joining 66degrees, he was the Vice President of Customer Engineering at Cloudbakers. Alex Voorhees - X 66 Degrees - Website 66 Degrees - X Links to interesting things from this episode: Backstage Spacelift Jenkins Anthropic Interview with Amey Patil from Google Lattice
	Democratizing Kubernetes: The Kubefirst Journey with John Dietz	28 Nov 2024	01:05:20
John Dietz, CEO and co-founder of Konstruct (formerly Kubefirst), joins us fresh from KubeCon North America to discuss the evolution of cloud-native platform adoption. John shares insights into Konstruct's mission to make Kubernetes and cloud-native technology more accessible, reducing the typical 18-month adoption timeline to minutes. The conversation explores Konstruct's two main products: Kubefirst, an open-source GitOps platform, and Colony, their new solution for bare metal and data center deployments. John discusses the company's philosophy on open-source licensing, the importance of building trust in platform engineering, and their unique approach to commercialization while maintaining core platform accessibility. Don’t miss our new segment: TrashOps! Guest: John Dietz, CEO and Co-founder of Konstruct John Dietz is the friendly CEO and relentless technical cofounder driving Konstruct (formerly Kubefirst). John and his cofounder, Jared Edwards, embarked on the Kubefirst venture with humble beginnings, building the product off-hours while navigating a year-long stealth enterprise pilot before open sourcing the project onto the CNCF landscape. Kubefirst was acquired by Kubeshop in 2022, and then again by the Civo Cloud in 2024. John’s entrepreneurial spirit sparked at a young age. At 22 he bootstrapped his own software and services company, a role he dedicated 19 years to before founding Kubefirst. During that time he doubled as an engineering lead in the DevOps space in numerous successful startup and enterprise environments, gaining expertise in high scale software delivery, site reliability, platform architecture, and cloud engineering. His pivotal role in transitioning USA Today to the public cloud atop a hybrid cloud abstraction layer showcased his ability as a software platform architect for an early enterprise public cloud adopter. 
John has been at the forefront of Kubernetes adoption since 2017 and has assisted hundreds of companies in their transition to cloud native architectures with the Kubefirst open source platform. John Dietz - X Konstruct - Website Konstruct - GitHub Links to interesting things from this episode: Civo Cloud groundcover Infisical Flatcar OpenTofu Tinkerbell
	Executing Well in Healthcare with Jessica Kalinowski	13 Nov 2024	00:51:10
Jessica Kalinowski, VP of DevOps and Corporate IT at Connect RN, shares her journey from corporate IT to implementing DevOps and platform engineering in a startup environment. Jessica discusses the challenges and successes of applying tech strategies in healthcare. The episode covers strategies for platform adoption, including early engineer engagement and flexible implementation. Jessica discusses how automation has enabled efficient management with a small team, benefiting the entire organization. Guest: Jessica Kalinowski, Vice President of DevOps and Corporate IT at connectRN Vice President of DevOps and Corp IT with 15+ years’ experience in IT as well as a proven ability to promote and champion best-in-class cloud migrations. Expertise in leading and managing transformational/ cultural change comprehensively throughout multinational organizations. Strength areas include strategic thinking and IT/ cloud thought leadership, adaptability and resiliency, complex problem-solving, and optimal solution identification through collaboration and partnership. De-centralized leadership style with emphasis on building strong, self-sufficient teams. connectRN Links to interesting things from this episode: Office Space Foundations of the Cloud With Mark Burgess, CFEngine Foundations of The Cloud With Adam Jacob, Chef Foundations of The Cloud With Mitchell Hashimoto, Terraform Foundations of The Cloud With Brian Grant, Kubernetes
	Solomon Hykes on Docker, Dagger, and the Future of DevOps	30 Oct 2024	00:55:34
In this episode, Solomon Hykes discusses the journey from Docker's inception to its widespread adoption, the challenges faced in open-source development, and his current work with Dagger. He explains how Dagger aims to revolutionize continuous integration by making pipelines more modular and efficient, addressing the "push and pray" problem in software development. Hykes also shares insights on the evolution of DevOps, the complexities of open-source business models, and his vision for the future of software development workflows. Guest: Solomon Hykes, Co-founder of Docker and Dagger Solomon Hykes is the co-founder and CEO of Dagger.io, the first programmable CI/CD engine. Before that, he was the co-founder of Docker, where he served for 10 years as CEO then CTO, and a founding member of the CNCF Technical Oversight Committee. Solomon grew up in France, and now lives in San Francisco. Solomon Hykes - Twitter Docker Dagger.io Dagger - Discord Dagger - YouTube “The Future of Linux Containers” Links to interesting things from this episode: Firecracker Cloudflare Fastly WebAssembly Heroku Vercel Netlify Ansible Bazel Nix Jenkins Haskell Pre-commit Kubesimplify Dagger Workshop with Solomon Hykes: Hands-On Tutorial & Community Collaboration
	Security and Scalability with Justin Berman from Thirty Madison	16 Oct 2024	00:59:52
In this episode of the Platform Engineering Podcast, Cory O'Daniel sits down with Justin Berman, Vice President of Platform Engineering and Chief Information Security Officer at Thirty Madison. Justin shares his journey from software engineering to security leadership, discusses the challenges of building secure and scalable platforms, and offers insights into the future of platform engineering and security integration. Guest: Justin Berman, VP of Platform Engineering and Chief Information Security Officer at Thirty Madison Justin Berman is the VP of Platform Engineering and Chief Information Security Officer at Thirty Madison. Previously, he was Head of Security at Dropbox, responsible for Dropbox’s information/cyber security, content safety, and platform abuse prevention capabilities, which provide unmatched protections to their users, staff and products. In this role, Justin and his team ensured that Dropbox enables storing, sharing and collaborating on various kinds of content in a trustworthy and secure manner, for all their customers worldwide, from large enterprise Dropbox business accounts to individual consumers. Prior to Dropbox, Justin was the CISO at Zenefits, where he was responsible for scaling the security and IT capabilities and developing the privacy and risk/compliance capabilities. Thirty Madison Links to interesting things from this episode: Datadog Splunk Foundations of The Cloud With Mitchell Hashimoto, Terraform Foundations of The Cloud With Brian Grant, Kubernetes Foundations of The Cloud With Adam Jacob, Chef From Netflix to the Cloud: Adrian Cockcroft on DevOps, Microservices, and Sustainability
	Engineering Culture Change with Stack Overflow’s Peter O’Connor	02 Oct 2024	00:54:22
Join us as we learn about Stack Overflow's monumental shift from on-premises infrastructure to the cloud with Peter O'Connor, Director of Platform Engineering. Peter shares invaluable insights on navigating the complexities of migrating a beloved developer platform, balancing technical challenges with team dynamics, and fostering a culture of innovation. From the intricacies of lift-and-shift strategies to the nuances of adopting microservices, this episode offers a masterclass in modern platform engineering. Discover how Stack Overflow is revolutionizing its architecture while maintaining its core mission of serving developers worldwide. Guest: Peter O'Connor, Director of Platform Engineering at Stack Overflow Peter leads teams of engineers dedicated to innovating how they elevate their product engineering teams, enabling them to deploy services to the cloud with greater ease and confidence. Their focus spans the cloud platform, UI platform, and search platform. With a rich background in innovating new workflows, leveraging cutting-edge tools, and leading teams to achieve better outcomes, Peter has found creating platforms as products for their engineering teams to be a rewarding challenge in his career. Before transitioning into tech, he freelanced and taught Chemistry and Computer Science at both secondary and collegiate levels. This experience deepened his understanding of guiding teams, building the right products, and fostering continual improvement among team members. Peter O’Connor - Stack Overflow Stack Overflow Links to interesting things from this episode: Hacker News "Microservices — architecture nihilism in minimalism's clothes" Svelte Islands OpenCost Score Elasticsearch "Navigating AI And Platform Engineering With Amey Patil"
	Platform Engineering for Social Good with Code for America’s Grace Huntley	18 Sep 2024	00:44:18
Want your work to mean more? In this inspiring episode of the Platform Engineering Podcast, we welcome Grace Huntley, Director of Engineering for DevOps, InfoSec, and Safety Net Innovation Lab at Code for America. Grace shares her unconventional journey from licensed plumber to civic tech leader, driven by personal experiences and a passion for solving societal challenges through technology. Discover how Code for America is revolutionizing government services, making them more accessible and efficient for those in need. Grace discusses the transition from private sector to civic tech, the challenges of working with government systems, and the exciting potential of AI in improving public services. This episode offers valuable insights into the intersection of technology, social impact, and the future of civic engagement. Guest: Grace Huntley, Director of Software Engineering at Code for America Grace Huntley brings over 15 years of leadership experience, showcasing a versatile skill set that spans various levels of management and encompasses projects across the full technology stack and diverse industries. Her journey into the world of technology began professionally at the age of 13 when she devoted after-school hours to assembling XT’s and 286’s in a local computer store, sparking a lifelong passion for computing. Grace’s hands-on experience extends beyond the professional realm. Over the last 15 years, she has navigated the intricate challenges associated with accessing safety-net programs while caring for her daughter, who faces significant medical challenges. This personal experience has deepened her understanding of the importance of technology in addressing real-world issues and shaped her commitment to making a positive impact through her work. Outside the realm of software leadership, Grace is an active contributor to non-profit initiatives. 
Currently serving on the board of Openhouse SF, an organization dedicated to providing affordable housing and services for the aging LGBT community, she continues to advocate for inclusivity and community support. Beyond her professional and philanthropic endeavors, Grace finds joy in her role as an avid gardener and sustainability enthusiast. Her commitment to both personal and professional growth, combined with a genuine passion for making a difference, defines Grace Huntley’s impactful journey in the realms of technology, community service, and environmental consciousness. Code for America - Website Code for America - X Code for America - Instagram Code for America - LinkedIn Links to interesting things from this episode: Safety Net Information Supplemental Nutritional Assistance Program (SNAP) Get CalFresh Clear My Record Direct File NIST The Audacious Project Blue Meridian Partners
	Bridging Tech and Human Factors in Platform Engineering with Morteza Irdmousa	04 Sep 2024	00:46:42
Technical expertise is undoubtedly important when it comes to implementing Platform Engineering, but unless you are also paying close attention to the human factor and the cultural aspects of building an effective team, it will be much harder to succeed. Cory talks to Morteza Irdmousa, Head of Platform Engineering at Curative, about his experiences in Platform Engineering and DevOps. Join them as they explore the need for a holistic approach that considers both technical and cultural aspects of implementing Platform teams. Gain valuable insights and actionable takeaways as they discuss key issues such as team integration, psychological safety, team engagement, and messaging that truly conveys the business impact of platform engineering work. Guest: Morteza Irdmousa, Head of Platform Engineering at Curative Currently Head of Platform Engineering at Curative in healthcare, and formerly Director of Engineering at Raytheon focused on platforms in the defense space, Morteza Irdmousa considers himself fortunate to have learned a lot along the way and to still be learning. MLOps, data platforms, and traditional platform team duties are his current focus areas. Curative Links to interesting things from this episode: Kessel Run Platform One DX Radical Candor: Be a Kick-Ass Boss Without Losing Your Humanity
	From Netflix to the Cloud: Adrian Cockcroft on DevOps, Microservices, and Sustainability	22 Aug 2024	01:01:58
In this episode Cory sits down with Adrian Cockcroft, a pioneering technologist who played a crucial role in Netflix's transition to cloud computing and microservices architecture. Adrian shares insights from his impressive career, including his work at Netflix, AWS, and beyond. He discusses the evolution of DevOps practices, the rise of microservices, and the challenges of platform engineering in today's complex cloud environments. Adrian also delves into the pressing issue of sustainability in tech, offering valuable perspectives on the environmental impact of AI and machine learning workloads. Whether you're a seasoned DevOps professional or just starting your journey in cloud computing, this episode offers a wealth of knowledge from one of the industry's most influential figures. Guest: Adrian Cockcroft, Tech Advisor Adrian Cockcroft is a technologist and strategist with broad experience from the bits to the boardroom, in both enterprise and consumer-oriented businesses, from startups to some of the largest companies in the world, equally at home with hardware and software, development and operations. He’s best known as the cloud architect for Netflix during their trailblazing migration to AWS and was a very early practitioner and advocate of DevOps, microservices, and chaos engineering, helping bring these concepts to the wider audience they have today. Adrian spent the last few years as a VP at Amazon deeply immersed in the dual challenges of helping Amazon itself – one of the largest companies in the world – become more sustainable, and via AWS – one of the largest technology suppliers in the world – helping its enterprise and public sector customers become more sustainable. 
Adrian Cockcroft - Mastodon Adrian Cockcroft - Medium Adrian Cockcroft - GitHub Adrian Cockcroft - Soundcloud Links to interesting things from this episode: Cloud for CEOs: Measure innovation with one metric Sun Performance and Tuning: Java and the Internet ACM Queue, “A Conversation with Werner Vogels” The Value Flywheel Effect: Power the Future and Accelerate Your Organization to the Modern Cloud That Will Never Work: The Birth of Netflix and the Amazing Life of an Idea Medium, “So many bad takes — What is there to learn from the Prime Video microservices to monolith story” Team Topologies: Organizing Business and Technology Teams for Fast Flow “How do Committees Invent?” by Melvin Conway Lambda Labs Medium, “Platform Engineering Teams Done Right…” AWS Well-Architected Framework - Sustainability Green Software Foundation Green Software Foundation, Real-Time Cloud Environmental Variables podcast
	Foundations of The Cloud With Brian Grant, Kubernetes	07 Aug 2024	00:56:39
It’s episode four of the Platform Engineering Podcast’s special series on the Foundations of The Cloud! This time Cory O'Daniel sits down with Brian Grant, the original lead architect of Kubernetes, to dive deep into the history and evolution of Kubernetes in cloud operations. Brian shares his journey from working in supercomputing to joining Google and helping develop Kubernetes. He also provides insights on the importance of Kubernetes' declarative model, managing complexity in cloud native environments, and the extensive impact and future potential of Kubernetes. Tune in to learn more about the intricate details of platform engineering and the revolutionary developments in cloud native infrastructure. Guest: Brian Grant, Original lead architect of Kubernetes and CTO/Co-founder of ConfigHub Brian Grant is a foundational architect and influential voice within the Kubernetes project and the broader cloud-native landscape. Brian Grant - X Brian Grant - GitHub Brian Grant - Medium Links to interesting things from this episode: Kubernetes Documentation Declarative Application Management in Kubernetes Kubernetes the Hard Way by Kelsey Hightower Kubernetes Milestones and Tasks Checklist
	Snyk’s Danny Allan on Making Security Developer-Friendly	30 Apr 2025	00:45:26
Security often feels like a roadblock to developers, but what if it could be seamlessly integrated into the development process? As software delivery becomes increasingly automated and self-service, the traditional approach to security needs a major overhaul. Danny Allan, CTO at Snyk, shares practical insights on transforming security from a bottleneck into an enabler of developer productivity. Drawing from his extensive experience at IBM, VMware, and Veeam, Allan discusses how security teams can shift left effectively without creating friction. Key topics covered: Building successful security champions programs that cultivate curiosity rather than relying solely on senior developers Practical approaches to embedding security controls into development pipelines, from IDE integration to PR checks Strategies for measuring security team success beyond just vulnerability counts The role of pre-hardened containers and infrastructure-as-code scanning in platform security How AI is transforming both code generation and security tooling, including Snyk's approach to vulnerability detection Guest: Danny Allan, Chief Technology Officer at Snyk As CTO, Danny leads end-to-end ownership of Snyk’s current core offerings and roadmap, as well as the company’s near-term platform vision. Before joining Snyk, he was CTO at Veeam and Desktone (acquired by VMware) and Director of Security Research at IBM. In his free time, he loves scuba diving, cycling, and hockey (like a true Canadian!) Snyk, website Snyk, X Snyk, YouTube Snyk, GitHub Snyk, Discord The Secure Developer Podcast by Snyk Links to interesting things from this episode: DistroList Chainguard Verizon Data Breach Investigation Report Hack This Site Model Context Protocol
	Foundations of The Cloud With Mitchell Hashimoto, Terraform	24 Jul 2024	00:53:36
In this third episode of the Platform Engineering Podcast’s special series on the Foundations of The Cloud, host Cory O'Daniel interviews Mitchell Hashimoto, co-founder of HashiCorp and creator of Terraform, Vault, and Nomad. They discuss the intricacies of platform engineering, the history and evolution of Terraform, the advent of infrastructure as code, and the challenges accompanying it. Mitchell also shares insights on his new project, Ghostty, a high-performance terminal emulator, and delves into how generative AI will transform operations engineering. Listen to this episode to get valuable lessons for both industry veterans and newcomers. Guest: Mitchell Hashimoto, Founder of HashiCorp, Creator of Vagrant, Packer, Consul, Terraform, Vault, Nomad, Waypoint, and more. Mitchell Hashimoto is a passionate engineer, professional speaker, and entrepreneur. Mitchell has been creating and contributing to open-source software for almost a decade. He has spoken at dozens of conferences about his work, such as VelocityConf, OSCON, FOSDEM, and more. Mitchell co-founded HashiCorp, where he held multiple roles including CEO and CTO. He left the company in 2023, but he was part of the initial engineering team behind most of their products, such as Vagrant, Packer, Consul, Terraform, Vault, Nomad, Waypoint, and more. Prior to HashiCorp, Mitchell spent five years as a web developer and another four as an operations engineer. Mitchell is also an FAA licensed private pilot with an instrument rating (PPL ASEL IR). He currently flies a Cirrus SF50 Vision Jet. Aviation overlaps with his passion for programming and he routinely works on aviation software for fun. Mitchell Hashimoto - X Mitchell Hashimoto - Personal Website Mitchell Hashimoto - GitHub Ghostty Links to interesting things from this episode: HashiCorp Purpose of Terraform State Systems Resurgent?, speech by Amod Malviya, Systems Distributed Conference 2024
	Foundations of The Cloud With Adam Jacob, Chef	10 Jul 2024	01:00:56
In this second episode of the Platform Engineering Podcast’s special series on the Foundations of The Cloud, Cory O’Daniel meets up with Adam Jacob, co-founder of Chef and System Initiative. They discuss his early interest in infrastructure and automation, the development and impact of Chef in the DevOps community, and his transition to becoming a CEO. The conversation emphasizes the community and technological advancements Chef brought to the industry and introduces the ambitious goals of his current project, System Initiative. Don't miss this insightful conversation. Tune in now to discover the future of infrastructure and automation! Guest: Adam Jacob, CEO / Chairman / Co-Founder of System Initiative Adam is an engineering and product innovator, with decades of experience designing, building, and managing large production systems. Adam previously co-founded Chef Software, was the original author of Chef, served as CTO, and was on the board of directors. Adam Jacob - Twitter Adam Jacob - GitHub Adam Jacob - Medium System Initiative
	Foundations of the Cloud with Mark Burgess, CFEngine	26 Jun 2024	00:57:41
In this special episode of the Platform Engineering Podcast on Foundations of The Cloud, Cory O'Daniel chats with Mark Burgess, the creator of CFEngine and a pioneer in configuration management. They dive into Mark's journey from physics to computer science, the birth of CFEngine, and the dynamic world of system administration and DevOps. They also explore the evolution of IT operations, the challenges of managing modern infrastructures, and the future promises of AI and autonomous systems in tech. Join us for an insightful discussion filled with historical anecdotes, practical advice, and visionary ideas for the future of platform engineering. Love the show? Subscribe, rate, review, & share! http://platformengineeringpod.com/ Guest: Mark Burgess, Principal Founder of CFEngine Mark Burgess is a theoretician and practitioner in the area of information systems, whose work has focused largely on distributed information infrastructure. He wrote an early popular book on C programming, which is now open and available through the Free Software Foundation. He was an early contributor to the Free Software Foundation in 1993 with CFEngine, which remains GPL. He is known particularly for his work on Configuration Management and Promise Theory. He was the principal Founder of CFEngine, co-founder at Aljabr, and is now the founder of ChiTek-i. He is Emeritus Professor of Network and System Administration from Oslo University College. He is the author of numerous books, articles, and papers on topics from Physics, Networks and Systems, to fiction. He also writes a blog on issues of science and IT industry concerns. Today, he works as an advisor on science and technology matters all over the world. Mark Burgess - Website CFEngine Links to interesting things from this episode: Computer Immunology Promise Theory Smart Spacetime
	Navigating AI And Platform Engineering With Amey Patil	12 Jun 2024	00:55:03
In this episode of the Platform Engineering Podcast, Cory O’Daniel speaks with Amey Patil, Head of Platform Engineering at Google Ads and Google Analytics. They discuss the evolving landscape of platform engineering, the integration of AI and ML, and the human factors essential for leading successful platform teams. From managing legacy code migrations to fostering a culture of innovation, Amey shares insights and strategies that are vital for both seasoned professionals and newcomers to the field. Tune in to gain valuable knowledge and stay ahead in the dynamic world of platform engineering. Love the show? Subscribe, rate, review, & share! https://platformengineeringpod.com/ Guest: Amey Patil, Head of Platform Engineering at Google Ads and Google Analytics Amey is a seasoned technology leader with a proven track record at Google and VMware. He excels in driving innovation and spearheading high-performing engineering teams. Currently serving as the Head of Platform Engineering for Google Ads and Google Analytics, he oversees infrastructure supporting multiple high-revenue products, leading a global team.
	The Role Of Startups In Moving Platform Engineering Forward With Colton Dempsey Of Next 47	08 May 2024	00:58:11
Startups always bring refreshing perspectives and brand-new ideas to any space, and the platform engineering industry is no different. In this episode, Cory O'Daniel sits down with Colton Dempsey of Next47 to discuss the important role of startup ventures in the progress of platform engineering. He explains the five categories of startups, why many of them are building more diverse feature sets, and the most effective monetization approaches for open-source projects. Colton also delves into the rise of cloud-native technologies and Kubernetes, detailing how they influenced his investment strategies. Guest: Colton Dempsey, Partner at Next47 Colton joined Next47 in 2018 and is a Partner on the Palo Alto investment team. He helped lead the firm’s investments in Noname Security, Armorblox, DataGrail, Brilliant, Software.com, and Pando, among others. His primary areas of focus are IT infrastructure, cybersecurity, robotics, and connected devices. Colton Dempsey, X Next47 Links to interesting things from this episode: Logz.io Platform Engineering On Kubernetes Cypress.io
	Unlocking the Potential of Hybrid Cloud with Cory O'Daniel	10 Apr 2024	00:10:04
In this episode Chris Hill asks Cory O’Daniel, CEO and co-founder of Massdriver, to explain the concept and benefits of hybrid cloud, including the use of multiple cloud providers and container services for scalability and flexibility. The discussion also addresses the security and networking challenges that using hybrid cloud implies. Guest: Cory O'Daniel, CEO at Massdriver
	Infrastructure As Code: Business Continuity And Disaster Recovery With Cory O'Daniel	27 Mar 2024	00:10:22
Chris Hill sits down with Cory O'Daniel to talk about how Infrastructure as Code can help with disaster recovery and business continuity. From the technology and personnel challenges to scenarios such as losing one of your regions and the importance of backup plans, learn how IaC can be used to help ensure data and operations are not affected. Love the show? Subscribe, rate, review, & share! Guest: Cory O'Daniel, CEO at Massdriver
	DevOps & Platform Engineering - What's The Difference? With Dave Williams	13 Mar 2024	00:08:11
In this week's episode, Chris Hill gets Dave Williams to go deeper into how platform engineering is moving one step beyond DevOps. Listen in as they discuss the skill sets, principles, and mindset needed to enable this evolution in the field. Love the show? Subscribe, rate, review, & share! Guest: Dave Williams, CTO at Massdriver
	Is DevOps Bullshit? An Interview With Cory O’Daniel, Author Of The “DevOps Is Bullshit” Blog Post	28 Feb 2024	00:17:26
DevOps was conceived as a way to bring developers and operations together, but has it really worked? In this episode, Dave Williams sits down with Cory O’Daniel to discuss his controversial blog post “DevOps Is Bullshit.” Find out why Cory sees platform engineering as a solution. Discover its potential as an antidote to the unmet promises and inherent challenges within the current DevOps landscape. Tune in now! Love the show? Subscribe, rate, review, & share! Guest: Cory O'Daniel, CEO at Massdriver Links to interesting things from this episode: Massdriver Microsoft For Startups Article: DevOps is Bullshit
	What Is Platform Engineering? A Cloud Operation Engineer’s Perspective	14 Feb 2024	00:15:50
For teams ingrained in DevOps practices, Platform Engineering ushers in a broader horizon, a fresh perspective on managing infrastructure. But what does Platform Engineering offer, especially for those adept in cloud operations? In this episode, Cory O’Daniel talks to Chris Hill about platform engineering from the perspective of a cloud operations engineer. From the importance of security and compliance and the challenges faced by developers to the impact of Massdriver to deliver infrastructure management to engineers, Chris shares insights from his decade-long experience. Tune in for an exploration of platform engineering's evolution! Love the show? Subscribe, rate, review, & share! Guest: Chris Hill, COO at Massdriver
	vCluster with Lukas Gentele: Rethinking Kubernetes Multi-Tenancy	16 Apr 2025	00:40:56
Are your platform teams constantly saying "no" to requests for new Kubernetes clusters? The traditional approach to Kubernetes multi-tenancy forces organizations to choose between cluster sprawl or restrictive namespaces - neither of which fully meets the needs of modern development teams. Lukas Gentele, CEO and co-founder of Loft Labs, shares how vCluster is transforming the way organizations handle multi-tenancy in Kubernetes. By running virtual Kubernetes control planes inside namespaces, vCluster enables teams to experiment with different versions, operators, and configurations while maintaining efficient resource usage. Key topics covered: How vCluster solves the limitations of namespace-based multi-tenancy Running multiple Kubernetes versions in the same cluster for testing and gradual upgrades Managing bare metal GPU resources efficiently for AI/ML workloads Balancing standardization with developer autonomy in platform engineering Using virtual clusters for cost-effective testing across multiple Kubernetes versions Whether you're a platform engineer looking to say "yes" more often or a development team seeking greater autonomy within Kubernetes, this discussion offers practical insights into modern multi-tenancy approaches. Guest: Lukas Gentele, CEO & Co-Founder at LoftLabs Lukas Gentele is the CEO and Co-founder of Loft Labs, which delivers Kubernetes-native tools, functionality and frameworks purpose-built for platform engineers to manage, activate and optimize their platform stack. Gentele is a dynamic leader with wide-ranging expertise in enterprise architecture, distributed systems, and developer productivity solutions. Prior to Loft, Gentele served as the co-founder and CEO at covexo GmbH and Webmans. Gentele often speaks at conferences such as KubeCon, writes articles for leading industry journals, and likes to share his experiences at meetups. 
Gentele holds a Bachelor of Science in Computer Science and Information Systems, and a Master of Science in Computer Science & Management of Enterprise Information Systems, both from the University of Mannheim. Lukas Gentele, LinkedIn Loft Labs vCluster Slack channel Links to interesting things from this episode: "Kubernetes the Hard Way" by Kelsey Hightower DevSpace “Inception”
	What Is Platform Engineering? From A Developer's Perspective	07 Feb 2024	00:14:15
For teams ingrained in DevOps practices, this term ushers in a broader horizon, a fresh perspective on managing infrastructure and applications. But what does Platform Engineering offer? In this episode, Cory talks to Dave Williams about platform engineering and why he sees this as the future of running applications in the cloud. Guest: Dave Williams, CTO at Massdriver
	Building Real-World Platforms: Abby Bangser on CNCF, Kratix, & Syntasso	02 Apr 2025	00:53:54
When organizations grow beyond using third-party platforms, they face a critical challenge: how to build internal platforms that enable teams to work efficiently while maintaining security and compliance. Abby Bangser, founding principal engineer at Syntasso, shares insights on creating real-world platforms that strike the right balance between standardization and flexibility. Key Insights The shift from external platforms to internal ones often comes from specific business needs, like compliance requirements Successful platform engineering requires finding the right balance between prescriptive standards and flexible customization Platforms should offer multiple levels of abstraction - from simplified "paved paths" to advanced customization options Platform teams should watch how users interact with their services to identify emerging patterns and needs Guest: Abby Bangser, Founding Principal Engineer at Syntasso. A hands-on software delivery professional with a passion for using quality as the foundation for quick value-focused delivery. Abby truly embodies the benefits of being a specialising generalist with experience and interests across traditionally Business Analyst, Quality Analyst, Developer, DevOps, Platform Engineering, SRE, and Infrastructure Engineering job titles and effectively leveraging those skills to encourage a well-rounded approach to quality software delivery. The next step for software professionals is the ability to drive thoughtful creation and evaluation of data from the operation of live services. With this in mind, Abby is currently excited to be working in an environment where engineers build and run their own software and understand the value of not only monitoring and logging, but striving for an observable system. 
It is now her goal to use these tools to create even more collaboration across skills like user research, quality, operations, infrastructure and software development to identify unknown issues that our users face in innovative ways. Abby Bangser, Bluesky Syntasso CNCF Links to interesting things from this episode: ThoughtWorks Massdriver Kratix OpenTofu “Let a 1,000 flowers bloom. Then rip 999 of them out by the roots.” Charity Majors
	Smart TV Testing Made Simple with Dave Lucia of TV Labs	19 Mar 2025	00:47:50
Testing smart TV applications presents unique challenges that traditional web testing approaches can't solve. Dave Lucia, CTO and co-founder of TV Labs, shares how his team built a platform that virtualizes televisions and set-top boxes to help media companies test their smart TV apps on physical devices. Learn about TV Labs' innovative architecture and how they handle everything from camera-based testing systems to their custom Lua-based DSL for faster test execution. A key highlight is how choosing Elixir as their primary technology has enabled TV Labs to build a robust orchestration system. The language's built-in capabilities for fault tolerance, process isolation, and distributed computing make it particularly well-suited for managing concurrent connections and real-time state across multiple devices. The discussion also explores practical insights about system architecture, including how TV Labs leverages Phoenix Presence for real-time device state tracking and achieves microsecond-level performance for message broadcasting. Guest: Dave Lucia, CTO & Co-Founder at TV Labs Dave is a technology leader with deep experience designing and scaling systems across industries including media, sports betting, finance, and developer tooling. He is a prominent member of the BEAM community, regularly speaking at conferences such as Code BEAM SF, ElixirConf, The Big Elixir, and RabbitMQ Summit. Dave Lucia, Website Dave Lucia, X Dave Lucia, Bluesky TV Labs TV Labs, LinkedIn Links to interesting things from this episode: Appium “The Road to 2 Million Websocket Connections in Phoenix” “From $erverless to Elixir” eBPF
	Trust, Lock-in, And Better Infrastructure Management	26 Feb 2025	01:03:05
Why do 70% of organizations still struggle to adopt infrastructure as code? Sören Martius, CPO and co-founder of Terramate, joins Cory O'Daniel to tackle the challenges of modern infrastructure management and the delicate balance between vendor trust and lock-in. The conversation explores practical solutions for common infrastructure challenges, from managing monolithic state files to orchestrating complex deployments. Martius shares insights on: When to maintain a monolithic state file versus breaking it into smaller units How infrastructure needs evolve as engineering teams grow beyond 100 people Why anti-lock-in features build trust with operations teams The role of AI in detecting and remediating infrastructure misconfigurations For teams wrestling with infrastructure complexity or evaluating new tools, this discussion offers practical perspectives on building scalable, maintainable infrastructure while avoiding common pitfalls around vendor lock-in and team adoption. Guest: Sören Martius, Founder at Terramate Sören is an entrepreneur and technologist who loves building and delivering digital products and managing and scaling engineering teams for various kinds of businesses. His interests in technologies lie with DLTs, Distributed Networks, Machine Learning, Microservices, Serverless Compute, Docker (and Kubernetes), AWS, Spark, Scala, Go, Elixir & OTP, Python, Rust, and TypeScript among many others. Sören likes simplicity, pragmatism and common sense while bridging business, product and technology. Sören Martius, X Terramate, website Terramate, GitHub Links to interesting things from this episode: Terragrunt Wiz Stakpak Reclaim AI Fyxer Cursor Windsurf
	Meeting Developers In Their Existing Workflows: The Terrateam Advantage	05 Feb 2025	00:57:02
Building infrastructure tooling doesn't require massive VC funding or a huge team - just ask Malcolm Matalka, co-founder of bootstrapped Terrateam. Malcolm shares his journey from real estate websites to investment banking to biotech, before landing in infrastructure automation. Learn how Terrateam takes a unique "libraries over frameworks" approach to development, prioritizing simplicity and control by carefully selecting dependencies and building critical components in-house. Malcolm explains how this philosophy leads to more maintainable code and better security outcomes. As an early participant in the OpenTofu fork, Malcolm provides insights into the community response and adoption challenges. He discusses how Terrateam helps teams streamline their infrastructure workflows by integrating directly with existing tools and processes rather than forcing new ones. For platform engineers looking to simplify their infrastructure management, Malcolm describes the ideal Terrateam user as someone who wants infrastructure changes to flow naturally through their existing development process without added complexity. Guest: Malcolm Matalka, Software Engineer, Co-Founder of Terrateam As a co-founder at Terrateam, Malcolm enables teams to deliver infrastructure faster with their tools and services. They leverage Terraform and OpenTofu to automate, manage, and scale cloud infrastructure for developers and organizations. With over 20 years of experience in software engineering, he has a strong background in cloud computing, distributed systems, and infrastructure. Malcolm is also passionate about aerospace and bioinformatics, which led him to found Cosmo Labs AB and work as a software consultant at Abiogenesis Computer Systems Lab. At Cosmo Labs AB, they provided software solutions for satellite communication, orbital mechanics, and data analysis. 
At Abiogenesis Computer Systems Lab, they tackled bioinformatics challenges such as genome sequencing, protein structure prediction, and drug discovery. Prior to that, he worked for Spotify managing storage solutions at scale. Malcolm Matalka - Reddit Terrateam Terrateam - LinkedIn Links to interesting things from this episode: Erlang Riak Mnesia OCaml Puppet Tokio
	Beyond GitOps: Rethinking Cloud Self-Service with Dave Williams	22 Jan 2025	01:16:17
Is GitOps holding your team back? In this thought-provoking conversation with Massdriver co-founder Dave Williams, we challenge conventional wisdom around cloud infrastructure management and explore why traditional approaches to compliance and self-service may be creating more problems than they solve. Discover how leading organizations are moving beyond ceremonial approval processes to create truly automated, self-service platforms that enhance developer productivity while maintaining security and control. Learn why treating infrastructure as code differently from application code could be the key to unlocking engineering velocity. Key topics covered: Why compliance doesn't require manual GitOps workflows Creating meaningful abstractions that codify operations expertise The shift from reactive to proactive infrastructure governance How platform teams can become strategic enablers rather than bottlenecks Whether you're a platform engineer, engineering leader, or developer frustrated with current infrastructure processes, this episode offers practical insights for evolving your approach to cloud operations. Guest: Dave Williams, DevOps Propagandist Massdriver Links to interesting things from this episode: OpenTofu Heroku Terratest “Elephant in the cloud: Bridging the cloud infrastructure talent gap with software”
	Breaking Down Healthcare Delivery Barriers with Joel Vasallo	08 Jan 2025	00:50:49
Feeling overwhelmed by the number of apps you need to manage while building developer trust, managing costs, and trying to create an extensible platform that teams actually want to use? Joel Vasallo shares practical insights from scaling TAG's platform engineering initiatives across multiple healthcare organizations. Learn how his team transformed deployment times from weeks to minutes while maintaining security and compliance. Joel breaks down the journey from initial Kubernetes adoption to managing 70+ applications. Listeners will gain actionable strategies for: Starting small with platform initiatives and building organic buy-in Balancing standardization with team autonomy Managing cloud costs across multiple organizations Building trust through visibility and auditability Whether you're in healthcare or any regulated industry, this conversation provides a practical roadmap for evolving your platform engineering practice. Guest: Joel Vasallo, Senior Director of Platform Engineering at TAG Joel is the Senior Director of Platform Engineering at TAG - The Aspen Group where he leads teams focused on DevOps, SRE, and Delivery Engineering. Together, these teams aim to build and architect highly available cloud environments, develop infrastructure and development tools, and empower developers through fully automated deployment pipelines. In his spare time, he runs monthly meetups in Chicago through the Google Developers program. When he isn’t working, he loves exploring Sweet Home Chicago! Joel Vasallo - X Joel Vasallo - Medium The Aspen Group (TAG) Links to interesting things from this episode: Argo Istio Solo Prometheus Karpenter “From $erverless to Elixir” Duckbill Corey Quinn Malört
	Building Better Platforms with Dapr: Abstractions, Portability, and Durable Systems with Mark Fussell	16 Jul 2025	00:48:39
Cloud lock-in isn't just about where your data lives—it's about how deeply cloud-specific code permeates your applications. Mark Fussell, co-creator of Dapr and CEO of Diagrid, joins Cory O'Daniel to explore how Dapr provides clean abstractions for common distributed system patterns, enabling teams to build portable applications without sacrificing cloud-native capabilities. The conversation covers: How Dapr creates a clean separation between application code and underlying infrastructure services like messaging, state management, and secrets Why platform teams struggle with tight coupling between applications and infrastructure, and how Dapr solves this problem The benefits of Dapr's sidecar architecture for local development, testing, and production environments How Dapr automatically handles cross-cutting concerns like security, observability, and resiliency without boilerplate code Introduction to Dapr's workflow engine for durable execution and the emerging world of stateful AI agents Whether you're a platform engineer struggling with cloud lock-in or a developer tired of rewriting code for different infrastructures, this conversation demonstrates how Dapr can simplify your distributed systems while maintaining access to the unique capabilities of each cloud provider. Guest: Mark Fussell, Co-founder of Dapr and CEO of Diagrid Mark Fussell is the CEO of Diagrid, a cutting-edge company that simplifies building and scaling cloud-native applications. As the co-founder of Dapr (Distributed Application Runtime), Mark has played a pivotal role in shaping the future of modern application development by empowering developers to build resilient, distributed systems with ease. With decades of experience in the software industry, Mark has been a driving force behind innovative solutions that bridge the gap between developers and complex infrastructure. 
Diagrid Dapr Links to interesting things from this episode: "XML Bible" by Elliotte Rusty Harold OpenTelemetry SPIFFE DataGalaxy case study Cloud Native Computing Foundation
	From React to Dagster: Pete Hunt on Data, Infra, and AI-Ready Platforms	30 Jul 2025	00:49:32
Is Postgres actually a better message queue than Kafka? This provocative question is just one of many insights Pete Hunt shares in this conversation about data orchestration, platform engineering, and the evolution of infrastructure. Pete Hunt, CEO of Dagster Labs and former React co-founder at Facebook, brings his unique perspective from working at tech giants like Instagram and Twitter to discuss how different platform team approaches impact product development. Having witnessed both Facebook's clear delineation between product and infrastructure teams and Twitter's DevOps-style ownership model, Pete offers valuable comparisons of these contrasting philosophies. The conversation explores: How Dagster provides a higher-level abstraction for data teams, making it easier to track and debug data assets rather than just managing workflows The challenges of modern data platforms and why many organizations struggle with complex, distributed systems that could be simplified A practical approach to migrating from Airflow to Dagster with their "Airlift" toolkit that allows for incremental, low-risk transitions How AI development is fueling demand for better data orchestration as companies build applications that rely on properly managed data pipelines Pete also shares his thoughtful approach to balancing technical debt and product development with a "quarter on, quarter off" cadence that allows teams to both ship features and clean up the inevitable corners that get cut under deadline pressure. For platform engineers, data teams, and technical leaders navigating the intersection of infrastructure and AI, this episode provides practical insights on creating abstractions that deliver real operational value without unnecessary complexity. Guest: Pete Hunt, CEO of Dagster Pete is the CEO of Dagster Labs, where he first joined as Head of Engineering in early 2022 and transitioned into the CEO role later that same year. 
Before Dagster, Pete co-founded Smyte, an anti-abuse startup acquired by Twitter, where he continued as a senior staff engineer. Earlier in his career, Pete was one of the first engineers to work on Instagram after its acquisition by Facebook in 2012. There, he led development on Instagram’s web and analytics teams and became a co-founder of the React.js project, helping transform an internal experiment into one of the most widely used front-end frameworks in the world. He was also part of the early community around GraphQL and has remained deeply engaged in open source and developer tooling. Pete brings a pragmatic, hands-on perspective to modern data infrastructure. Having been both a founder and an engineer, he focuses on reducing complexity and fatigue in data teams by building tools that actually work together. At Dagster, he remains close to the code and actively involved in technical decisions, combining leadership with deep technical fluency. Pete Hunt, X Dagster Dagster Pipes Dagster Airlift Links to interesting things from this episode: React “Postgres: a Better Message Queue than Kafka?” Airflow Kubeflow CAPES Fargate
	Beyond Cracking the Coding Interview with Mike Mroczka	20 Aug 2025	01:08:35
Ever wondered how many “perfect” candidates simply learned the test—or how many great engineers get filtered out by bad interview design? Mike Mroczka, interview coach and ex-Googler, shares what really goes on behind technical hiring and how to navigate it to your advantage. What you’ll learn: How leaked question banks and standardized puzzles can distort hiring signals - and where they still help Practical ways companies can make interviews fairer and harder to game, both on-site and remote A balanced take on data structures and algorithms: when they’re useful and when they’re noise Tactics to spot and reduce cheating without turning interviews into surveillance How to structure interviews for different seniority levels so you measure the right skills Salary negotiation playbook: timing, leverage, and common pitfalls that cost candidates real money Getting past the application black hole: skipping recruiters, networking that works, and coordinating offers Who this helps: Engineers tired of grinding puzzles who want a smarter prep plan Hiring managers looking to improve signal and reduce false negatives Anyone preparing to negotiate an offer with confidence Guest: Mike Mroczka, Primary author of Beyond Cracking the Coding Interview, Ex-Google Mike Mroczka, a former senior SWE (Google, Salesforce, GE), is now a tech consultant with a decade of experience helping engineers land their dream jobs. He’s a top-rated mentor (interviewing.io, Karat, Pathrise, Skilledinc) and the author of viral technical content on system design and technical interview strategies featured on Hacker News, Business Insider, and Wired. Mike Mroczka, website Beyond Cracking the Coding Interview Links to interesting things from this episode: Cracking the Coding Interview by Gayle Laakmann McDowell HackerOne Interviewing.io Cluely Google Glass Ray-Ban HackerRank CodeSignal
	GraphQL, MCP, and the Future of APIs with Apollo CEO Matt DeBergalis	10 Sep 2025	00:43:07
UPDATE - Apollo GraphQL has kindly offered us a few free passes to join them at the GraphQL Summit in San Francisco, October 6-8, 2025. If you are interested in going, the code is: PodcastSummit25 What if your API layer could help you ship faster today and make tomorrow’s AI workflows safer and easier to build? Apollo CEO Matt DeBergalis explains how GraphQL became a practical standard for unifying messy backends, why declarative schemas and strong types are the “bedrock” for agentic systems, and where MCP fits when you want agents to call business data safely. You’ll hear real examples of speeding up frontends, tightening observability, and running focused personalization without “fat” APIs. What you’ll learn: A plain-language model for GraphQL and why it decouples frontend needs from backend services How typing, schema docs, and field-level telemetry reduce risk and enable LLM-driven tooling Practical ways to expose queries as MCP tools and start with internal “agentic DevOps” Tactics for experiments and personalization that stay fast and measurable at scale Why an end-to-end approach (client and server) matters for reliability and speed Guest: Matt DeBergalis, CEO and Co-Founder of Apollo GraphQL Matt DeBergalis is the Chief Executive Officer and Co-Founder of Apollo GraphQL, focused on bringing the popular GraphQL technology to the enterprise. He previously served as Apollo's CTO, leading product and engineering. Matt's longtime focus has been in open source and platforms: he co-founded Meteor.js, which grew to become one of the most popular open-source projects in the world for developing full-stack web apps with JavaScript, as well as ActBlue, the American political fundraising platform that revolutionized grassroots political giving. He attended the Massachusetts Institute of Technology and resides in the San Francisco Bay Area with his family. In his spare time, Matt enjoys taking to the air and flying his 1966 Beechcraft Baron. 
Apollo GraphQL, website Apollo GraphQL, GitHub Apollo GraphQL, LinkedIn Apollo GraphQL, X Apollo GraphQL, YouTube Links to interesting things from this episode: Free Software Foundation Cursor Motley Fool podcast GraphQL Summit
	Guest Host: Kelsey Hightower — Why IaC Alone Isn’t Enough	08 Oct 2025	00:39:40
Ever wonder why strong Terraform modules still lead to long review queues and fragile pipelines? From hand-built scripts and early data center migrations to cloud sprawl and Kubernetes, configuration management has changed a lot - but the core struggle remains: too many decisions, not enough guardrails. Guest host Kelsey Hightower sits down with Cory O’Daniel to unpack where Infrastructure as Code succeeds and where teams get stuck. What you’ll learn: How to avoid “choice overload” in cloud configs by moving decisions upstream Practical ways to pair IaC with UX, policies, and SLAs to reduce toil When click-ops is a symptom, not the problem - and how to replace it safely Patterns for scaling platform practices beyond a handful of experts A simple mental model for mapping workflows across serverless, containers, and VMs Guest Host: Kelsey Hightower Kelsey has worn every hat possible throughout his career in tech and enjoys leadership roles focused on making things happen and shipping software. Prior to his retirement, he was a Distinguished Engineer at Google, where he worked on Google Cloud Platform. He is a strong open source advocate with a focus on building great software as well as great communities around them. He is also an accomplished author and keynote speaker with a knack for demystifying complex topics, doing live demos and enabling others to succeed. When he is not writing code, you can catch him giving technical workshops covering everything from programming to system administration. Guest: Cory O'Daniel, CEO and Co-Founder of Massdriver and Co-Founder of OpenTofu Cory has been a software architect and engineer for 20 years, leading up to the founding of Massdriver. He's also a husband and the father of two kids. 
Cory O'Daniel, X Cory O'Daniel, Medium Massdriver, website Massdriver, GitHub Massdriver, YouTube OpenTofu Links to interesting things from this episode: "The Phoenix Project: A Novel about IT, DevOps, and Helping Your Business Win" by Gene Kim "15 Years of Duct Tape - Why IaC Adoption Stalled at 30"
	How to Ship Faster with Feature Flags: Insights from Unleash	24 Sep 2025	00:43:58
Still freezing code before Black Friday and hoping nothing breaks? Feature flags can help you ship smaller, safer changes continuously—without the “big bang” risk or painful rollbacks. Cory O’Daniel talks with Unleash VP of Marketing Michael Ferranti about how modern teams use flags as a core delivery primitive alongside CI/CD and trunk-based development. They dig into kill switches for instant mitigation, progressive rollouts tied to real metrics, and why homegrown “if-statement” systems turn into hidden platforms you didn’t mean to build. They also cover the rising volume of AI‑assisted code and how flags provide the control layer to move faster while protecting reliability. What you’ll learn: How feature flags reduce risk for high-stakes periods like Black Friday by avoiding code freezes When to replace staging queues with progressive delivery and experiment-driven rollouts Practical uses: kill switches, trunk-based development, targeting, and cleanup strategies to manage flag debt Build vs. buy: why DIY flag systems become costly and how Unleash’s open source and on-prem options fit regulated or air‑gapped needs Using business, engineering, and customer signals to automate safe ramp-ups and ramp-backs Why AI increases code throughput, how it affects reliability, and how flags create the safety rails for agentic workflows Guest: Michael Ferranti, VP of Marketing at Unleash Michael Ferranti has held leadership roles at Teleport, Portworx, ClusterHQ, and Rackspace Technology, with a focus on go-to-market strategy in open-source and enterprise software. At Teleport he focused on shifting from legacy security models to developer-first, identity-driven access. At Portworx, he was building new GTM strategies for Kubernetes-native storage when everyone was still figuring out containers, and he helped scale the company from under $500K in revenue to a $370M acquisition by Pure Storage. 
His work has centered on supporting engineering leaders in delivering features, scaling infrastructure, and improving security without adding unnecessary blockers. Michael has spoken at industry events like KubeCon and theCUBE, sharing insights on platform org design, category creation, and growing open-source adoption. Unleash, website Unleash, GitHub Unleash, LinkedIn Unleash, X Unleash, Slack Unleash, YouTube UnleashCon 2025 Links to interesting things from this episode: React Bitbucket LaunchDarkly ServiceNow CockroachDB Red Hat OpenShift State of DevOps Report (DORA) "How to Win Friends & Influence People" Grafana REMINDER - Apollo GraphQL has kindly offered us a few free passes to join them at the GraphQL Summit in San Francisco, October 6-8, 2025. If you are interested in going, the code is: PodcastSummit25
	Guest Host: Kelsey Hightower - Are CI/CD and GitOps Just Making Things Harder?	22 Oct 2025	00:30:18
What if your production environment had a live, trustworthy blueprint you could zoom in and out of on demand? Kelsey Hightower guest-hosts a candid conversation with Cory about why CI/CD pipelines and GitOps often break down for cloud infrastructure. They explore a simpler operational model: treat infrastructure as data, lean on clear checkpoints instead of rigid “golden paths,” and make production legible for both developers and ops. You’ll learn: Where CI/CD adds friction for infra and what to do instead Why GitOps works for apps but hits limits for databases, networks, and multi-region realities How “living diagrams” help new teammates understand prod on day one Practical guardrails that evolve with your org without locking teams in Ways to reduce drift, surprise cloud costs, and Day Two chaos A mindset shift: databases for ops data, not shell-script archaeology Walk away with concrete patterns to make production understandable, auditable, and easier to change—without more YAML or bigger pipelines. Guest Host: Kelsey Hightower Kelsey has worn every hat possible throughout his career in tech and enjoys leadership roles focused on making things happen and shipping software. Prior to his retirement, he was a Distinguished Engineer at Google, where he worked on Google Cloud Platform. He is a strong open source advocate with a focus on building great software as well as great communities around them. He is also an accomplished author and keynote speaker with a knack for demystifying complex topics, doing live demos and enabling others to succeed. When he is not writing code, you can catch him giving technical workshops covering everything from programming to system administration. Guest: Cory O'Daniel, CEO and Co-Founder of Massdriver and Co-Founder of OpenTofu Cory has been a software architect and engineer for 20 years, leading up to the founding of Massdriver. He's also a husband and the father of two kids. 
Cory O'Daniel, X Cory O'Daniel, Medium Massdriver, website Massdriver, GitHub Massdriver, YouTube OpenTofu Links to interesting things from this episode: SigNoz “The $6,459 Terraform Lesson: Why Infrastructure Lifecycle Monitoring Matters” by Liz Fong-Jones "Gitopscracy" video
	Policy as Code: Kyverno and Securing Kubernetes at Scale with Jim Bugwadia	19 Nov 2025	00:42:21
Most Kubernetes security breaches don't come from zero-day exploits - they come from misconfigurations. While your team runs scanners and reviews reports, containers are already running as root, network policies are missing, and compliance violations are piling up across dozens of repositories. Jim Bugwadia, co-founder and CEO of Nirmata and creator of Kyverno, joins Cory to talk about a different approach: policy as code. Instead of asking developers to remember security best practices across every repo, what if your cluster automatically enforced secure defaults and blocked non-compliant deployments before they ever reached production? You'll learn how to start using Kyverno today without breaking your production environment - from running your first audit scan (no installation required) to implementing enforcement mode with exceptions. Jim explains why micro-segmentation matters more than ever, how to automate network policies for every namespace, and why platform teams are using Kyverno for everything from security to cost optimization. Whether you're running one cluster or managing Kubernetes at scale, this conversation offers practical strategies for making security a byproduct of your platform - not an afterthought. Topics covered: why shift-left security fails and what "shift-down" means for platform teams; how to implement Kubernetes policy enforcement without grinding deployments to a halt; automating secure defaults: network policies, resource quotas, and role bindings; the crawl-walk-run approach to rolling out policies in existing clusters; and real-world use cases beyond security: cost optimization and resource management. Guest: Jim Bugwadia, Co-Founder & CEO of Nirmata and creator of Kyverno Jim Bugwadia is the Co-founder and CEO of Nirmata, a Kubernetes management platform built for enterprises to simplify and scale cloud-native operations across clouds, data centers, edge, and connected devices. 
With a mission to democratize cloud-native best practices, Jim brings deep expertise in building large-scale software products and leading high-performing teams. Before founding Nirmata, he led a global consulting team at Cisco, guiding enterprises and service providers on their cloud computing journeys. Earlier in his career, he contributed to innovative products at startups and major companies including Trapeze Networks, Pano Logic, Jetstream, Lucent, and Motorola. A hands-on technologist, Jim continues to code in Go, Java, and JavaScript, reflecting his passion for building in the rapidly evolving world of software. Jim Bugwadia, X Nirmata Kyverno Links to interesting things from this episode: Kyverno Community Repository “Shift-Down Security” Paper OpenReports Policy Reporter “The Shai-Hulud npm malware attack: A wake-up call for supply chain security” Kyverno Slack Channel
	Guest Host: Kelsey Hightower - Beyond Pipelines: Infrastructure As Data	05 Nov 2025	00:48:51
Is your Git repo really the source of truth for infrastructure - or just a suggestion? Guest host Kelsey Hightower sits down with Cory O’Daniel to unpack why many teams hit dead ends with CI/CD for provisioning, where GitOps struggles with drift, and when TicketOps helps or hurts. They explore a different model: infrastructure as data with typed contracts, shared artifacts, and workflows that embed policy, validation, and upgrades from the start. You’ll hear practical ways to reduce cognitive load for developers while giving operations reliable control and better day‑2 levers. You’ll learn: why pipelines are a poor fit for infra provisioning and what to do instead; how to reason about drift as a three‑way merge with reality; when reconciliation helps, and when it breaks production firefights; how typed contracts and artifacts connect modules and teams without glue scripts; ways to present safer self‑service without requiring everyone to learn Terraform; and a simple mental model for treating TicketOps as a surface, not the workflow. Guest Host: Kelsey Hightower Kelsey has worn every hat possible throughout his career in tech and enjoys leadership roles focused on making things happen and shipping software. Prior to his retirement, he was a Distinguished Engineer at Google, where he worked on Google Cloud Platform. He is a strong open source advocate with a focus on building great software as well as great communities around it. He is also an accomplished author and keynote speaker with a knack for demystifying complex topics, doing live demos and enabling others to succeed. When he is not writing code, you can catch him giving technical workshops covering everything from programming to system administration. Guest: Cory O'Daniel, CEO and Co-Founder of Massdriver and Co-Founder of OpenTofu Cory has been a software architect and engineer for 20 years, leading up to the founding of Massdriver. He's also a husband and the father of two kids. 
Cory O'Daniel, X Cory O'Daniel, Medium Massdriver, website Massdriver, GitHub Massdriver, YouTube OpenTofu Links to interesting things from this episode: "Gitopscracy" video
	Simplicity at Scale: Cleaning House for Platform Teams with Brian Childress	17 Dec 2025	00:40:46
Why do so many “modern” platforms feel slow, fragile, and painful to work on? Platform engineer and fractional CTO Brian Childress joins Cory to discuss how over-engineering, resume‑driven development, and scattered tooling quietly block teams from shipping value. They explore why simplicity is a competitive advantage for platform teams, especially as AI becomes part of everyday development. You’ll learn: how to design a simple platform MVP that developers actually like using; what a good local‑to‑prod story looks like (and why it’s the real scaling superpower); practical ways to onboard humans and AI tools so both can contribute faster; where teams introduce unnecessary complexity with Kubernetes, microservices, and NoSQL; how to think about scaling in three dimensions: users, developers, and features; why good architecture, docs, and decision records make AI more useful, not less; and how to spot and avoid resume‑driven development before it explodes your platform. Whether you’re cleaning up a messy stack or trying to keep a young platform from drifting into chaos, this conversation gives you concrete patterns for keeping things simple while still scaling teams, systems, and features. Guest: Brian Childress, Platform engineer and fractional CTO Brian Childress is an accomplished Software Engineer, Architect and Fractional CTO. For over a decade Brian has developed applications in healthcare, finance, and consumer products. Brian has spoken internationally on topics such as application security and developer tooling. Brian spends his free time researching and teaching the latest in application and API security design and best practices. Brian Childress, website Brian Childress, X Links to interesting things from this episode: Replit Lovable
	Using Feature Flags to Tame Complexity with Mike Zorn	03 Dec 2025	00:43:29
What if changing a single flag could save you from a failed migration, a broken API, or a late-night rollback? Join us as we dive into how feature flags become a practical tool for changing application behavior at runtime, not just toggling UI elements. Cory talks with Mike Zorn about real stories from LaunchDarkly and Rippling, covering how teams use flags to ship safely, debug faster, and simplify complex systems. You’ll hear about: using feature flags to avoid staging overload and ship directly to production; migrating critical systems and databases with minimal downtime and risk; controlling log levels and rate limits for specific customers on the fly; managing flag sprawl so teams do not drown in half-rolled-out features; and experimenting with AI features, prompts, and models without fully committing. If you’re working on a platform, running critical infrastructure, or just trying to ship faster without breaking everything, this conversation offers concrete patterns you can start using right away. Guest: Mike Zorn, Senior Software Engineer at Rippling Mike’s software engineering journey began with an early interest in problem-solving and programming, starting with creating programs on a TI-83 calculator in middle school. After studying mathematics in college, he transitioned into software through an applied math project that required coding, which sparked his interest in engineering as a career. Professionally, he has worked at several product and SaaS companies, including one that was an early LaunchDarkly customer, where they experienced firsthand the challenges of managing feature flags internally. That experience led him to appreciate the value of tools like LaunchDarkly, eventually joining the company himself. Since then, he has contributed across various areas, including focusing on how LaunchDarkly can best adopt its own platform internally to streamline releases and help engineers work more efficiently. 
His latest adventure has been joining Rippling as a Senior Staff Software Engineer. Mike Zorn, GitHub Mike Zorn, Email Rippling LaunchDarkly Links to interesting things from this episode: SigNoz Signadot Open Container Initiative “Using Feature Flags to Avoid Downtime During Migrations” Apache Iceberg
	Observability in the AI Era with New Relic's Nic Benders	18 Feb 2026	00:50:37
What happens when nobody wrote the code running in your production environment? As AI-generated software becomes standard practice, platform engineers face a new challenge: operating systems without experts to consult. Nic Benders, Chief Technical Strategist at New Relic, has spent 15 years watching observability evolve from basic server monitoring to understanding complex distributed systems. Now he's tackling the next frontier: how to maintain and operate software when there's no human author to ask why something was built a certain way. The conversation covers the shift from instrumentation being the hard problem to understanding being the bottleneck. Nic explains why inventory matters more than you think, how to approach AI-generated code as a black box that needs testing and telemetry, and why "garbage in, safety out" should be your new mantra. You'll learn practical strategies for instrumenting modern systems with OpenTelemetry, why your observability hierarchy needs to start with knowing what's actually running, and how to build platforms that make safe deployment easier than risky shortcuts. Nic also shares his perspective on technical drift versus technical debt and what changes when your best troubleshooting tool - institutional knowledge - no longer exists. Whether you're drowning in observability data or just starting to instrument your systems, this conversation offers concrete approaches for building understanding into your platform engineering practice. Guest: Nic Benders, Chief Technical Strategist at New Relic Nic Benders is New Relic's Chief Technical Strategist. Part of the Engineering team since the early days of the company, Nic has been involved with everything from Agents to ZooKeeper and all the pieces and products in between. As New Relic's Chief Technical Strategist, he now looks after the long-term technical strategy behind the product and the experience of all the engineering teams who build it. 
Before New Relic, he worked in the mobile space, managing back-end messaging and commerce systems powering some of the largest carriers in the world. New Relic, website New Relic, Blog Links to interesting things from this episode: OpenClaw (aka Moltbot, aka Clawdbot) Moltbook
	Why Extend Went All-In on Serverless Platform Engineering	04 Mar 2026	01:02:28
Billions of requests a month on AWS Lambda can cost less than a single engineer’s laptop budget, but only if the architecture and developer workflow are designed for it. Justin Masse, Senior Platform DevOps Engineer at Extend, shares how Extend committed early to a serverless-first approach and built a platform that prioritizes developer speed and low operational toil. The conversation breaks down what it takes to run active-active, multi-region systems in a serverless world, how the team keeps services small and fast, and why asynchronous, event-driven design changes both reliability and cost. You’ll also hear how Extend treats developer experience as a core platform responsibility: templated microservices, fast deployment pipelines, ephemeral environments for pull requests, and infrastructure that developers can own without becoming cloud specialists. A big theme is using AWS CDK and internal abstractions to keep infrastructure close to the application code, so teams can move quickly while keeping platform standards consistent. Finally, the discussion gets practical about tradeoffs that show up after the “serverless is easy” pitch: local development challenges, the real cost center (observability), and where AI is helping today, including an internal agent that diagnoses failed deployments and suggests fixes. 
What you’ll learn: why Extend avoids servers and VPC complexity, and what they use instead; patterns for active-active, multi-region thinking in a serverless architecture; how DevEx practices like templates and ephemeral environments reduce friction; a pragmatic approach to IaC with CDK and reusable internal constructs; where serverless costs stay low, and why observability often dominates the bill; and how AI is being applied to platform workflows without skipping engineering judgment. Guest: Justin Masse, Senior Platform DevOps Engineer at Extend Justin Masse is a self-proclaimed lead chaos engineer, recognized within niche engineering communities for his expertise in Chaos Engineering and Infrastructure & DevOps. The father of three young kids, a husband, a recent MBA graduate, recent cancer survivor, and competitive powerlifter, he still finds time to actively contribute to the platform engineering community. Justin Masse, website Justin Masse, GitHub Extend, website Links to interesting things from this episode: Episode with Adrian Cockcroft “From $erverless to Elixir” by Cory O’Daniel
	Infrastructure as Code's Hidden Problem with Pavlo Baron	18 Mar 2026	00:57:35
Terraform drift, state wrangling, and a growing “tools for tools” stack are still daily work for many platform teams - despite a decade of DevOps talk and cloud maturity. Why does ops automation so often feel like it needs babysitting? Pavlo Baron explains where Infrastructure as Code tends to break down in real organizations: manual drift management, low-level state complexity, and a lack of practical abstractions that let developers self-serve without inheriting the entire ops burden. The conversation digs into what a more use-case-driven approach could look like - where teams can choose when to enforce desired state, when to accept emergency changes, and how to build “guardrails” that reduce mistakes without slowing delivery. Pavlo also explains why type safety and constrained interfaces matter (especially as AI starts generating more code and infrastructure changes), and why the future of platform engineering depends less on slogans and more on systems that reduce toil. Guest: Pavlo Baron, Co-Founder and CEO of Platform Engineering Labs Pavlo Baron is Co-Founder and CEO of Platform Engineering Labs, which is crafting tools to remove the toil from operations work, with a current focus on infrastructure. He is a veteran in the space, having served in all kinds of roles throughout a career that spans more than 35 years. Previously, he was co-founder, CTO, and major inventor at an observability startup, Instana, which was acquired by IBM in 2020. Pavlo is a frequent conference speaker and author of several books. 
Pavlo Baron, X https://pavlobaron.medium.com/ https://github.com/platform-engineering-labs https://www.linkedin.com/company/platform-engineering-labs https://x.com/plateng_labs https://bsky.app/profile/platform.engineering https://mastodon.social/@plateng_labs https://www.youtube.com/@plateng-labs Links to interesting things from this episode: The Pkl Primer formae formae quick start "10+ Deploys Per Day: Dev and Ops Cooperation at Flickr" “Where everyone is responsible, no one is really responsible.” Albert Bandura JPL “Visions of the Future” “Fallout: New Vegas”
	AI-Native Ops: Making AI Safe for Production with William Collins	01 Apr 2026	01:03:00
What happens when your “coworker” can generate code and changes faster than your team can review them, and production still has to stay up? William Collins breaks down what AI-Native Ops looks like when you take reliability seriously: where reasoning should stop, where deterministic automation should begin, and how guardrails like compliance checks, version pinning, and controlled workflows keep AI from turning into outage fuel. Cory and William also dig into why context windows and tool sprawl matter in real systems, how protocols like MCP and agent-to-agent communication are shaping day-to-day automation, and why regulated environments can’t adopt new tech with hype-driven shortcuts. If you’re a platform engineer trying to balance speed with safety, this conversation offers a practical way to use AI for the work that drags teams down, without giving up operational discipline. Guest: William Collins, Director of Technical Evangelism at Itential, AWS community builder, and the co-host of the Cloud Gambit podcast William Collins is a strategic thinker and catalyst for innovation. Over his career, he has helped enterprises build large-scale networks, driven modernization through cloud adoption, and excels at optimizing complex environments through good design practices and automation. Today, William works as Director of Technical Evangelism for Itential, where he focuses on evangelizing the Itential Platform, fostering strong relationships with customers to fully realize their goals, engaging with community, and advocating for the successful future of network, security, and automation infrastructure. As a content creator, William hosts The Cloud Gambit Podcast with Eyvonne Sharp, a show that unravels the state of cloud computing, markets, strategy, and emerging trends with industry experts. 
He is also a LinkedIn Learning Instructor (Automation, Cloud, and Network Engineering Content), AWS Community Builder (Network & Content Delivery), and is a group organizer for the USNUA - Kentucky User Group (KYNUG). Prior to Itential, William worked as a Principal Cloud Architect and Director of Technical Evangelism for Alkira where he helped grow the company from lean beginnings to being ranked 25th Fastest-Growing Company in North America and 6th in the Bay Area on the 2024 Deloitte Technology Fast 500. He also held various senior technical roles across the enterprise space in Financial Services and Healthcare, most recently at Humana as Director of Cloud Architecture. Outside of tech, his time is spent with family, woodworking, ice hockey, and guitar. Opinions expressed are solely his own and do not express the views or opinions of his employer. William Collins, Blog William Collins, YouTube William Collins, X William Collins, Instagram William Collins, TikTok William Collins, GitHub Itential “The Cloud Gambit” podcast Links to interesting things from this episode: Ghostty “Harness design for long-running application development” by Anthropic
	Green CI and Merge Queue Mastery with Trunk’s Eli Schleifer	15 Apr 2026	00:49:35
When a flaky test can stall a merge queue, “just rerun CI” stops scaling fast. Cory talks with Trunk co-founder and CEO Eli Schleifer about the outer loop problems that show up as teams ship more code - especially with AI-assisted development increasing PR volume. They break down what a merge queue is, why logical merge conflicts happen even when individual PRs are green, and how predictive testing helps protect main without forcing constant retesting. Eli also explains how Trunk approaches flaky tests: collecting JUnit results, using quarantines so known flakes don’t block delivery, and fingerprinting failures to tell the difference between “this always times out” and “this was just broken by a recent change.” The conversation closes on how review and quality practices may shift as code generation accelerates - and what still needs strong guardrails like tests, security checks, and reliable CI signals. Guest: Eli Schleifer, co-founder and CEO of Trunk Eli Schleifer leads Trunk’s technical vision and product strategy, focused on closing the gap between AI-speed code generation and human-speed delivery by removing the bottlenecks that slow modern engineering teams. Trunk’s platform eliminates flaky tests, resolves merge queue constraints, and redesigns CI systems to enable high-throughput, continuous delivery. Prior to founding Trunk, Eli was CTO at Directr, which was acquired by Google, and has served in engineering leadership roles at YouTube, Uber, and Microsoft. Eli Schleifer, X Trunk, website Trunk, Slack Trunk, GitHub Trunk, X Links to interesting things from this episode: Balsamiq “Code First Engineering” by Eli Schleifer
	You Need AI Sysadmins Can Trust, With Cribl's Nikhil Mungel	13 May 2026	00:55:16
What happens when a non-deterministic AI system is asked to touch production telemetry or generate changes for an SRE pipeline? The cost of being “close enough” can be lost data, downtime, or a security incident. Cribl’s Nikhil Mungel joins Cory to break down what it takes to build AI that sysadmins can actually trust. The conversation digs into harness engineering and the practical guardrails that turn probabilistic models into repeatable, verifiable outcomes. They cover why breaking work into small chunks matters, how validation and testing become the real leverage point for AI-native development, and what “code factories” mean for review, CI, and platform reliability when teams can generate a thousand PRs an hour. Platform engineers will also hear a pragmatic take on the future of the job. The focus shifts away from typing code and toward building systems for verification, simulation, and safe deployment at scale, plus clearer ways to decide what needs human scrutiny and what can ship automatically. Guest: Nikhil Mungel, Head of AI R&D at Cribl Nikhil Mungel is the Head of AI R&D at Cribl, where he's building LLM-powered systems for IT and Security data transformation and analysis. Before Cribl, he spent over a decade developing distributed systems across the observability and consumer social tech landscape. He lives in San Francisco with his wife and two kids. His current focus is applying AI to make complex infrastructure more intuitive and explainable. Nikhil Mungel, Website Nikhil Mungel, X Cribl, Website Cribl, LinkedIn Links to interesting things from this episode: Cribl Guard “Open source died in March. It just doesn't know it yet.” by Dan Lorenc, CEO of Chainguard
	Durable Execution for Real‑World Failures with Temporal’s Cornelia Davis	27 May 2026	00:46:27
A lot of infrastructure and automation fails for ordinary reasons: rate limits, flaky networks, partial permissions, long-running jobs, and retries that vanish when the process restarts. Durable execution is a way to design systems that keep going anyway - without rebuilding a maze of queues, cron jobs, and manual cleanup. Cornelia Davis breaks down how durable execution works in practice: writing “normal” code while the runtime provides durable retries, state management, and the ability to pause work, wait for a human or external change (like a quota increase), and resume right where things left off. The conversation connects these ideas to platform engineering realities - Terraform workflows, long provisioning times, and “orphan” resources - and explains how Temporal workflows and activities help teams model failure handling as a first-class part of the system. You’ll also hear why this approach is showing up in AI engineering: long-running agent workflows, frequent rate limiting, and the need to avoid re-running expensive LLM calls when something breaks near the end. Guest: Cornelia Davis, Developer Advocate at Temporal Technologies and author of “Cloud Native Patterns” Cornelia Davis is a Developer Advocate at Temporal, where she brings more than three decades of experience as a software technologist to help engineers build resilient, scalable systems. Known for her pragmatic blend of hands-on coding, technical strategy, and customer collaboration, Cornelia is passionate about helping developers unlock the full potential of modern cloud-native architectures. Previously, she served as VP of Technology at Pivotal, where she played a key role in shaping Cloud Foundry and enabling enterprise cloud transformations. Whether she’s writing code, presenting at conferences, or whiteboarding with teams, Cornelia is driven by a singular goal: empowering developers to build better software. 
Outside of tech, she recharges on the yoga mat or in the kitchen, where she brings the same creativity and focus to her practice. Temporal, Website Temporal, GitHub Temporal Community, GitHub Temporal’s AI-assisted development tools Links to interesting things from this episode: Temporal Developer Skill “Cloud Native Patterns” by Cornelia Davis
