Explore every episode of the podcast The Python Podcast.__init__

Dive into the complete episode list for The Python Podcast.__init__. Each episode is cataloged with detailed descriptions, making it easy to find and explore specific topics. Keep track of all episodes from your favorite podcast and never miss a moment of insightful content.

Title | Pub. Date | Duration
Update Your Model's View Of The World In Real Time With Streaming Machine Learning Using River | 12 Dec 2022 | 01:16:23
Preamble

This is a cross-over episode from our new show The Machine Learning Podcast, the show about going from idea to production with machine learning.

Summary

The majority of machine learning projects that you read about or work on are built around batch processes. The model is trained, and then validated, and then deployed, with each step being a discrete and isolated task. Unfortunately, the real world is rarely static, leading to concept drift and model failures. River is a framework for building streaming machine learning projects that can constantly adapt to new information. In this episode Max Halford explains how the project works, why you might (or might not) want to consider streaming ML, and how to get started building with River.
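
To make the contrast with batch training concrete, here is a minimal sketch of online learning in plain Python. It deliberately does not use River itself: the `OnlineLinearRegression` class, its method names, and the synthetic `stream` generator are illustrative assumptions, not code from the episode. Features arrive as a dictionary, and the model state is updated after every single observation instead of after a full pass over a dataset.

```python
from collections import defaultdict


class OnlineLinearRegression:
    """Linear regression trained one observation at a time with SGD."""

    def __init__(self, learning_rate=0.01):
        self.learning_rate = learning_rate
        self.weights = defaultdict(float)  # one weight per feature name
        self.bias = 0.0

    def predict_one(self, x):
        # x is a dict mapping feature names to numeric values.
        return self.bias + sum(self.weights[name] * value for name, value in x.items())

    def learn_one(self, x, y):
        # Take a single gradient step on the squared error for this observation.
        error = self.predict_one(x) - y
        for name, value in x.items():
            self.weights[name] -= self.learning_rate * error * value
        self.bias -= self.learning_rate * error


def stream():
    """Hypothetical data stream that follows y = 2 * a + 1."""
    for i in range(2000):
        a = (i % 10) / 10
        yield {"a": a}, 2 * a + 1


model = OnlineLinearRegression(learning_rate=0.1)
for x, y in stream():
    prediction = model.predict_one(x)  # predict first, then learn from the true value
    model.learn_one(x, y)

print(round(model.weights["a"], 2), round(model.bias, 2))
```

Because each update touches only the features present in the current observation, the same loop keeps working if the underlying relationship shifts over time; the weights simply continue to move toward the new pattern.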

Announcements
  • Hello and welcome to the Machine Learning Podcast, the podcast about machine learning and how to bring it from idea to delivery.
  • Building good ML models is hard, but testing them properly is even harder. At Deepchecks, they built an open-source testing framework that follows best practices, ensuring that your models behave as expected. Get started quickly using their built-in library of checks for testing and validating your model’s behavior and performance, and extend it to meet your specific needs as your model evolves. Accelerate your machine learning projects by building trust in your models and automating the testing that you used to do manually. Go to themachinelearningpodcast.com/deepchecks today to get started!
  • Your host is Tobias Macey and today I’m interviewing Max Halford about River, a Python toolkit for streaming and online machine learning
Interview
  • Introduction
  • How did you get involved in machine learning?
  • Can you describe what River is and the story behind it?
  • What is "online" machine learning?
    • What are the practical differences with batch ML?
    • Why is batch learning so predominant?
    • What are the cases where someone would want/need to use online or streaming ML?
  • The prevailing pattern for batch ML model lifecycles is to train, deploy, monitor, repeat. What does the ongoing maintenance for a streaming ML model look like?
    • Concept drift is typically due to a discrepancy between the data used to train a model and the actual data being observed. How does the use of online learning affect the incidence of drift?
  • Can you describe how the River framework is implemented?
    • How have the design and goals of the project changed since you started working on it?
  • How do the internal representations of the model differ from batch learning to allow for incremental updates to the model state?
  • In the documentation you note the use of Python dictionaries for state management and the flexibility offered by that choice. What are the benefits and potential pitfalls of that decision?
  • Can you describe the process of using River to design, implement, and validate a streaming ML model?
    • What are the operational requirements for deploying and serving the model once it has been developed?
  • What are some of the challenges that users of River might run into if they are coming from a batch learning background?
  • What are the most interesting, innovative, or unexpected ways that you have seen River used?
  • What are the most interesting, unexpected, or challenging lessons that you have learned while working on River?
  • When is River the wrong choice?
  • What do you have planned for the future of River?
Contact Info
Parting Question
  • From your perspective, what is the biggest barrier to adoption of machine learning today?
Closing Announcements
  • Thank you for listening! Don’t forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@themachinelearningpodcast.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
Links

The intro and outro music is from Hitman’s Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0

Declarative Machine Learning For High Performance Deep Learning Models With Predibase | 05 Dec 2022 | 00:59:22
Preamble

This is a cross-over episode from our new show The Machine Learning Podcast, the show about going from idea to production with machine learning.

Summary

Deep learning is a revolutionary category of machine learning that accelerates our ability to build powerful inference models. Along with that power comes a great deal of complexity in determining what neural architectures are best suited to a given task, engineering features, scaling computation, etc. Predibase is building on the successes of the Ludwig framework for declarative deep learning and Horovod for horizontally distributing model training. In this episode CTO and co-founder of Predibase, Travis Addair, explains how they are reducing the burden of model development even further with their managed service for declarative and low-code ML and how they are integrating with the growing ecosystem of solutions for the full ML lifecycle.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great!
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. And now you can launch a managed MySQL, Postgres, or Mongo database cluster in minutes to keep your critical data safe with automated backups and failover. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • Your host is Tobias Macey and today I’m interviewing Travis Addair about Predibase, a low-code platform for building ML models in a declarative format
Interview
  • Introduction
  • How did you get involved in machine learning?
  • Can you describe what Predibase is and the story behind it?
  • Who is your target audience and how does that focus influence your user experience and feature development priorities?
  • How would you describe the semantic differences between your chosen terminology of "declarative ML" and the "autoML" nomenclature that many projects and products have adopted?
    • Another platform that launched recently with a promise of "declarative ML" is Continual. How would you characterize your relative strengths?
  • Can you describe how the Predibase platform is implemented?
    • How have the design and goals of the product changed as you worked through the initial implementation and started working with early customers?
    • The operational aspects of the ML lifecycle are still fairly nascent. How have you thought about the boundaries for your product to avoid getting drawn into scope creep while providing a happy path to delivery?
  • Ludwig is a core element of your platform. What are the other capabilities that you are layering around and on top of it to build a differentiated product?
  • In addition to the existing interfaces for Ludwig you created a new language in the form of PQL. What was the motivation for that decision?
    • How did you approach the semantic and syntactic design of the dialect?
    • What is your vision for PQL in the space of "declarative ML" that you are working to define?
  • Can you describe the available workflows for an individual or team that is using Predibase for prototyping and validating an ML model?
    • Once a model has been deemed satisfactory, what is the path to production?
  • How are you approaching governance and sustainability of Ludwig and Horovod while balancing your reliance on them in Predibase?
  • What are some of the notable investments/improvements that you have made in Ludwig during your work of building Predibase?
  • What are the most interesting, innovative, or unexpected ways that you have seen Predibase used?
  • What are the most interesting, unexpected, or challenging lessons that you have learned while working on Predibase?
  • When is Predibase the wrong choice?
  • What do you have planned for the future of Predibase?
Contact Info
Parting Question
  • From your perspective, what is the biggest barrier to adoption of machine learning today?
Closing Announcements
  • Thank you for listening! Don’t forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. The Machine Learning Podcast helps you go from idea to production with machine learning.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
Links

The intro and outro music is from Hitman’s Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0

Catching Up With Pyre, A Fast Type Checker For Python | 19 Sep 2022 | 00:51:45
Summary

Static typing versus dynamic typing is one of the oldest debates in software development. In recent years a number of dynamic languages have worked toward a middle ground by adding support for type hints. Python’s type annotations have given rise to an ecosystem of tools that use that type information to validate the correctness of programs and help identify potential bugs. At Instagram they created the Pyre project with a focus on speed to allow for scaling to huge Python projects. In this episode Shannon Zhu discusses how it is implemented, how to use it in your development process, and how it compares to other type checkers in the Python ecosystem.
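
As a small illustration of what type annotations give a checker to work with, consider the sketch below. The function and variable names are hypothetical, and the comment about the reported problem paraphrases what a static type checker would be expected to say; it is not actual Pyre output.

```python
from typing import Dict, Optional


def find_user_id(usernames: Dict[str, int], name: str) -> Optional[int]:
    """Return the id for `name`, or None when the user is unknown."""
    return usernames.get(name)


def next_user_id(usernames: Dict[str, int], name: str) -> int:
    user_id = find_user_id(usernames, name)
    # Without this check, a static type checker would be expected to flag
    # `user_id + 1`, because `user_id` may be None at this point.
    if user_id is None:
        raise KeyError(name)
    return user_id + 1


users = {"ada": 1, "grace": 2}
print(next_user_id(users, "ada"))  # prints 2
```

The annotations do not change how the program runs; they record intent that a tool can verify without executing the code, which is what makes checking very large codebases feasible.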

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. And now you can launch a managed MySQL, Postgres, or Mongo database cluster in minutes to keep your critical data safe with automated backups and failover. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • Your host as usual is Tobias Macey and today I’m interviewing Shannon Zhu about Pyre, a type checker for Python 3 built from the ground up to support gradual typing and deliver responsive incremental checks
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you describe what Pyre is and the story behind it?
  • There have been a number of tools created to support various aspects of typing for Python. How would you describe the various goals that they support and how Pyre fits in that ecosystem?
  • What are the core goals and notable features of Pyre?
  • Can you describe how Pyre is implemented?
    • How have the design and goals of the project changed/evolved since you started working on it?
  • What are the different ways that Pyre is used in the development workflow for a team or individual?
  • What are some of the challenges/roadblocks that people run into when adopting type definitions in their Python projects?
  • How has the evolution of type annotations and overall support for them affected your work on Pyre?
  • As someone who is working closely with type systems, what are the strongest aspects of Python’s implementation and opportunities for improvement?
  • What are the most interesting, innovative, or unexpected ways that you have seen Pyre used?
  • What are the most interesting, unexpected, or challenging lessons that you have learned while working on Pyre?
  • When is Pyre the wrong choice?
  • What do you have planned for the future of Pyre?
Keep In Touch
Picks
Closing Announcements
  • Thank you for listening! Don’t forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. The Machine Learning Podcast helps you go from idea to production with machine learning.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
Links

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA

Making The Case For A (Semi) Formal Specification Of CPython | 10 Nov 2020 | 00:36:41
Summary

The CPython implementation has grown and evolved significantly over the past ~25 years. In that time there have been many other projects to create compatible runtimes for your Python code. One of the challenges for these other projects is the lack of a fully documented specification of how and why everything works the way that it does. In the most recent Python language summit Mark Shannon proposed implementing a formal specification for CPython, and in this episode he shares his reasoning for why that would be helpful and what is involved in making it a reality.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • Do you want to get better at Python? Now is an excellent time to take an online course. Whether you’re just learning Python or you’re looking for deep dives on topics like APIs, memory management, async and await, and more, our friends at Talk Python Training have a top-notch course for you. If you’re just getting started, be sure to check out the Python for Absolute Beginners course. It’s like the first year of computer science that you never took compressed into 10 fun hours of Python coding and problem solving. Go to pythonpodcast.com/talkpython today and get 10% off the course that will help you find your next level. That’s pythonpodcast.com/talkpython, and don’t forget to thank them for supporting the show.
  • Your host as usual is Tobias Macey and today I’m interviewing Mark Shannon about his efforts to create a formal specification for the CPython interpreter
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing the current state of how the Python language and the CPython runtime are defined?
  • What is your motivation in advocating for a specification?
    • After ~25 years of the language, why is now the time to pursue this effort?
    • How does the history of the language and the scope of the ecosystem and community impact the effort required to make this a reality?
  • What is involved in creating the specification and where would it be located once complete?
    • What are some examples of languages that are formally specified?
  • What are the possible benefits of creating a specification for the CPython virtual machine?
    • What is the distinction between a specification for the VM as opposed to a specification for the language?
  • What are some potential downsides to having a (semi-)formal specification become part of the definition of the interpreter?
  • Can you describe the process of doing the work to create the specification?
  • How are you approaching the actual definition of the specification (e.g. prose vs programmatic)?
    • What are the tradeoffs of prose vs. an executable specification (e.g. TLA+, Alloy)?
  • How does this work tie into your goals of improving the speed of the CPython interpreter?
  • What are some of the most interesting, unexpected, or challenging aspects of your efforts to bring this specification to CPython?
  • How can the community contribute to this effort?
Keep In Touch
Picks
Closing Announcements
  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
Links

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA

Bringing Artificial Intelligence Projects From Idea To Production | 03 Nov 2020 | 00:47:49
Summary

Artificial intelligence applications can provide dramatic benefits to a business, but only if you can bring them from idea to production. Henrik Landgren was behind the original efforts at Spotify to leverage data for new product features, and in his current role he works on an AI system to evaluate new businesses to invest in. In this episode he shares advice on how to identify opportunities for leveraging AI to improve your business, the capabilities necessary to enable a successful project, and some of the pitfalls to watch out for. If you are curious about how to get started with AI, or what to consider as you build a project, then this is definitely worth a listen.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • Do you want to get better at Python? Now is an excellent time to take an online course. Whether you’re just learning Python or you’re looking for deep dives on topics like APIs, memory management, async and await, and more, our friends at Talk Python Training have a top-notch course for you. If you’re just getting started, be sure to check out the Python for Absolute Beginners course. It’s like the first year of computer science that you never took compressed into 10 fun hours of Python coding and problem solving. Go to pythonpodcast.com/talkpython today and get 10% off the course that will help you find your next level. That’s pythonpodcast.com/talkpython, and don’t forget to thank them for supporting the show.
  • Equalum’s end to end data ingestion platform is relied upon by enterprises across industries to seamlessly stream data to operational, real-time analytics and machine learning environments. Equalum combines streaming Change Data Capture, replication, complex transformations, batch processing and full data management using a no-code UI. Equalum also leverages open source data frameworks by orchestrating Apache Spark, Kafka and others under the hood. Tool consolidation and linear scalability without the legacy platform price tag. Go to pythonpodcast.com/equalum today to start a free 2 week test run of their platform, and don’t forget to tell them that we sent you.
  • Your host as usual is Tobias Macey and today I’m interviewing Henrik Landgren about his experiences building AI platforms to transform business capabilities.
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you start by sharing your thoughts on when, where, and how AI/ML are useful tools for a business?
  • What has been your experience in building AI platforms?
  • For organizations who are considering investing in AI capabilities, what are some alternative strategies that they might consider first?
  • What are the cases where AI is likely to be a wasted effort, or will fail to create a return on investment?
  • In order to be successful in bringing AI products to production, what are the foundational capabilities that are necessary?
    • What have you found to be a useful composition of roles and skills for building AI products?
  • There are various statistics that all point to a remarkably low success rate for bringing AI into production. What are some of the pitfalls that organizations and engineers should be aware of when undertaking such a project?
  • What is your strategy for identifying opportunities for a successful AI product?
    • Once you have determined the possible utility for such a project, how do you approach the work of making it a reality?
  • What are the common factors in what you built at Spotify and EQT Ventures?
    • Where do the two efforts diverge?
  • Your work on Motherbrain is interesting because of the fact that it is dealing in what seems to be intangible or unpredictable forces. What kinds of input are you relying on to generate useful predictions?
  • What are some of the most interesting, innovative, or unexpected uses of AI that you have seen?
  • What are some of the biggest failures of AI that you are aware of?
  • In your work at Spotify and EQT Ventures, what are the most interesting, unexpected, or challenging lessons that you have learned?
  • What advice or recommendations do you have for anyone who wants to learn more about the potential for AI and the work involved in bringing it to production?
Keep In Touch
Picks
Closing Announcements
  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
Links

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA

Power Up Your Java Using Python With JPype | 26 Oct 2020 | 00:48:40
Summary

Python and Java are two of the most popular programming languages in the world, and have both been around for over 20 years. In that time there have been numerous attempts to provide interoperability between them, with varying methods and levels of success. One such project is JPype, which allows you to use Java classes in your Python code. In this episode the current lead developer, Karl Nelson, explains why he chose it as his preferred tool for combining these ecosystems, how he and his team are using it, and when and how you might want to use it for your own projects. He also discusses the work he has done to enable use of JPype on Android, and what is in store for the future of the project. If you have ever wanted to use a library or module from Java, but the rest of your project is already in Python, then this episode is definitely worth a listen.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today!
  • Your host as usual is Tobias Macey and today I’m interviewing Karl Nelson about JPype, a language bridge that lets you use Java classes in your Python programs
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you start by giving an overview of what JPype is?
    • What was your motivation for becoming such a regular contributor to the project?
  • Why might someone want to be able to call into the Java ecosystem from a Python program?
  • There have been a number of other projects aiming to combine the capabilities of Java and Python, such as Jython and PyJNIus. What are the relative tradeoffs between the different options?
    • Many of those other projects have stalled or stopped altogether. What about JPype has allowed it to survive for so long?
  • Can you explain how JPype is implemented?
    • How has the design and implementation of the project evolved since it was first implemented?
    • How do the relative language versions influence the compatibility of programs on either side of the bridge?
  • What is involved in creating a project that uses JPype?
    • How are dependencies, packaging, distribution, etc. handled across the Java and Python portions of the code?
  • What are some of the ways that JPype can be used for Android applications?
  • What are some of the sharp edges or pitfalls that users of JPype should be aware of?
  • What are some of the most interesting, innovative, or unexpected ways that you have seen JPype used?
  • What have you found to be the most interesting or challenging aspects of building JPype?
  • When is JPype the wrong choice?
  • What is in store for the future of the project?
Keep In Touch
Picks
Links

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA

The Journey To Replace Python's Parser And What It Means For The Future | 19 Oct 2020 | 01:05:49
Summary

The release of Python 3.9 introduced a new parser that paves the way for brand new features. Every programming language has its own specific syntax for representing the logic that you are trying to express. The way that the rules of the language are defined and validated is with a grammar definition, which in turn is processed by a parser. The parser that the Python language has relied on for the past 25 years has begun to show its age through mounting technical debt and a lack of flexibility in defining new syntax. In this episode Pablo Galindo and Lysandros Nikolaou explain how, together with Python’s creator Guido van Rossum, they replaced the original parser implementation with one that is more flexible and maintainable, why now was the time to make the change, and how it will influence the future evolution of the language.
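
To see the parser's role directly, the standard library's `ast` module exposes the tree that CPython's parser builds from source text. The sketch below uses only `ast.parse` and `ast.dump`; the sample source strings are made up for illustration, and the full dump is printed without an expected result because its exact format differs between Python versions.

```python
import ast

source = "total = price * quantity + tax"

# Parse the source text into an abstract syntax tree.
tree = ast.parse(source)
statement = tree.body[0]

print(type(statement).__name__)        # Assign
print(type(statement.value).__name__)  # BinOp
print(ast.dump(tree))

# Text that violates the grammar is rejected before any code runs.
try:
    ast.parse("total = = 1")
except SyntaxError as exc:
    print("rejected:", exc.msg)
```

The nesting of the tree reflects the grammar's precedence rules: the multiplication sits inside the addition, so it is evaluated first.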

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today!
  • Your host as usual is Tobias Macey and today I’m interviewing Pablo Galindo and Lysandros Nikolaou about their work on replacing the parser in CPython and what that means for the language
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you start by discussing the role of the parser in the lifecycle of a Python program?
  • What were the limitations of the previous parser, and how did that contribute to complexity and technical debt in the CPython runtime?
  • What are the options for styles of parsers, and what are the benefits of using a PEG style grammar?
  • How does the new parser impact the approachability of the CPython code for new contributors?
  • What was the process for reimplementing the parser and guarding against regressions in the syntax?
  • As developers switch to the 3.9 release, what potential edge cases/bugs might they see from introducing the new parser?
  • What new syntax options does this parser provide for the Python language?
    • Are there any specific features that are planned for implementation in the 3.10 release that are enabled by the new parser grammar?
  • As the language evolves due to new capabilities offered by the updated parser, how will that impact other implementations such as PyPy?
  • What were the most interesting, unexpected, or challenging aspects of this project?
  • What other aspects of the CPython code do you think should be reconsidered or reimplemented in light of the changes in computing and the usage of the language?
Keep In Touch
Picks
Links

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA

Cloud Native Application Delivery Using GitOps | 12 Oct 2020 | 00:53:44
Summary

The way that applications are being built and delivered has changed dramatically in recent years with the growing trend toward cloud native software. As part of this movement toward the infrastructure and orchestration that powers your project being defined in software, a new approach to operations is gaining prominence. Commonly called GitOps, the main principle is that all of your automation code lives in version control and is executed automatically as changes are merged. In this episode Victor Farcic shares details on how that workflow brings together developers and operations engineers, the challenges that it poses, and how it influences the architecture of your software. This was an interesting look at an emerging pattern in the development and release cycle of modern applications.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • Tree Schema is a data catalog that is making metadata management accessible to everyone. With Tree Schema you can create your data catalog and have it fully populated in under five minutes when using one of the many automated adapters that can connect directly to your data stores. Tree Schema includes essential cataloging features such as first class support for both tabular and unstructured data, data lineage, rich text documentation, asset tagging and more. Built from the ground up with a focus on the intersection of people and data, your entire team will find it easier to foster collaboration around your data. With the most transparent pricing in the industry – $99/mo for your entire company – and a money-back guarantee for excellent service, you’ll love Tree Schema as much as you love your data. Go to pythonpodcast.com/treeschema today to get your first month free, and mention this podcast to get 50% off your first three months after the trial.
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today!
  • Your host as usual is Tobias Macey and today I’m interviewing Victor Farcic about using GitOps practices to manage your application and your infrastructure in the same workflow
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you start by giving an overview of what GitOps is?
  • What are the architectural or design elements that developers need to incorporate to make their applications work well in a GitOps workflow?
  • What are some of the tools that facilitate a GitOps approach to managing applications and their target environments?
  • What are some useful strategies for managing local developer environments to maintain parity with how production deployments are architected?
  • As developers acquire more responsibility for building the automation to provision the production environment for their applications, what are some of the operations principles that they need to understand?
  • What are some of the development principles that operators and systems administrators need to acquire to be effective in contributing to an environment that is managed by GitOps?
  • What are the areas for collaboration and dividing lines of responsibility between developers and platform engineers in a GitOps environment?
  • Beyond the application development and deployment, what are some of the additional concerns that need to be built into an application in order for it to be manageable and maintainable once it is in production?
  • What are some of the organizational principles that contribute to a successful implementation of GitOps?
  • What are some of the most interesting, innovative, or unexpected ways that you have seen GitOps employed?
  • What have you found to be the most challenging aspects of creating a scalable and maintainable GitOps practice?
  • When is GitOps the wrong choice, and what are the alternatives?
  • What resources do you recommend for anyone who wants to dig deeper into this subject?
Keep In Touch
Picks
Links

Threading The Needle Of Interesting And Informative While You Learn To Code | 06 Oct 2020 | 00:56:30
Summary

Learning to code is a never-ending journey, which is why it’s important to find a way to stay motivated. A common refrain is to just find a project that you’re interested in building and use that goal to keep you on track. The problem with that advice is that as a new programmer, you don’t have the knowledge required to know which projects are reasonable, which are difficult, and which are effectively impossible. Steven Lott has been sharing his programming expertise as a consultant, author, and trainer for years. In this episode he shares his insights on how to keep readers, students, and colleagues interested enough to learn the fundamentals without losing sight of the long term gains. He also uses his own difficulties in learning to maintain, repair, and captain his sailboat as relatable examples of the learning process and how the lessons he has learned can be translated to the process of learning a new technology or skill. This was a great conversation about the various aspects of how to learn, how to stay motivated, and how to help newcomers bridge the gap between what they want to create and what is within their grasp.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • This portion of Python Podcast is brought to you by Datadog. Do you have an app in production that is slower than you like? Is its performance all over the place (sometimes fast, sometimes slow)? Do you know why? With Datadog, you will. You can troubleshoot your app’s performance with Datadog’s end-to-end tracing and in one click correlate those Python traces with related logs and metrics. Use their detailed flame graphs to identify bottlenecks and latency in that app of yours. Start tracking the performance of your apps with a free trial at pythonpodcast.com/datadog. If you sign up for a trial and install the agent, Datadog will send you a free t-shirt.
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today!
  • Your host as usual is Tobias Macey and today I’m interviewing Steven F. Lott about finding a project that you care about to aid in learning to program
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you start by outlining your experiences working with and teaching Python?
  • Does your day-to-day experience at work suggest ways to help newcomers learn about Python?
  • How have your experiences as an author influenced your perspective on how to help newcomers become motivated to learn programming?
  • One of the common pieces of advice that I and others have given to people learning Python or other languages is to find a project that they want to build, but that’s not necessarily a practical approach. What are some of the difficulties that might come of that approach?
    • What are some strategies that you have tried for helping learners identify what kinds of projects are possible and practical?
  • Beyond the difficulty of understanding what is possible and what is going to require a dedicated team of engineers to even attempt, there is the question of remaining motivated for long enough to follow through on a project in the face of syntax errors and design challenges. What can language developers and ecosystems do to improve the newcomer experience in exploring possibilities?
    • How can we make syntax errors educational and recoverable, rather than needing accrued knowledge, or hours of web searches?
  • As an author, there are complementary goals that may lead to conflict in the form of wanting to provide structured guidance and progression while allowing for creativity and experimentation. How have you approached those objectives in your books?
  • What are some of the projects that have motivated you to learn new skills?
  • What advice do you have for anyone who is working on or considering writing a book to teach a technical skill?
  • What advice do you have for anyone who is trying to learn programming or acquire a skill in a new language, platform, or framework?
  • Why are both of your movie picks black and white? Are you a film noir fan?
Keep In Touch
Picks
Closing Announcements
  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
Links

Solving Python Package Creation For End User Applications With PyOxidizer | 29 Sep 2020 | 00:49:40
Summary

Python is a powerful and expressive programming language with a vast ecosystem of incredible applications. Unfortunately, it has always been challenging to share those applications with non-technical end users. Gregory Szorc set out to solve the problem of how to put your code on someone else’s computer and have it run without having to rely on extra systems such as virtualenvs or Docker. In this episode he shares his work on PyOxidizer and how it allows you to build a self-contained Python runtime along with statically linked dependencies and the software that you want to run. He also digs into some of the edge cases in the Python language and its ecosystem that make this a challenging problem to solve, and some of the lessons that he has learned in the process. PyOxidizer is an exciting step forward in the evolution of packaging and distribution for the Python language and community.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • This portion of Python Podcast is brought to you by Datadog. Do you have an app in production that is slower than you like? Is its performance all over the place (sometimes fast, sometimes slow)? Do you know why? With Datadog, you will. You can troubleshoot your app’s performance with Datadog’s end-to-end tracing and in one click correlate those Python traces with related logs and metrics. Use their detailed flame graphs to identify bottlenecks and latency in that app of yours. Start tracking the performance of your apps with a free trial at pythonpodcast.com/datadog. If you sign up for a trial and install the agent, Datadog will send you a free t-shirt.
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today!
  • Your host as usual is Tobias Macey and today I’m interviewing Gregory Szorc about his work on PyOxidizer, a revolutionary new approach to building and distributing self-contained Python applications
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you start by giving an overview on the shortcomings of the current state of the art for distributing Python projects, both for deployment and end-user consumption?
  • What is PyOxidizer and what motivated you to create it?
  • How does PyOxidizer differ from projects such as CxFreeze, Py2Exe, or Shiv?
  • What are the characteristics of CPython and the packaging ecosystem that make it so challenging to easily distribute self-contained applications?
  • For someone using PyOxidizer, what is their workflow for building an executable that they can share with end users?
    • What are some of the edge cases or special considerations that they need to be aware of?
  • How is PyOxidizer implemented?
    • How has the design or direction evolved since you first began working on it?
  • From your experience in working on PyOxidizer, what changes would you like to see in the Python language or the CPython reference implementation?
  • What are some of the most interesting, unexpected, or challenging lessons that you have learned while working on PyOxidizer?
  • What do you have planned for the future of PyOxidizer?
  • What are the ways that listeners can contribute to PyOxidizer?
Keep In Touch
Picks
Closing Announcements
  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
Links

Flexible Network Security Detection And Response With Grapl | 22 Sep 2020 | 00:53:32
Summary

Servers and services that have any exposure to the public internet are under a constant barrage of attacks. Network security engineers are tasked with discovering and addressing any potential breaches to their systems, which is a never-ending task as attackers continually evolve their tactics. In order to gain better visibility into complex exploits, Colin O’Brien built the Grapl platform, using graph database technology to more easily discover relationships between activities within and across servers. In this episode he shares his motivations for creating a new system to discover potential security breaches, how its design simplifies the work of identifying complex attacks without relying on brittle rules, and how you can start using it to monitor your own systems today.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • This portion of Python Podcast is brought to you by Datadog. Do you have an app in production that is slower than you like? Is its performance all over the place (sometimes fast, sometimes slow)? Do you know why? With Datadog, you will. You can troubleshoot your app’s performance with Datadog’s end-to-end tracing and in one click correlate those Python traces with related logs and metrics. Use their detailed flame graphs to identify bottlenecks and latency in that app of yours. Start tracking the performance of your apps with a free trial at pythonpodcast.com/datadog. If you sign up for a trial and install the agent, Datadog will send you a free t-shirt.
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today!
  • Your host as usual is Tobias Macey and today I’m interviewing Colin O’Brien about Grapl, an open source platform for detection and response of system security incidents
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing what Grapl is and the problem that you are trying to solve with it?
    • What was your original motivation to create it?
  • What were the existing options for security detection and response, and how is Grapl differentiated from them?
  • Who is the target audience for the Grapl project?
  • How is the Grapl system architected?
    • How has the design of the system evolved since you first began working on it?
    • How much effort would it be to separate the Grapl architecture from AWS to migrate it to other environments?
  • What have you found to be the benefits of splitting the implementation of the system between Rust for the system and Python for the exploration?
    • What challenges have you faced as a result of working across those languages?
  • What data sources does Grapl use to build its graph of events within a system?
  • Can you talk through the overall workflow for someone using Grapl?
  • What are some examples of the types of exploits that you can identify with Grapl?
  • What are some of the most interesting, unexpected, or innovative ways that you have seen Grapl used?
  • What are some of the most interesting, unexpected, or challenging lessons that you have learned while building it?
  • When is Grapl the wrong choice?
  • What do you have planned for the future of Grapl?
Keep In Touch
Picks
Closing Announcements
  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
Links

Simplified Data Extraction And Analysis For Current Events With Newspaper | 15 Sep 2020 | 00:43:28
Summary

News media is an important source of information for understanding the context of the world. To make it easier to access and process the contents of news sites, Lucas Ou-Yang built the Newspaper library, which aids in the automatic retrieval of articles and prepares them for analysis. In this episode he shares how the project got started, how it is implemented, and how you can get started with it today. He also discusses how recent improvements in the utility and ease of use of deep learning libraries open new possibilities for future iterations of the project.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • This portion of Python Podcast is brought to you by Datadog. Do you have an app in production that is slower than you like? Is its performance all over the place (sometimes fast, sometimes slow)? Do you know why? With Datadog, you will. You can troubleshoot your app’s performance with Datadog’s end-to-end tracing and in one click correlate those Python traces with related logs and metrics. Use their detailed flame graphs to identify bottlenecks and latency in that app of yours. Start tracking the performance of your apps with a free trial at pythonpodcast.com/datadog. If you sign up for a trial and install the agent, Datadog will send you a free t-shirt.
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today!
  • Your host as usual is Tobias Macey and today I’m interviewing Lucas Ou-Yang about Newspaper, a framework for easily extracting and processing online articles.
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing what the Newspaper project is and your motivations for creating it?
  • What are the main use cases that Newspaper is built for?
    • What are some libraries or tools that Newspaper might replace?
  • What are the common structures in news sites that allow you to abstract across them for content extraction?
    • What are some ways of determining whether a site will be a good candidate for using with Newspaper?
  • Can you talk through the developer workflow of someone using Newspaper?
    • What are some of the other libraries or tools that are commonly used alongside Newspaper?
  • How is Newspaper implemented?
    • How has the design of the project evolved since you first began working on it?
    • What are some of the most complex or challenging aspects of building an automated article extraction tool?
  • What are some of the most interesting, unexpected, or innovative projects that you have seen built with Newspaper?
  • What keeps you interested in the ongoing support and maintenance of the project?
  • What do you have planned for the future of Newspaper?
Keep In Touch
Picks
Closing Announcements
  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
Links

Digging Into Dagster: An Opinionated Open Source Framework For Data Orchestration | 07 Sep 2020 | 00:59:28
Summary

Data applications are complex and continually evolving, often requiring collaboration across multiple teams. In order to keep everyone on the same page a high level abstraction is needed to facilitate a cross-cutting view of the data orchestration across integration, transformation, analytics, and machine learning. Dagster is an innovative new framework that leans on the power and flexibility of Python to provide an extensible interface to the complete lifecycle of data projects. In this episode Nick Schrock explains how he designed the Dagster project to allow for integration with the entire data ecosystem while providing an opinionated structure for connecting the different stages of computation. He also discusses how he is working to grow an open ecosystem around the Dagster project, and his thoughts on building a sustainable business on top of it without compromising the integrity of the community. This was a great conversation about playing the long game when building a business while providing a valuable utility to a complex problem domain.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • This portion of Python Podcast is brought to you by Datadog. Do you have an app in production that is slower than you like? Is its performance all over the place (sometimes fast, sometimes slow)? Do you know why? With Datadog, you will. You can troubleshoot your app’s performance with Datadog’s end-to-end tracing and in one click correlate those Python traces with related logs and metrics. Use their detailed flame graphs to identify bottlenecks and latency in that app of yours. Start tracking the performance of your apps with a free trial at pythonpodcast.com/datadog. If you sign up for a trial and install the agent, Datadog will send you a free t-shirt.
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today!
  • Your host as usual is Tobias Macey and today I’m interviewing Nick Schrock about Dagster, an open source data orchestrator for powering data engineering, analytics, and machine learning
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing what Dagster is and how it got started?
  • What are the most common difficulties that organizations face when working with data projects?
    • How does Dagster help in addressing those challenges?
  • There are a number of workflow orchestration platforms, spanning a few generations of tooling. What do you see as the defining characteristics of the various options, and how does Dagster fit in that ecosystem?
  • What are the assumptions that you made at the start of building Dagster and how have they been challenged, updated, or invalidated over the past year of working with end users?
  • How are the internals of Dagster implemented?
    • How has the design changed or evolved since you first began working on it?
  • For someone who is building on top of Dagster, what is their workflow from first steps through to production?
  • What are your guiding principles for designing the user facing API?
  • What are the available extension points for Dagster?
  • What was your reason for implementing Dagster as a Python framework?
    • With the benefit of hindsight, would you make the same decision today?
  • What are some of the most interesting, innovative, or unexpected ways that you have seen Dagster used?
  • What are the most interesting, unexpected, or challenging lessons that you have learned while building Dagster and working to grow its ecosystem?
  • When is Dagster the wrong choice?
  • As you continue to build Dagster, what is your vision for it and its ecosystem?
    • What are the next steps that you are taking to achieve that vision?
Keep In Touch
Picks
  • Tobias
  • Nick
Closing Announcements
  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
Links

Standardizing On Python For All Software Projects At Ascend.io | 13 Sep 2022 | 00:50:26
Summary

Every software project is subject to a series of decisions and tradeoffs. One of the first decisions to make is which programming language to use. For companies whose product is software, this is a decision that can have a significant impact on their overall success. In this episode Sean Knapp discusses the languages that his team at Ascend uses for building a service that powers complex and business critical data workflows. He also explains his motivation to standardize on Python for all layers of their system to improve developer productivity.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. And now you can launch a managed MySQL, Postgres, or Mongo database cluster in minutes to keep your critical data safe with automated backups and failover. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • Your host as usual is Tobias Macey and today I’m interviewing Sean Knapp about his motivations and experiences standardizing on Python for development at Ascend.
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you describe what Ascend is and the story behind it?
  • How many engineers work at Ascend?
    • What are their different areas of focus?
  • What are your policies for selecting which technologies (e.g. languages, frameworks, dev tooling, deployment, etc.) are supported at Ascend?
    • What does it mean for a technology to be supported?
  • You recently started standardizing on Python as the default language for development. How has Python been used up to now?
    • What other languages are in common use at Ascend?
    • What are some of the challenges/difficulties that motivated you to establish this policy?
  • What are some of the tradeoffs that you have seen in the adoption of Python in place of your other adopted languages?
    • How are you managing ongoing maintenance of projects/products that are not written in Python?
  • What are some of the potential pitfalls/risks that you are guarding against in your investment in Python?
  • What are the most interesting, innovative, or unexpected ways that you have seen Python used where it was previously a different technology?
  • What are the most interesting, unexpected, or challenging lessons that you have learned while working on aligning all of your development on a single language?
  • When is Python the wrong choice?
  • What do you have planned for the future of engineering practices at Ascend?
Keep In Touch
Picks
Closing Announcements
  • Thank you for listening! Don’t forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. The Machine Learning Podcast helps you go from idea to production with machine learning.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
Links

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA

When, Why, and How To Use Web Scraping In A Nutshell | 01 Sep 2020 | 00:41:52
Summary

The internet is a rich source of information, but a majority of it isn’t accessible programmatically through APIs or databases. To address that shortcoming there are a variety of web scraping frameworks that aid in extracting structured data from web pages. In this episode Attila Tóth shares the challenges of web data extraction, the ways that you can use it, and how Scrapy and ScrapingHub can help you with your projects.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • This portion of Python Podcast is brought to you by Datadog. Do you have an app in production that is slower than you like? Is its performance all over the place (sometimes fast, sometimes slow)? Do you know why? With Datadog, you will. You can troubleshoot your app’s performance with Datadog’s end-to-end tracing and in one click correlate those Python traces with related logs and metrics. Use their detailed flame graphs to identify bottlenecks and latency in that app of yours. Start tracking the performance of your apps with a free trial at datadog.com/pythonpodcast. If you sign up for a trial and install the agent, Datadog will send you a free t-shirt.
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today!
  • Your host as usual is Tobias Macey and today I’m interviewing Attila Tóth about doing data extraction with web scraping.
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you start by explaining what web scraping is and when you might want to use it?
    • How did you first get started with web scraping?
  • There are a number of options for web scraping tools in Python, as well as other languages. What are the characteristics of the Scrapy project and community that have made it stand out and retain such widespread popularity?
  • One of the perpetual questions with web scraping is that of copyright and content ownership. What should we all be aware of when scraping a given website?
  • What are some of the most challenging aspects of crawling and scraping the web?
    • What are some of the features of Scrapy that aid in those challenges?
  • Once you have retrieved the content from a site, what are some of the considerations for storing and processing the data that we should be thinking about?
  • How can we guard against a scraper breaking due to changes in the layout of a site, or simple updates that weren’t accounted for in the initial implementation?
  • What are some of the most complicated aspects of scaling web scrapers?
  • For someone who is interested in using Scrapy, what are some of the common pitfalls that they should be aware of?
  • What are some of the most interesting, innovative, or unexpected projects that are built with Scrapy and ScrapingHub?
  • What are the most interesting, unexpected, or challenging lessons that you have learned while working with web scrapers and ScrapingHub?
  • What resources would you recommend to anyone who is looking to learn more about web scraping?
Keep In Touch
Picks
Closing Announcements
  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
Links

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA

Working In The Code Mines: Mining Software Repositories With PyDriller | 25 Aug 2020 | 00:40:03
Summary

A large portion of the software industry has standardized on Git as the version control system of choice. But have you thought about all of the information that you are generating with your branches, commits, and code changes? Davide Spadini created the PyDriller framework to simplify the work of mining software repositories to perform research on the technical and social aspects of software engineering. In this episode he shares some of the insights that you can gain by exploring the history of your code, the complexities of building a framework to interact with Git, and some of the interesting ways that PyDriller can be used to inform your own development practices.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today!
  • Your host as usual is Tobias Macey and today I’m interviewing Davide Spadini about PyDriller, a framework for mining software repositories.
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing what PyDriller is and how the project got started?
    • How is PyDriller different from other Git frameworks?
  • What kinds of information can you discover by mining a software repository?
    • Where and how might the collected information be used?
  • What are the limitations of the capabilities offered by Git for investigating the repository?
  • What are the additional metrics that you are able to extract using PyDriller?
  • Can you describe how PyDriller itself is implemented?
    • How has the project evolved since you first began working on it?
  • I noticed that for testing PyDriller you crafted a set of repositories to serve as test cases. What has been the most complex or challenging aspect of writing meaningful tests to ensure a reasonable coverage of this problem domain?
  • What would be required to add support for other version control systems?
  • How have you used PyDriller in your own research?
  • What are some of the most interesting, unexpected, or innovative ways that you have seen PyDriller used?
  • What are some of the most interesting, unexpected, or challenging lessons that you have learned while working on and with PyDriller?
  • What do you have planned for the future of PyDriller?
Keep In Touch
Picks
Closing Announcements
  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
Links

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA

Building The Open Data Ecosystem For Music And More At Metabrainz | 17 Aug 2020 | 00:48:06
Summary

The Musicbrainz project was an early entry in the movement to build an open data ecosystem. In recent years, the Metabrainz Foundation has fostered a growing ecosystem of projects to support the contribution of, and access to, metadata, listening habits, and review of music. The majority of those projects are written in Python, and in this episode Param Singh explains how they are built, how they fit together, and how they support the goals of the Metabrainz Foundation. This was an interesting exploration of the work involved in building an ecosystem of open data, the challenges of making it sustainable, and the benefits of building for the long term rather than trying to achieve a quick win.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • Before you put your code into production you need to make sure that it passes all of the tests, that it has been packaged with all of the dependencies, and that you haven’t introduced any security issues. Instead of running all of that on your laptop, let Codefresh handle it automatically with their continuous integration and continuous delivery platform. Built for the modern era of cloud-native computing, they make publishing to Kubernetes, serverless platforms, and virtual machines fast and seamless. With a growing library of pre-made steps, a flexible pipeline definition, and unlimited scale Codefresh lets you ship faster and safer than ever. Go to pythonpodcast.com/codefresh today to get unlimited builds on your free account.
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today!
  • Your host as usual is Tobias Macey and today I’m interviewing Param Singh about the ways that Python is being used across the various Metabrainz projects.
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you start by giving an overview of what the Metabrainz organization is and the various projects that it encompasses?
    • What are the motivations for creating those projects and some of the origin story for Metabrainz?
  • The Musicbrainz server is the longest running project and is written in Perl. What was the reason for switching to Python for all of the other *brainz projects?
  • How does the MetaBrainz Foundation sustain itself? Where do the funds come from?
    • How do you determine where and how to allocate the funding that you receive?
  • Which of the *brainz projects is the most complex or challenging to build, whether due to technical or sociological reasons?
  • How do you source and manage the information that powers all of the Metabrainz projects?
  • How is development of the various projects organized?
    • How does that influence the amount of code sharing that is possible between them?
  • Of the projects that you have been involved in, how are they architected?
    • What are the main ways that the projects differ in how they are implemented?
  • What are some of the ways that you are using Python in support of the various projects that you work on?
  • What are some of the most interesting, innovative, or unexpected ways that you have seen the projects or data built by Metabrainz being used?
  • What are some of the most interesting, unexpected, or challenging lessons that you have learned while working as a contributor and maintainer of the Metabrainz projects?
  • What is in store for the future of the existing Metabrainz projects?
  • What are the next domains that are being considered for a Metabrainz platform?
Keep In Touch
Picks
Links

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA

Growing Dask To Make Scaling Python Data Science Easier At Coiled | 10 Aug 2020 | 00:52:07
Summary

Python is a leading choice for data science due to the immense number of libraries and frameworks readily available to support it, but it is still difficult to scale. Dask is a framework designed to transparently run your data analysis across multiple CPU cores and multiple servers. Using Dask lifts a limitation for scaling your analytical workloads, but brings with it the complexity of server administration, deployment, and security. In this episode Matthew Rocklin and Hugo Bowne-Anderson discuss their recently formed company Coiled and how they are working to make the use and maintenance of Dask in production easier. They share the goals for the business, their approach to building a profitable company based on open source, and the difficulties they face while growing a new team during a global pandemic.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • This portion of Python Podcast is brought to you by Datadog. Do you have an app in production that is slower than you like? Is its performance all over the place (sometimes fast, sometimes slow)? Do you know why? With Datadog, you will. You can troubleshoot your app’s performance with Datadog’s end-to-end tracing and in one click correlate those Python traces with related logs and metrics. Use their detailed flame graphs to identify bottlenecks and latency in that app of yours. Start tracking the performance of your apps with a free trial at datadog.com/pythonpodcast. If you sign up for a trial and install the agent, Datadog will send you a free t-shirt.
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today!
  • Your host as usual is Tobias Macey and today I’m interviewing Matthew Rocklin and Hugo Bowne-Anderson about their work building a business around the Dask ecosystem at Coiled.
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you give a quick overview of what Dask is and your motivations for creating it?
    • How has Dask changed or evolved in the past 3 1/2 years since we last talked about it?
  • How has the rest of the ecosystem changed in that time?
  • After working on Dask for the past few years, what led you to the decision to build a business around it?
  • What are the sharp edges of programming for Dask that users are looking for help on solving?
  • What are the difficulties that users face in deploying and maintaining a production installation of Dask?
  • What are the limitations of Dask when scaling both up and down?
  • What are you building at Coiled to improve the user experience for users of Python and Dask?
    • What are your thoughts on the pros and cons of orienting your messaging around the scalability of Python, as opposed to focusing on a specific industry or problem domain?
  • What are the challenges that you are facing in managing the tensions between the open source and proprietary work that you are doing?
  • How are you handling the ongoing governance of the Dask project?
  • What are some of the most interesting, unexpected, or challenging lessons that you have learned while building and launching a company based on an open source project?
  • What do you have planned for the future of both Coiled and Dask?
Keep In Touch
Picks
Closing Announcements
  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
Links

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA

Supporting The Full Lifecycle Of Machine Learning Projects With Metaflow | 04 Aug 2020 | 00:44:45
Summary

Netflix uses machine learning to power every aspect of their business. To do this effectively they have had to build extensive expertise and tooling to support their engineers. In this episode Savin Goyal discusses the work that he and his team are doing on the open source machine learning operations platform Metaflow. He shares the inspiration for building an opinionated framework for the full lifecycle of machine learning projects, how it is implemented, and how they have designed it to be extensible to allow for easy adoption by users inside and outside of Netflix. This was a great conversation about the challenges of building machine learning projects and the work being done to make it more achievable.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • This portion of Python Podcast is brought to you by Datadog. Do you have an app in production that is slower than you like? Is its performance all over the place (sometimes fast, sometimes slow)? Do you know why? With Datadog, you will. You can troubleshoot your app’s performance with Datadog’s end-to-end tracing and in one click correlate those Python traces with related logs and metrics. Use their detailed flame graphs to identify bottlenecks and latency in that app of yours. Start tracking the performance of your apps with a free trial at datadog.com/pythonpodcast. If you sign up for a trial and install the agent, Datadog will send you a free t-shirt.
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today!
  • Your host as usual is Tobias Macey and today I’m interviewing Savin Goyal about Netflix’s infrastructure for machine learning.
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing the work you are doing at Netflix to support their machine learning workloads?
  • How are you addressing the impedance mismatch of machine learning/data science work between local experimentation and production deployment?
  • What was the motivation for building Metaflow?
    • How does Metaflow compare to other tools in the ecosystem such as MLFlow?
    • What was missing in the other available tools that made Metaflow necessary?
  • What does the workflow look like for someone using Metaflow?
  • How do you approach the design of the developer interface to make it approachable to machine learning engineers?
  • What is the level of coupling between Metaflow and the overall Netflix data stack?
  • How is Metaflow implemented?
    • How has the architecture and design of the system evolved since you first began working on it?
  • What are the supporting infrastructure and integration points?
  • What were the motivations for, and benefits of, releasing it as open source?
  • What are some of the most interesting, unexpected, or challenging lessons that you have learned while building infrastructure and tooling for machine learning?
  • When is Metaflow the wrong choice?
  • What do you have planned for the future of Metaflow and
Keep In Touch
Picks
Closing Announcements
  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
Links

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA

Learning To Program By Building Tiny Python Projects | 28 Jul 2020 | 00:55:00
Summary

One of the best methods for learning programming is to just build a project and see how things work first-hand. With that in mind, Ken Youens-Clark wrote a whole book of Tiny Python Projects that you can use to get started on your journey. In this episode he shares his inspiration for the book, his thoughts on the benefits of teaching testing principles and the use of linting and formatting tools, as well as the benefits of trying variations on a working program to see how it behaves. This was a great conversation about useful strategies for supporting new programmers in their efforts to learn a valuable skill.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • This portion of Python Podcast is brought to you by Datadog. Do you have an app in production that is slower than you like? Is its performance all over the place (sometimes fast, sometimes slow)? Do you know why? With Datadog, you will. You can troubleshoot your app’s performance with Datadog’s end-to-end tracing and in one click correlate those Python traces with related logs and metrics. Use their detailed flame graphs to identify bottlenecks and latency in that app of yours. Start tracking the performance of your apps with a free trial at datadog.com/pythonpodcast. If you sign up for a trial and install the agent, Datadog will send you a free t-shirt.
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today!
  • Your host as usual is Tobias Macey and today I’m interviewing Ken Youens-Clark about his book Tiny Python Projects.
Interview
  • Introductions
  • How did you get introduced to Python?
  • What is your goal with your book of Tiny Python Projects?
    • What motivated you to start writing it?
  • Who is the target audience that you wrote the book for?
  • One of the notable aspects of the book is the fact that you introduce linting and testing in the first chapter. Why is that a useful subject for the first steps of someone getting started in Python?
    • What are some of the problems that users experience if they are introduced to these tools after they have already established a set of habits?
  • How did you approach the structure of the book to be approachable by newcomers to Python?
  • What was your process for deciding on the scope of the information to include in the book?
  • What are some of the challenges that you faced in identifying self-contained projects that could fit into a single chapter?
  • As a book that is intended to serve as a learning resource, what was your process for soliciting feedback to determine if your tone and structure is effective in teaching the reader?
  • What elements of the Python language and ecosystem did you consciously leave out to avoid overwhelming the readers?
  • What are some of the most interesting, unexpected, or challenging lessons that you learned while working on the book?
  • What are your thoughts on useful resources and next steps for readers who are interested in progressing in their use of Python?
Keep In Touch
Picks
Closing Announcements
  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
Links

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA

Idiomatic Functional Programming With DRY Python | 21 Jul 2020 | 00:47:43
Summary

Python is an intuitive and flexible language, but that versatility can also lead to problematic designs if you’re not careful. Nikita Sobolev is the CTO of Wemake Services where he works on open source projects that encourage clean coding practices and maintainable architectures. In this episode he discusses his work on the DRY Python set of libraries and how they provide an accessible interface to functional programming patterns while maintaining an idiomatic Python interface. He also shares the story behind the wemake Python styleguide plugin for Flake8 and the benefits of strict linting rules to engender good development habits. This was a great conversation about useful practices to build software that will be easy and fun to work on.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • This portion of Python Podcast is brought to you by Datadog. Do you have an app in production that is slower than you like? Is its performance all over the place (sometimes fast, sometimes slow)? Do you know why? With Datadog, you will. You can troubleshoot your app’s performance with Datadog’s end-to-end tracing and in one click correlate those Python traces with related logs and metrics. Use their detailed flame graphs to identify bottlenecks and latency in that app of yours. Start tracking the performance of your apps with a free trial at datadog.com/pythonpodcast. If you sign up for a trial and install the agent, Datadog will send you a free t-shirt.
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today!
  • Your host as usual is Tobias Macey and today I’m interviewing Nikita Sobolev about his work with DRY Python and Wemake Services
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you start by sharing your overarching philosophies or design aesthetics for writing maintainable software?
  • What is your process for starting a new project, beginning at the design phase?
  • What are some of the challenges or shortcomings that you see in the "default" way that most developers write Python?
  • What is DRY Python and how does it help in addressing those concerns?
    • What was your motivation for creating these projects?
  • There are a number of different projects that are being built under the DRY Python umbrella. Can you list the ones that are currently active and outline how they fit together?
  • What are some of the initial challenges that newcomers to the DRY Python libraries encounter?
  • How do you approach the design of the API and developer experience to make these development approaches more accessible?
  • What have you seen in terms of real world impact on the maintainability and extensibility of projects that you have built on top of the DRY Python components?
  • In addition to DRY Python you are also involved with development of the wemake-python-styleguide. Can you describe that project’s goal and how it got started?
    • If you make the linting too restrictive then developers are likely to just ignore or disable it. What have you found to be the right balance between rules that fail a build and rules that are just informational?
    • Why do you push the responsibility for things like formatting onto the developer, rather than an autoformatter such as YAPF or Black?
  • What are some of the other supporting technologies that you rely on during your development workflow?
  • What are some of the elements that you think are missing in the common toolbox for Python developers?
    • What tools are we lacking entirely?
  • What are the cases where DRY Python is the wrong choice?
  • What are your goals and plans for the future of DRY Python and the various Wemake libraries?
Keep In Touch
Picks
Closing Announcements
  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

The Past, Present, And Future Of The FLUFL: Barry Warsaw Shares His History With Python | 13 Jul 2020 | 00:51:40
Summary

Barry Warsaw has been a member of the Python community since the very beginning. His contributions to the growth of the language and its ecosystem are innumerable and diverse, earning him the title of Friendly Language Uncle For Life. In this episode he reminisces on his experiences as a core developer, a member of the Python Steering Council, and his roles at Canonical and LinkedIn supporting the use of Python at those companies. In order to know where you are going it is always important to understand where you have been, and this was a great conversation to get a sense of the history of how Python has gotten to where it is today.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • This episode of Python Podcast is brought to you by Datadog. Do you have an app in production that is slower than you like? Is its performance all over the place (sometimes fast, sometimes slow)? Do you know why? With Datadog, you will. You can troubleshoot your app’s performance with Datadog’s end-to-end tracing and in one click correlate those Python traces with related logs and metrics. Use their detailed flame graphs to identify bottlenecks and latency in that app of yours. Start tracking the performance of your apps with a free trial at datadog.com/pythonpodcast. If you sign up for a trial and install the agent, Datadog will send you a free t-shirt.
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today!
  • Your host as usual is Tobias Macey and today I’m interviewing Barry Warsaw about his role in the Python community, past, present, and future.
Interview
  • Introductions
  • How did you get introduced to Python?
  • For anyone who isn’t familiar with you, how would you characterize your role in the Python language and community?
  • What have been your main areas of focus in your role as a core developer?
    • What are some of the other forms that your contributions to the language and community have taken?
  • What are the contributions to Python that you are most proud of?
  • Looking back at the past 25 years of Python, what do you find most interesting/surprising/exciting?
  • How has the focus of the community changed or evolved since you first began using it?
  • What are you currently focused on in your role in the steering council?
  • What are the aspects of the language and community that you think need greater attention?
  • What are the core strengths of the language and community that you believe will carry it through the next 25 years?
  • In your current and previous roles you acted as a guiding force for Python. What are the main use cases for Python at LinkedIn?
    • What kinds of projects are you involved with to support the other engineers in their use of Python?
  • How much of an impact has the invisible hand of the PSU had on the overall trajectory of Python?
  • Outside of Python, what are the programming languages or communities that you look to for inspiration?
  • What are your personal goals for the future of Python?
Keep In Touch
Picks
Closing Announcements
  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Pure Python Configuration Management With PyInfra | 06 Jul 2020 | 00:43:09
Summary

Building and managing servers is a challenging task. Configuration management tools provide a framework for handling the various tasks involved, but many of them require learning a specific syntax and toolchain. PyInfra is a configuration management framework that embraces the familiarity of pure Python, allowing you to build your own integrations easily and package it all up using the same tools that you rely on for your applications. In this episode Nick Barrett explains why he built it, how it is implemented, and the ways that you can start using it today. He also shares his vision for the future of the project and how you can get involved. If you are tired of writing mountains of YAML to set up your servers then give PyInfra a try today.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • This portion of Podcast.__init__ is brought to you by Datadog. Do you have an app in production that is slower than you like? Is its performance all over the place (sometimes fast, sometimes slow)? Do you know why? With Datadog, you will. You can troubleshoot your app’s performance with Datadog’s end-to-end tracing and in one click correlate those Python traces with related logs and metrics. Use their detailed flame graphs to identify bottlenecks and latency in that app of yours. Start tracking the performance of your apps with a free trial at datadog.com/pythonpodcast. If you sign up for a trial and install the agent, Datadog will send you a free t-shirt.
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today!
  • Your host as usual is Tobias Macey and today I’m interviewing Nick Barrett about PyInfra, a pure Python framework for agentless configuration management
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing what PyInfra is and its origin story?
  • There are a number of options for configuration management of various levels of complexity and language options. What are the features of PyInfra that might lead someone to choose it over other systems?
  • What do you see as the major pain points in dealing with infrastructure today?
  • For someone who is using PyInfra to manage their servers, what is the workflow for building and testing deployments?
  • How do you handle enforcement of idempotency in the operations being performed?
  • Can you describe how PyInfra is implemented?
    • How has its design or focus evolved since you first began working on it?
    • What are some of the initial assumptions that you had at the outset which have been challenged or updated as it has grown?
  • The library of available operations seems to have a good baseline for deploying and managing services. What is involved in extending or adding operations to PyInfra?
  • With the focus of the project being on its use of pure Python and the easy integration of external libraries, how do you handle execution of Python functions on remote hosts that require external dependencies?
  • What are some of the other options for interfacing with or extending PyInfra?
  • What are some of the edge cases or points of confusion that users of PyInfra should be aware of?
  • What has been the community response from developers who first encounter and trial PyInfra?
  • What have you found to be the most interesting, unexpected, or challenging aspects of building and maintaining PyInfra?
  • When is PyInfra the wrong choice for managing infrastructure?
  • What do you have planned for the future of the project?
Keep In Touch
Picks
Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Build Your Own Domain Specific Language in Python With textX | 30 Jun 2020 | 00:54:18
Summary

Programming languages are a powerful tool and can be used to create all manner of applications; however, sometimes their syntax is more cumbersome than necessary. For some industries or subject areas there is already an agreed upon set of concepts that can be used to express your logic. For those cases you can create a Domain Specific Language, or DSL, to make it easier to write programs that can express the necessary logic with a custom syntax. In this episode Igor Dejanović shares his work on textX and how you can use it to build your own DSLs with Python. He explains his motivations for creating it, how it compares to other tools in the Python ecosystem for building parsers, and how you can use it to build your own custom languages.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today!
  • Your host as usual is Tobias Macey and today I’m interviewing Igor Dejanović about textX, a meta-language for building domain specific languages in Python.
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing what a domain specific language is and some examples of when you might need one?
  • What is textX and what was your motivation for creating it?
  • There are a number of other libraries in the Python ecosystem for building parsers, and for creating DSLs. What are the features of textX that might lead someone to choose it over the other options?
  • What are some of the challenges that face language designers when constructing the syntax of their DSL?
  • Beyond being able to parse and process an arbitrary syntax, there are other concerns for consumers of the definition in terms of tooling. How does textX provide support to those end users?
  • How is textX implemented?
    • How has the design or goals of textX changed since you first began working on it?
  • What is the workflow for someone using textX to build their own DSL?
    • Once they have defined the grammar, how do they distribute the generated interpreter for others to use?
  • What are some of the common challenges that users of textX face when trying to define their DSL?
  • What are some of the cases where a PEG parser is unable to unambiguously process a defined grammar?
  • What are some of the most interesting/innovative/unexpected ways that you have seen textX used?
  • What have you found to be the most interesting, unexpected, or challenging lessons that you have learned while building and maintaining textX and its associated projects?
  • While preparing for this interview I noticed that you have another parser library in the form of Parglare. How has your experience working with textX informed your designs of that project?
    • What lessons have you taken back from Parglare into textX?
  • When is textX the wrong choice, and someone might be better served by another DSL library, different style of parser, or just hand-crafting a simple parser with a regex?
  • What do you have planned for the future of textX?
Keep In Touch
Picks
Closing Announcements
  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Exploring The Process And Practice Of Building Better Software Through Code Reviews | 05 Sep 2022 | 00:57:24
Summary

Writing code is only one piece of creating good software. Code reviews are an important step in the process of building applications that are maintainable and sustainable. In this episode On Freund shares his thoughts on the myriad purposes that code reviews serve, as well as exploring some of the patterns and anti-patterns that grow up around a seemingly simple process.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. And now you can launch a managed MySQL, Postgres, or Mongo database cluster in minutes to keep your critical data safe with automated backups and failover. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • Your host as usual is Tobias Macey and today I’m interviewing On Freund about the intricacies and importance of code reviews
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you start by giving us your description of what a code review is?
    • What is the purpose of the code review?
  • At face value a code review appears to be a simple task. What are some of the subtleties that become evident with time and experience?
  • What are some of the ways that code reviews can go wrong?
  • What are some common anti-patterns that get applied to code reviews?
  • What are the elements of code review that are useful to automate?
    • What are some of the risks/bad habits that can result from overdoing automated checks/fixes or over-reliance on those tools in code reviews?
  • Identifying who can/should do a review for a piece of code
  • How to use code reviews as a teaching tool for new/junior engineers
  • How to use code reviews for avoiding siloed experience/promoting cross-training
  • PR templates for capturing relevant context
  • What are the most interesting, innovative, or unexpected ways that you have seen code reviews used?
  • What are the most interesting, unexpected, or challenging lessons that you have learned while leading and supporting engineering teams?
  • What are some resources that you recommend for anyone who wants to learn more about code review strategies and how to use them to scale their teams?
Keep In Touch
Picks
Closing Announcements
  • Thank you for listening! Don’t forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. The Machine Learning Podcast helps you go from idea to production with machine learning.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Adding Observability To Your Python Applications With OpenTelemetry | 23 Jun 2020 | 00:53:44
Summary

Once you release an application into production it can be difficult to understand all of the ways that it is interacting with the systems that it integrates with. The OpenTelemetry project and its accompanying ecosystem of technologies aim to make observability of your systems more accessible. In this episode Austin Parker and Alex Boten explain how the correlation of tracing and metrics collection improves visibility of how your software is behaving, how you can use the Python SDK to automatically instrument your applications, and their vision for the future of observability as the OpenTelemetry standard gains broader adoption.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • Your host as usual is Tobias Macey and today I’m interviewing Austin Parker and Alex Boten about the OpenTelemetry project and its efforts to standardize the collection and analysis of observability data for your applications
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing what OpenTelemetry is and some of the story behind it?
  • How do you define observability and in what ways is it separate from the "traditional" approach to monitoring?
  • What are the goals of the OpenTelemetry project?
  • For someone who wants to begin using OpenTelemetry clients in their Python application, what is the process of integrating it into their application?
  • How does the definition and adoption of a cross-language standard for telemetry data benefit the broader software community?
    • How do you avoid the trap of limiting the whole ecosystem to the lowest common denominator?
  • What types of information are you focused on collecting and analyzing to gain insights into the behavior of applications and systems?
    • What are some of the challenges that are commonly faced in interpreting the collected data?
  • With so many implementations of the specification, how are you addressing issues of feature parity?
  • For the Python SDK, how is it implemented?
    • What are some of the initial designs or assumptions that have had to be revised or reconsidered as it gains adoption?
  • What is your approach to integration with the broader ecosystem of tools and frameworks in the Python community?
  • What are some of the interesting or unexpected challenges that you have faced or lessons that you have learned while working on instrumentation of Python projects?
  • Once an application is instrumented, what are the options for delivering and storing the collected data?
  • What are some of the most interesting, unexpected, or challenging lessons that you have learned while working on and with the OpenTelemetry ecosystem?
  • What are some of the most interesting, innovative, or unexpected ways that you have seen components in the OpenTelemetry ecosystem used?
  • When is OpenTelemetry the wrong choice?
  • What is in store for the future of the OpenTelemetry project?
Keep In Touch
Picks
Closing Announcements
  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Build A Personal Knowledge Store With Topic Modeling In Contextualize | 15 Jun 2020 | 00:58:06
Summary

Our thought patterns are rarely linear or hierarchical, instead following threads of related topics in unpredictable directions. Topic modeling is an approach to knowledge management which allows for forming a graph of associations to make capturing and organizing your thoughts more natural. In this episode Brett Kromkamp shares his work on the Contextualize project and how you can use it for building your own topic models. He explains why he wrote a new topic modeling engine, how it is architected, and how it compares to other systems for organizing information. Once you are done listening you can take Contextualize for a test run for free with his hosted instance.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • Your host as usual is Tobias Macey and today I’m interviewing Brett Kromkamp about Contextualise, a topic modeling application that helps you build a mind map for information-heavy projects
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing what Contextualize is and some of the types of projects that it can be used for?
    • What was your motivation for creating it?
  • How do you use topic maps in your own work and creative endeavors?
  • The space of personal note-taking and knowledge management is vast and varied. What does Contextualize do well that you have been unable to find or implement in other tools?
  • For someone using Contextualize, what does that workflow look like?
  • How are you approaching integration with different creative contexts (e.g. text editors, graphics editors, word processing, etc.)?
  • Can you describe how Contextualize is implemented?
    • How has the design evolved since you first began working on it?
  • In the documentation for Contextualize it mentions that this is the latest in a string of topic mapping platforms that you have built. What are some of the lessons that you have learned from previous efforts that have influenced the design of this one?
  • One of the challenges with many knowledge management tools is that they are prescriptive in how to work with them. In what ways has your own preference for how to interact with information influenced the direction of Contextualize?
    • Being an open source application, how has its exposure to the public directed your software and user design?
  • How do you approach the challenge of reducing friction in adding content and relations while allowing for flexibility and context management?
  • What are some of the projects that you are using Contextualize for?
  • What are your thoughts on the utility of something like Contextualize for capturing and organizing the collective knowledge of a team of collaborators, whether in a work or casual context?
  • What have you found to be the most interesting, complex, or complicated aspects of building a topic mapping platform?
  • When is Contextualize the wrong choice?
  • What do you have planned for the future of the project?
Keep In Touch
Picks
Closing Announcements
  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
Links

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA

Open Source Product Analytics With PostHog | 08 Jun 2020 | 00:49:09
Summary

You spend a lot of time and energy on building a great application, but do you know how it’s actually being used? Using a product analytics tool lets you gain visibility into what your users find helpful so that you can prioritize feature development and optimize customer experience. In this episode PostHog CTO Tim Glaser shares his experience building an open source product analytics platform to make it easier and more accessible to understand your product. He shares the story of how and why PostHog was created, how to incorporate it into your projects, the benefits of providing it as open source, and how it is implemented. If you are tired of fighting with your user analytics tools, or unwilling to entrust your data to a third party, then have a listen and then test out PostHog for yourself.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • You listen to this show because you love Python and want to keep your skills up to date, and machine learning is finding its way into every aspect of software engineering. Springboard has partnered with us to help you take the next step in your career by offering a scholarship to their Machine Learning Engineering career track program. In this online, project-based course every student is paired with a Machine Learning expert who provides unlimited 1:1 mentorship support throughout the program via video conferences. You’ll build up your portfolio of machine learning projects and gain hands-on experience in writing machine learning algorithms, deploying models into production, and managing the lifecycle of a deep learning prototype. Springboard offers a job guarantee, meaning that you don’t have to pay for the program until you get a job in the space. Podcast.__init__ is exclusively offering listeners 20 scholarships of $500 to eligible applicants. It only takes 10 minutes and there’s no obligation. Go to pythonpodcast.com/springboard and apply today! Make sure to use the code AISPRINGBOARD when you enroll.
  • Your host as usual is Tobias Macey and today I’m interviewing Tim Glaser about PostHog, an open source platform for product analytics
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing what PostHog is and what motivated you to build it?
  • What are the goals of PostHog and who are the target audience?
  • In the description of PostHog it mentions being a product focused analytics platform, as opposed to session based. What are the meaningful differences between the two?
  • Customer analytics is a rather crowded market, with a large number of both commercial and open source offerings (e.g. Google Analytics, Heap, Matomo, Snowplow, etc.). How does PostHog fit in that landscape and what are the differentiating factors that would lead someone to select it over the alternatives?
  • For anyone interested in using PostHog, do you offer a migration path from other platforms?
  • necessary features for a customer analytics tool
  • privacy and security issues around analytics
  • How is PostHog implemented and how has its design evolved since you first began building it?
    • reason for choosing Python
    • benefits of Django
  • thoughts on introducing Channels
  • option to include it as a pluggable Django app
  • integration points
  • data lake integration
  • challenges of providing understandable statistics and exposing options for detailed analysis
  • Having data about how users are interacting with your site or application is interesting, but how does it help in determining the useful actions to drive success?
  • business model and project governance
  • What are the most complex, complicated, or misunderstood aspects of building a product analytics platform?
  • What have you found to be the most interesting, unexpected, or challenging lessons that you have learned in the process of building PostHog?
  • When is PostHog the wrong choice?
  • What do you have planned for the future of PostHog?
Keep In Touch
Picks
Closing Announcements
  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
Links

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA

Extending The Life Of Python 2 Projects With Tauthon | 02 Jun 2020 | 00:33:08
Summary

The divide between Python 2 and 3 lasted a long time, and in recent years all of the new features were added to version 3. To help bridge the gap and extend the viability of version 2 Naftali Harris created Tauthon, a fork of Python 2 that backports features from Python 3. In this episode he explains his motivation for creating it, the process of maintaining it and backporting features, and the ways that it is being used by developers who are unable to make the leap. This was an interesting look at how things might have been if the elusive Python 2.8 had been created as a more gentle transition.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • You listen to this show because you love Python and want to keep your skills up to date, and machine learning is finding its way into every aspect of software engineering. Springboard has partnered with us to help you take the next step in your career by offering a scholarship to their Machine Learning Engineering career track program. In this online, project-based course every student is paired with a Machine Learning expert who provides unlimited 1:1 mentorship support throughout the program via video conferences. You’ll build up your portfolio of machine learning projects and gain hands-on experience in writing machine learning algorithms, deploying models into production, and managing the lifecycle of a deep learning prototype. Springboard offers a job guarantee, meaning that you don’t have to pay for the program until you get a job in the space. Podcast.__init__ is exclusively offering listeners 20 scholarships of $500 to eligible applicants. It only takes 10 minutes and there’s no obligation. Go to pythonpodcast.com/springboard and apply today! Make sure to use the code AISPRINGBOARD when you enroll.
  • Your host as usual is Tobias Macey and today I’m interviewing Naftali Harris about his work on Tauthon, a fork of Python 2 that backports features from Python 3
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing what Tauthon is and your motivations for creating it?
    • What’s the story behind the name?
  • What types of applications and environments are you using Tauthon in?
  • How much adoption of Tauthon have you seen?
    • What are some of the different ways that your users are employing it?
  • Is this the missing "2.8" release? In other words, is this intended to be a bridge for simplifying the migration of existing Python 2 code to Python 3, or as an extended support window for Python 2?
  • What features have you backported from Python 3?
    • What is your process for identifying and prioritizing features to bring into Tauthon?
  • What is your workflow for implementing the backported functionality in Tauthon?
  • What are some of the cases where you have had to compromise on the functionality or syntax of a feature that you have backported in order to fit into Python 2?
    • What is your governing philosophy for how to manage syntax or behavior differences between Python 2 and 3?
    • What have been the most challenging features to backport and maintain?
    • What are some of the ways that Tauthon might break existing Python 2 code?
  • What is the story for compatibility with libraries that are Python 3 only?
  • What have you seen in terms of adoption of Tauthon?
    • Do you have any sense of the commonalities among those users?
  • What are some of the ecosystem challenges that face users of Tauthon? (e.g. Pip support, package compatibility, etc.)
  • What are some of the most interesting, unexpected, or challenging lessons that you have learned in the process of creating and maintaining Tauthon?
  • What are your long-term plans for Tauthon, and how have they changed since you first started working on it?
Keep In Touch
Picks
Links

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA

Dependency Management Improvements In Pip's Resolver | 25 May 2020 | 01:16:31
Summary

Dependency management in Python has taken a long and winding path, which has led to the current dominance of Pip. One of the remaining shortcomings is the lack of a robust mechanism for resolving the package and version constraints that are necessary to produce a working system. Thankfully, the Python Software Foundation has funded an effort to upgrade the dependency resolution algorithm and user experience of Pip. In this episode the engineers working on these improvements, Pradyun Gedam, Tzu-ping Chung, and Paul Moore, discuss the history of Pip, the challenges of dependency management in Python, and the benefits that surrounding projects will gain from a more robust resolution algorithm. This is an exciting development for the Python ecosystem, so listen now and then provide feedback on how the new resolver is working for you.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, fast object storage, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
  • You listen to this show because you love Python and want to keep your skills up to date, and machine learning is finding its way into every aspect of software engineering. Springboard has partnered with us to help you take the next step in your career by offering a scholarship to their Machine Learning Engineering career track program. In this online, project-based course every student is paired with a Machine Learning expert who provides unlimited 1:1 mentorship support throughout the program via video conferences. You’ll build up your portfolio of machine learning projects and gain hands-on experience in writing machine learning algorithms, deploying models into production, and managing the lifecycle of a deep learning prototype. Springboard offers a job guarantee, meaning that you don’t have to pay for the program until you get a job in the space. Podcast.__init__ is exclusively offering listeners 20 scholarships of $500 to eligible applicants. It only takes 10 minutes and there’s no obligation. Go to pythonpodcast.com/springboard and apply today! Make sure to use the code AISPRINGBOARD when you enroll.
  • Your host as usual is Tobias Macey and today I’m interviewing Tzu-ping Chung, Pradyun Gedam, and Paul Moore about their work to improve the dependency resolution capabilities of Pip and its user experience
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing the focus of the work that you are doing?
    • What is the scope of the work, and what are the established criteria for when it is considered complete?
  • What is your history with working on the Pip source code and what interests you most about this project?
  • What are the main sources or manifestations of technical debt that exist in Pip as of today?
    • How does it currently handle dependency resolution?
  • What are some of the workarounds that developers have had to resort to in the absence of a robust dependency resolver in Pip?
  • How is the new dependency resolver implemented?
    • How has your initial design evolved or shifted as you have gotten further along in its implementation?
  • What are the pieces of information that the resolver will rely on for determining which packages and versions to install? (e.g. will it install setuptools > 45.x in a Python 2 virtualenv?)
  • What are the new capabilities in Pip that will be enabled by this upgrade to the dependency resolver?
  • What projects or features in the encompassing ecosystem will be unblocked with the introduction of this upgrade?
  • What are some of the changes that users will need to make to adopt the updated Pip?
  • How do you anticipate the changes in Pip impacting the viability or adoption of Python and its ecosystem within different communities or industries?
  • What are some of the additional changes or improvements that you would like to see in Pip or other core elements of the Python landscape?
  • What are some of the most interesting, unexpected, or challenging lessons that you have learned while working on these updates to Pip?
Keep In Touch
Picks
Closing Announcements
  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
Links

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA

Easy Data Validation For Your Python Projects With Pydantic | 18 May 2020 | 00:47:15
Summary

One of the most common causes of bugs is incorrect data being passed throughout your program. Pydantic is a library that provides runtime checking and validation of the information that you rely on in your code. In this episode Samuel Colvin explains why he created it, the interesting and useful ways that it can be used, and how to integrate it into your own projects. If you are tired of unhelpful errors due to bad data then listen now and try it out today.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, fast object storage, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
  • You listen to this show because you love Python and want to keep your skills up to date. Machine learning is finding its way into every aspect of software engineering. Springboard has partnered with us to help you take the next step in your career by offering a scholarship to their Machine Learning Engineering career track program. In this online, project-based course every student is paired with a Machine Learning expert who provides unlimited 1:1 mentorship support throughout the program via video conferences. You’ll build up your portfolio of machine learning projects and gain hands-on experience in writing machine learning algorithms, deploying models into production, and managing the lifecycle of a deep learning prototype. Springboard offers a job guarantee, meaning that you don’t have to pay for the program until you get a job in the space. Podcast.__init__ is exclusively offering listeners 20 scholarships of $500 to eligible applicants. It only takes 10 minutes and there’s no obligation. Go to pythonpodcast.com/springboard and apply today! Make sure to use the code AISPRINGBOARD when you enroll.
  • Your host as usual is Tobias Macey and today I’m interviewing Samuel Colvin about Pydantic, a library for enforcing type hints at runtime
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing what Pydantic is and what motivated you to create it?
  • What are the main use cases that benefit from Pydantic?
  • There are a number of libraries in the Python ecosystem to handle various conventions or "best practices" for settings management. How does pydantic fit in that category and why might someone choose to use it over the other options?
  • There are also a number of libraries for defining data schemas or validation such as Marshmallow and Cerberus. How does Pydantic compare to the available options for those cases?
    • What are some of the challenges, whether technical or conceptual, that you face in building a library to address both of these areas?
  • The 3.7 release of Python added built in support for dataclasses as a means of building containers for data with type validation. What are the tradeoffs of pydantic vs the built in dataclass functionality?
  • How much overhead does pydantic add for doing runtime validation of the modelled data?
  • In the documentation there is a nuanced point that you make about parsing vs validation and your choices as to what to support in pydantic. Why is that a necessary distinction to make?
    • What are the limitations in terms of usage that you are accepting by choosing to allow for implicit conversion or potentially silent loss of precision in the parsed data?
    • What are the benefits of punting on the strict validation of data out of the box?
  • What has been your design philosophy for constructing the user facing API?
  • How is Pydantic implemented and how has the overall architecture evolved since you first began working on it?
    • What have you found to be the most challenging aspects of building a library for managing the consistency of data structures in a dynamic language?
      • What are some of the strengths and weaknesses of Python’s type system?
  • What is the workflow for a developer who is using Pydantic in their code?
    • What are some of the pitfalls or edge cases that they might run into?
  • What is involved in integrating with other libraries/frameworks such as Django for web development or Dagster for building data pipelines?
  • What are some of the more advanced capabilities or use cases of Pydantic that are less obvious?
  • What are some of the features or capabilities of Pydantic that are often overlooked which you think should be used more frequently?
  • What are some of the most interesting, innovative, or unexpected ways that you have seen Pydantic used?
  • What are some of the most interesting, challenging, or unexpected lessons that you have learned through your work on or with Pydantic?
  • When is Pydantic the wrong choice?
  • What do you have planned for the future of the project?
Keep In Touch
Picks
Closing Announcements
  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
Links

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA

Managing Distributed Teams In The Age Of Remote Work | 11 May 2020 | 00:48:45
Summary

More of us are working remotely than ever before, many with no prior experience with a remote work environment. In this episode Quinn Slack discusses his thoughts and experience of running Sourcegraph as a fully distributed company. He covers the lessons that he has learned in moving from partially to fully remote, the practices that have worked well in managing a distributed workforce, and the challenges that he has faced in the process. If you are struggling with your remote work situation then this conversation has some useful tips and references for further reading to help you be successful in the current environment.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, fast object storage, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
  • You monitor your website to make sure that you’re the first to know when something goes wrong, but what about your data? Tidy Data is the DataOps monitoring platform that you’ve been missing. With real time alerts for problems in your databases, ETL pipelines, or data warehouse, and integrations with Slack, Pagerduty, and custom webhooks you can fix the errors before they become a problem. Go to pythonpodcast.com/tidydata today and get started for free with no credit card required.
  • Your host as usual is Tobias Macey and today I’m interviewing Quinn Slack about his experience managing a fully remote company and useful tips for remote work
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you start by giving an overview of the team structure at Sourcegraph?
  • You recently moved to being fully remote. What was the motivating factor and how has it changed your personal workflow?
    • What is your prior history with working remotely?
  • team practices for visibility of progress
  • impact of remote teams on how code is written and organized
    • reducing review burden by writing clearer code
  • structuring meetings when remote
  • points of friction for remote developer teams
  • benefits of being fully remote
  • incentivizing documentation
  • compensation structure
Keep In Touch
Picks
Closing Announcements
  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
Links

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA

Maintainable Infrastructure As Code In Pure Python With Pulumi | 04 May 2020 | 01:00:55
Summary

After you write your application, you need a way to make it available to your users. These days, that usually means deploying it to a cloud provider, whether that’s a virtual server, a serverless platform, or a Kubernetes cluster. To manage the increasingly dynamic and flexible options for running software in production, we have turned to building infrastructure as code. Pulumi is an open source framework that lets you use your favorite language to build scalable and maintainable systems out of cloud infrastructure. In this episode Luke Hoban, CTO of Pulumi, explains how it differs from other frameworks for interacting with infrastructure platforms, the benefits of using a full programming language for treating infrastructure as code, and how you can get started with it today. If you are getting frustrated with switching contexts when working between the application you are building and the systems that it runs on, then listen now and then give Pulumi a try.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, fast object storage, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
  • You monitor your website to make sure that you’re the first to know when something goes wrong, but what about your data? Tidy Data is the DataOps monitoring platform that you’ve been missing. With real time alerts for problems in your databases, ETL pipelines, or data warehouse, and integrations with Slack, Pagerduty, and custom webhooks you can fix the errors before they become a problem. Go to pythonpodcast.com/tidydata today and get started for free with no credit card required.
  • Your host as usual is Tobias Macey and today I’m interviewing Luke Hoban about building and maintaining infrastructure as code with Pulumi
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing the concept of "infrastructure as code"?
  • What is Pulumi and what is the story behind it?
    • Where does the name come from?
    • How does Pulumi compare to other infrastructure as code frameworks, such as Terraform?
  • What are some of the common challenges in managing infrastructure as code?
    • How does use of a full programming language help in addressing those challenges?
    • What are some of the dangers of using a full language to manage infrastructure?
      • How does Pulumi work to avoid those dangers?
  • Why is maintaining a record of the provisioned state of your infrastructure necessary, as opposed to relying on the state contained by the infrastructure provider?
    • What are some of the design principles and constraints that developers should be considering as they architect their infrastructure with Pulumi?
  • Can you describe how Pulumi is implemented?
    • How does Pulumi manage support for multiple languages while maintaining feature parity across them?
    • How do you manage testing and validation of the different providers?
  • The strength of any tool is largely measured in the ecosystem that exists around it, which is one of the reasons that Terraform has been so successful. How are you approaching the problem of bootstrapping the community and prioritizing platform support?
  • Can you talk through the workflow of working with Pulumi to build and maintain a proper infrastructure?
  • What are some of the ways to approach testing of infrastructure code?
    • What does the CI/CD lifecycle for infrastructure look like?
  • What are the limitations of infrastructure as code?
    • How do configuration management tools fit with frameworks such as Pulumi?
  • The core framework of Pulumi is open source, and your business model is focused around a managed platform for tracking state. How are you approaching governance of the project to ensure its continued viability and growth?
  • What are some of the most interesting, innovative, or unexpected design patterns that you have seen your users include in their infrastructure projects?
  • When is Pulumi the wrong choice?
  • What do you have planned for the future of Pulumi?
Keep In Touch
Picks
Closing Announcements
  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
Links

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA

Teaching Python Machine Learning | 28 Apr 2020 | 00:49:25
Summary

Python has become a major player in the machine learning industry, with a variety of widely used frameworks. In addition to the technical resources that make it easy to build powerful models, there is also a sizable library of educational resources to help you get up to speed. Sebastian Raschka’s contribution of the Python Machine Learning book has come to be widely regarded as one of the best references for newcomers to the field. In this episode he shares his experiences as an author, his views on why Python is the right language for building machine learning applications, and the insights that he has gained from teaching and contributing to the field.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, fast object storage, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
  • Your host as usual is Tobias Macey and today I’m interviewing Sebastian Raschka about his experiences writing the popular Python Machine Learning book
Interview
  • Introductions
  • How did you get introduced to Python?
  • How did you get started in machine learning?
    • What were the concepts that you found most difficult in your career with statistics and machine learning?
  • One of your notable contributions to the field is your book "Python Machine Learning". What inspired you to write the initial version?
    • How did you approach the challenge of striking the right balance of depth, breadth, and accessibility for the content?
    • What was your process for determining which aspects of machine learning to include?
  • You have made 3 editions of the book from 2015 through December of 2019. In what ways has the book changed?
    • What are the biggest changes to the ecosystem and approaches to ML in that timeframe?
  • What are the fundamental challenges of developing machine learning projects that continue to present themselves?
    • What new difficulties have arisen with the introduction of new technologies and the rise of deep learning?
  • What are some of the ways that the Python language lends itself to analytical work?
    • What are its shortcomings and how has the community worked around them?
    • What do you see as the biggest risks to the popularity of Python in the data and analytics space?
  • What are some of the common pitfalls that your readers and students face while learning about different aspects of machine learning?
  • What are some of the industries that can benefit most from applications of machine learning?
  • What are you most excited about in the applications or capabilities of machine learning?
    • What are you most worried about?
Keep In Touch
Picks
Closing Announcements
  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
Links

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA

Build The Next Generation Of Python Web Applications With FastAPI | 20 Apr 2020 | 00:58:34
Summary

Python has an embarrassment of riches when it comes to web frameworks, each with its own particular strengths. FastAPI is a new entrant that has been quickly gaining popularity as a performant and easy-to-use toolchain for building RESTful web services. In this episode Sebastián Ramirez shares the story of the frustrations that led him to create a new framework, how he put in the extra effort to make the developer experience as smooth and painless as possible, and how he embraces extensibility with lightweight dependency injection and a straightforward plugin interface. If you are starting a new web application today then FastAPI should be at the top of your list.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, fast object storage, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
  • Your host as usual is Tobias Macey and today I’m interviewing Sebastián Ramirez about FastAPI, a framework for building production ready APIs in Python 3
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing what FastAPI is?
    • What are the main frustrations that you ran into with other frameworks that motivated you to create an entirely new one?
  • What are some of the main use cases that FastAPI is designed for?
  • Many web frameworks focus on managing the end-to-end functionality of a website, including the UI. Why did you focus on just API capabilities?
    • What are the benefits of building an API only framework?
    • If you wanted to integrate a presentation layer, what would be involved in that effort?
  • What API formats does FastAPI support?
    • What would be involved in adding support for additional specifications such as GraphQL or JSON-LD?
  • There are a huge number of web frameworks available just in the Python ecosystem. How does FastAPI fit into that landscape and why might someone choose it over the other options?
  • Can you share your design philosophy for the project?
    • What are your main sources of inspiration for the framework?
    • You have also built the Typer CLI library which you refer to as the little sibling of FastAPI. How have your experiences building these two projects influenced their counterpart’s evolution?
  • What are the benefits of incorporating type annotations into a web framework and in what ways do they manifest in its functionality?
  • What is the workflow for a developer building a complex application in FastAPI?
  • Can you describe how FastAPI itself is architected and how its design has evolved since you first began working on it?
    • What are the extension points that are available for someone to build plugins for FastAPI?
  • What are some of the challenges that you have faced in building an async framework that is leveraging the new ASGI specification?
  • What are some sharp edges that users should keep an eye out for?
  • What are some unique or underutilized features of FastAPI that users might not be aware of?
  • What are some of the most interesting, unexpected, or innovative ways that you have seen FastAPI used?
  • When is FastAPI the wrong choice?
  • What are some of the most interesting, unexpected, or challenging lessons that you have learned in the process of building and maintaining FastAPI?
  • What do you have planned for the future of the project?
Keep In Touch

@tiangolo on Twitter. @tiangolo on GitHub.

Picks
Closing Announcements
  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
Links

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA

Ship With Confidence By Automating Quality Assurance | 28 Aug 2022 | 01:09:05
Summary

Quality assurance in the software industry has become a shared responsibility in most organizations. Given the rapid pace of development and delivery it can be challenging to ensure that your application is still working the way it’s supposed to with each release. In this episode Jonathon Wright discusses the role of quality assurance in modern software teams and how automation can help.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. And now you can launch a managed MySQL, Postgres, or Mongo database cluster in minutes to keep your critical data safe with automated backups and failover. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • Your host as usual is Tobias Macey and today I’m interviewing Jonathon Wright about the role of automation in your testing and QA strategies
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you share your relationship with software testing/QA and automation?
  • What are the main categories of how companies and software teams address testing and validation of their applications?
    • What are some of the notable tradeoffs/challenges among those approaches?
  • With the increased adoption of agile practices and the "shift left" mentality of DevOps, who is responsible for software quality?
    • What are some of the cases where a discrete QA role or team becomes necessary? (or is it always necessary?)
  • With testing and validation being a shared responsibility, competing with other priorities, what role does automation play?
    • What are some of the ways that automation manifests in software quality and testing?
    • How is automation distinct from software tests and CI/CD?
  • For teams who are investing in automation for their applications, what are the questions they should be asking to identify what solutions to adopt? (what are the decision points in the build vs. buy equation?)
  • At what stage(s) of the software lifecycle does automation live?
  • What is the process for identifying which capabilities and interactions to target during the initial application of automation for QA and validation?
  • One of the perennial challenges with any software testing, particularly for anything in the UI, is that it is a constantly moving target. What are some of the patterns and techniques, both from a developer and tooling perspective, that increase the robustness of automated validation?
  • What are the most interesting, innovative, or unexpected ways that you have seen automation used for QA?
  • What are the most interesting, unexpected, or challenging lessons that you have learned while working on QA and automation?
  • When is automation the wrong choice?
  • What are some of the resources that you recommend for anyone who wants to learn more about this topic?
Keep In Touch
Picks
Links

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA

Distributed Computing In Python Made Easy With Ray | 14 Apr 2020 | 00:41:00
Summary

Distributed computing is a powerful tool for increasing the speed and performance of your applications, but it is also a complex and difficult undertaking. While performing research for his PhD, Robert Nishihara ran up against this reality. Rather than cobbling together another single purpose system, he built what ultimately became Ray to make scaling Python projects to multiple cores and across machines easy. In this episode he explains how Ray allows you to scale your code easily, how to use it in your own projects, and his ambitions to power the next wave of distributed systems at Anyscale. If you are running into scaling limitations in your Python projects for machine learning, scientific computing, or anything else, then give this a listen and then try it out!

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, fast object storage, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
  • Your host as usual is Tobias Macey and today I’m interviewing Robert Nishihara about Ray, a framework for building and running distributed applications and machine learning
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing what Ray is and how the project got started?
    • How did the environment of the RISE lab factor into the early design and development of Ray?
  • What are some of the main use cases that you were initially targeting with Ray?
    • Now that it has been publicly available for some time, what are some of the ways that it is being used which you didn’t originally anticipate?
  • What are the limitations for the types of workloads that can be run with Ray, or any edge cases that developers should be aware of?
  • For someone who is building on top of Ray, what is involved in either converting an existing application to take advantage of Ray’s parallelism, or creating a greenfield project with it?
  • Can you describe how Ray itself is implemented and how it has evolved since you first began working on it?
  • How does the clustering and task distribution mechanism in Ray work?
  • How does the increased parallelism that Ray offers help with machine learning workloads?
    • Are there any types of ML/AI that are easier to do in this context?
  • What are some of the additional layers or libraries that have been built on top of the functionality of Ray?
  • What are some of the most interesting, challenging, or complex aspects of building and maintaining Ray?
  • You and your co-founders recently announced the formation of Anyscale to support the future development of Ray. What is your business model and how are you approaching the governance of Ray and its ecosystem?
  • What are some of the most interesting or unexpected projects that you have seen built with Ray?
  • What are some cases where Ray is the wrong choice?
  • What do you have planned for the future of Ray and Anyscale?
Keep In Touch
Picks
Closing Announcements
  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
Links

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA

Building The Seq Language For Bioinformatics | 07 Apr 2020 | 00:36:25
Summary

Bioinformatics is a complex and computationally demanding domain. The intuitive syntax of Python and extensive set of libraries make it a great language for bioinformatics projects, but it is hampered by the need for computational efficiency. Ariya Shajii created the Seq language to bridge the divide between the performance of languages like C and C++ and the ecosystem of Python with built-in support for commonly used genomics algorithms. In this episode he describes his motivation for creating a new language, how it is implemented, and how it is being used in the life sciences. If you are interested in experimenting with sequencing data then give this a listen and then give Seq a try!

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, fast object storage, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on great conferences. And now, the events are coming to you, with no travel necessary! We have partnered with organizations such as ODSC, and Data Council. Upcoming events include the Observe 20/20 virtual conference on April 6th and ODSC East which has also gone virtual starting April 16th. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today.
  • Your host as usual is Tobias Macey and today I’m interviewing Ariya Shajii about Seq, a programming language built for bioinformatics and inspired by Python
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing what Seq is and your motivation for creating it?
    • What was lacking in other languages or libraries for your use case that is made easier by creating a custom language?
    • If someone is already working in Python, possibly using BioPython, what might motivate them to consider migrating their work to Seq?
  • Can you give an impression of the scope and nature of the tasks or projects that a biologist or geneticist might build with Seq?
  • What was your process for identifying and prioritizing features and algorithms that would be beneficial to the target audience?
  • For someone using Seq can you describe their workflow and how it might differ from performing the same task in Python?
  • How is Seq implemented?
    • What are some of the features that are included to simplify the work of bioinformatics?
    • What was your process of designing the language and runtime?
    • How has the scope or direction of the project evolved since it was first conceived?
  • What impact do you anticipate Seq having on the domain of bioinformatics and genomics?
  • What have you found to be the most interesting, unexpected, and/or challenging aspects of building a language for this problem domain?
  • What is in store for the future of Seq?
Keep In Touch
Picks
Closing Announcements
  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
Links

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA

An Open Source Toolchain For Natural Language Processing From Explosion AI | 30 Mar 2020 | 00:51:20
Summary

The state of the art in natural language processing is a constantly moving target. With the rise of deep learning, previously cutting edge techniques have given way to robust language models. Through it all the team at Explosion AI have built a strong presence with the trifecta of SpaCy, Thinc, and Prodigy, which support fast and flexible data labeling to feed deep learning models as well as performant and scalable text processing. In this episode founder and open source author Matthew Honnibal shares his experience growing a business around cutting edge open source libraries for the machine learning development process.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, fast object storage, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on great conferences. And now, the events are coming to you, with no travel necessary! We have partnered with organizations such as ODSC, and Data Council. Upcoming events include the Observe 20/20 virtual conference on April 6th and ODSC East which has also gone virtual starting April 16th. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today.
  • Your host as usual is Tobias Macey and today I’m interviewing Matthew Honnibal about the Thinc and Prodigy tools and an update on SpaCy
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you start by giving an overview of your mission at Explosion?
  • We spoke previously about your work on SpaCy. What has changed in the past 3 1/2 years?
    • How have recent innovations in language models such as BERT and GPT-2 influenced the direction or implementation of the project?
  • When I last looked SpaCy only supported English and German, but you have added several new languages. What are the most challenging aspects of building the additional models?
    • What would be required for supporting symbolic or right-to-left languages?
  • How has the ecosystem for language processing in Python shifted or evolved since you first introduced SpaCy?
  • Another project that you have released is Prodigy to support labelling of datasets. Can you talk through the motivation for creating it and describe the workflow for someone using it?
    • What was lacking in the other annotation tools that you have worked with that you are trying to solve for in Prodigy?
  • What are some of the most challenging or problematic aspects of labelling data sets for use in machine learning projects?
    • What is a typical scale of data that can be reasonably handled by an individual or small team working with Prodigy?
      • At what point do you find that it makes sense to use a labeling service rather than generating the labels yourself?
  • Your most recent project is Thinc for building and using deep learning models. What was the motivation for creating it and what problem does it solve in the ecosystem?
    • How does its design and usage compare to other deep learning frameworks such as PyTorch and TensorFlow?
    • How does it compare to projects such as Keras that abstract across those frameworks?
  • How do the SpaCy, Prodigy, and Thinc libraries work together?
  • What are some of the biggest challenges that you are facing in building open source tools to meet the needs of data scientists and machine learning engineers?
  • What are some of the most interesting or impressive projects that you have seen built with the tools your team is creating?
  • What do you have planned for the future of Explosion, SpaCy, Prodigy, and Thinc?
Keep In Touch
Picks
Closing Announcements
  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
Links

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA

A Flexible Open Source ERP Framework To Run Your Business | 23 Mar 2020 | 01:07:33
Summary

Running a successful business requires some method of organizing the information about all of the processes and activity that take place. Tryton is an open source, modular ERP framework that is built for the flexibility needed to fit your organization, rather than requiring you to model your workflows to match the software. In this episode core developers Nicolas Évrard and Cédric Krier are joined by avid user Jonathan Levy to discuss the history of the project, how it is being used, and the myriad ways that you can adapt it to suit your needs. If you are struggling to keep a consistent view of your business and ensure that all of the necessary workflows are being observed then listen now and give Tryton a try.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, fast object storage, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Corinium Global Intelligence, ODSC, and Data Council. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today.
  • Your host as usual is Tobias Macey and today I’m interviewing Nicolas Évrard, Cédric Krier, and Jonathan Levy about Tryton
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing what Tryton is and how it got started?
  • What kinds of businesses is Tryton most suited to?
    • What kinds of businesses is Tryton not a good fit for?
  • Within a business, who are the primary users of Tryton?
  • Can you talk through a typical workflow for interacting with Tryton?
  • What are some of the most complex or challenging aspects of modeling a business while maintaining a high degree of customizability?
  • Can you describe how Tryton is architected and how its design has evolved since it was first started?
    • If you were to start over today, what would you do differently?
  • There are a number of plugins for Tryton. What kinds of functionality can be customized using the available interfaces?
    • What is the process for building a custom module for Tryton?
  • How do you manage sustainability of the Tryton project?
  • Given the criticality of the Tryton platform, how do you approach ongoing stability and security of the project?
  • What is involved in deploying and maintaining an installation of Tryton?
  • What are some of the most interesting, innovative, or unexpected ways that you have seen Tryton used?
  • What is in store for the future of Tryton?
Keep In Touch
Picks
Links

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA

Getting A Handle On Portable C Extensions With hpy | 16 Mar 2020 | 00:35:14
Summary

One of the driving factors of Python’s success is the ability for developers to integrate with performant languages such as C and C++. The challenge is that the interface for those extensions is specific to the main implementation of the language. This contributes to difficulties in building alternative runtimes that can support important packages such as NumPy. To address this situation a team of developers are working to create the hpy project, a new interface for extension developers that is standardized and provides a uniform target for multiple runtimes. In this episode Antonio Cuni discusses the motivations for creating hpy, how it benefits the whole ecosystem, and ways to contribute to the effort. This is an exciting development that has the potential to unlock a new wave of innovation in the ways that you can run your Python code.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, fast object storage, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
  • As a developer, maintaining a state of flow is key to your productivity. Don’t let something as simple as the wrong function ruin your day. Kite is the smartest completions engine available for Python, featuring a machine learning model trained by the brightest stars of GitHub. Featuring ranked suggestions sorted by relevance, offering up to full lines of code, and a programming copilot that offers up the documentation you need right when you need it. Get Kite for free today at getkite.com with integrations for top editors, including Atom, VS Code, PyCharm, Spyder, Vim, and Sublime.
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Corinium Global Intelligence, ODSC, and Data Council. Upcoming events include the Software Architecture Conference in NYC, Strata Data in San Jose, and PyCon US in Pittsburgh. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today.
  • Your host as usual is Tobias Macey and today I’m interviewing Antonio Cuni about hpy, a project aiming to reimagine the C API for Python
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing what the hpy project is and how it got started?
    • What are the goals for the project?
    • Who else is involved?
  • How much engagement have you had with CPython core contributors or the steering council?
  • Who are the consumers of the current C API for the CPython implementation?
    • What are some of the pain points or shortcomings for those consumers?
    • What impact does that have for users of a given library that leverages C extensions?
  • Can you talk through the structure of the hpy project?
    • What are some of the design challenges that you are facing for determining the external API?
    • What is involved in integrating the hpy interface into alternate runtimes such as PyPy or RustPython?
  • What is the potential or observed performance impact for libraries that currently rely on the existing C API?
  • How has the vision and scope of this project been updated as you have gotten further along in the implementation?
  • What are the downstream impacts that you anticipate in projects such as PyPy and Cython?
  • What have you found to be the most challenging or contentious aspects of implementing hpy so far?
  • What are some of the most interesting/unexpected/useful lessons that you have learned while working on hpy?
  • What do you have planned for the near to medium term for hpy?
Keep In Touch
Picks
Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Open Source Machine Learning On Quantum Computers With Xanadu AI | 10 Mar 2020 | 00:57:22
Summary

Quantum computers promise the ability to execute calculations at speeds several orders of magnitude faster than what we are used to. Machine learning and artificial intelligence algorithms require fast computation to churn through complex data sets. At Xanadu AI they are building libraries to bring these two worlds together. In this episode Josh Izaac shares his work on the Strawberry Fields and Penny Lane projects that provide both high and low level interfaces to quantum hardware for machine learning and deep neural networks. If you are itching to get your hands on the coolest combination of technologies, then listen now and then try it out for yourself.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, fast object storage, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
  • As a developer, maintaining a state of flow is key to your productivity. Don’t let something as simple as the wrong function ruin your day. Kite is the smartest completions engine available for Python, featuring a machine learning model trained by the brightest stars of GitHub. Featuring ranked suggestions sorted by relevance, offering up to full lines of code, and a programming copilot that offers up the documentation you need right when you need it. Get Kite for free today at getkite.com with integrations for top editors, including Atom, VS Code, PyCharm, Spyder, Vim, and Sublime.
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Corinium Global Intelligence, ODSC, and Data Council. Upcoming events include the Software Architecture Conference in NYC, Strata Data in San Jose, and PyCon US in Pittsburgh. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today.
  • Your host as usual is Tobias Macey and today I’m interviewing Josh Izaac about the work that he is doing at Xanadu AI to make it easier to build applications for quantum processors
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing what you are working on at Xanadu AI?
    • How do the specifics of your quantum hardware influence the way in which developers need to build their algorithms? (e.g. as compared to DWave)
  • What are some of the underlying principles that developers need to understand in order to take full advantage of the capabilities provided by quantum processors?
  • Can you outline the different components and libraries that you are building to simplify the work of building machine learning/AI projects for quantum processors?
    • What’s the story behind all of the Beatles references?
    • How do the different libraries fit together?
  • What are some of the workloads and use cases that you and your customers are focused on?
  • What are some of the most challenging aspects of designing a library that is accessible to developers while being able to take advantage of the underlying hardware?
  • How does the workflow for machine learning on quantum computers differ from what is being done in classical environments?
    • Given the magnitude of computational power and data processing that can be achieved in a quantum processor it seems that there is a potential for small bugs to have disproportionately large impacts. How can developers identify and mitigate potential sources of error in their algorithms?
  • For someone who is building an application or algorithm to be executed on a Xanadu processor, what does their workflow look like?
    • What are some of the common errors or misconceptions that you have seen in customer code?
  • Can you describe the design and implementation of the Penny Lane and Strawberry Fields libraries and how they have evolved since you first began working on them?
  • What are some of the most ambitious or exciting use cases for quantum systems that you have seen?
  • How are you using the computational capabilities of your platform to feed back into the research and design of successive generations of hardware?
  • What are some useful heuristics for determining whether it is worthwhile to build for a quantum processor rather than leveraging classical hardware?
  • What are some of the most interesting/unexpected/useful lessons that you have learned while working on quantum algorithms and the libraries to support them?
  • What is in store for the future of the Xanadu software ecosystem?
  • What are your predictions for the near to medium term of quantum computing?
Keep In Touch
Picks
Closing Announcements
  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

The Advanced Python Task Scheduler | 02 Mar 2020 | 00:33:16
Summary

Most long-running programs have a need for executing periodic tasks. APScheduler is a mature and open source library that provides all of the features that you need in a task scheduler. In this episode the author, Alex Grönholm, explains how it works, why he created it, and how you can use it in your own applications. He also digs into his plans for the next major release and the forces that are shaping the improved feature set. Spare yourself the pain of triggering events at just the right time and let APScheduler do it for you.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Corinium Global Intelligence, ODSC, and Data Council. Upcoming events include the Software Architecture Conference in NYC, Strata Data in San Jose, and PyCon US in Pittsburgh. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today.
  • Your host as usual is Tobias Macey and today I’m interviewing Alex Grönholm about APScheduler, a library for scheduling tasks in your Python projects
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing what APScheduler is and the main use cases that APScheduler is designed for?
    • What was your motivation for creating it?
  • What is the workflow for integrating APScheduler into an application?
    • In the documentation it says not to run more than one instance of the scheduler, what are some strategies for scaling schedulers?
  • What are some common architectures for applications that take advantage of APScheduler?
    • What are some potential pitfalls that developers should be aware of?
  • Can you describe how APScheduler is implemented and how its design has evolved since you first began working on it?
    • What have you found to be the most complex or challenging aspects of building or using a scheduling framework?
  • What are some of the most interesting/innovative/unexpected ways that you have seen APScheduler used?
  • What are some of the features or capabilities that you have consciously left out?
    • What design strategies or features of APScheduler are often overlooked or underappreciated?
  • What are some of the most useful or interesting lessons that you have learned while building and maintaining APScheduler?
  • When is APScheduler the wrong choice for managing task execution?
  • What do you have planned for the future of the project?
Keep In Touch
Picks
Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Reducing The Friction Of Embedded Software Development With PlatformIO | 25 Feb 2020 | 00:46:49
Summary

Embedded software development is a challenging endeavor due to a fragmented ecosystem of tools. Ivan Kravets experienced the pain of programming for different hardware platforms when embroiled in a home automation project. As a result he built the PlatformIO ecosystem to reduce the friction encountered by engineers working with multiple microcontroller architectures. In this episode he describes the complexities associated with targeting multiple platforms, the tools that PlatformIO offers to simplify the workflow, and how it fits into the development process. If you are feeling the pain of working with different editing environments and build toolchains for various microcontroller vendors then give this interview a listen and then try it out for yourself.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Corinium Global Intelligence, ODSC, and Data Council. Upcoming events include Strata Data in San Jose and PyCon US in Pittsburgh. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today.
  • Your host as usual is Tobias Macey and today I’m interviewing Ivan Kravets about PlatformIO, an open source ecosystem for IoT development including a cross-platform IDE, unified debugger, remote unit testing, and firmware updates.
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing what PlatformIO is?
    • What was your motivation for creating it?
    • What are the aspects of embedded development that keep you interested and engaged in this space?
  • What are some of the types of projects that someone might use PlatformIO to build?
  • What are some of the common challenges that a developer might encounter when working on embedded systems?
    • What are the additional complexities that get introduced as more hardware targets get added to a project?
  • What is the workflow for someone using PlatformIO for embedded systems development?
  • What are the different elements of PlatformIO and how do they simplify the work of building embedded systems projects?
  • How is PlatformIO implemented and how has the system design evolved since you first began working on it?
    • What was your reason for selecting Python as the implementation language?
    • If you were to start over today what would you do differently?
  • How has the embedded hardware and software landscape changed since you first started work on PlatformIO?
    • How has that impacted your product direction?
  • How do developers handle testing and validation of their applications?
  • How does PlatformIO help with updating deployed devices with new firmware?
  • What have been some of the most interesting/unexpected/innovative projects that you have seen built with PlatformIO?
  • What have been some of the most interesting/unexpected/challenging aspects of building and maintaining PlatformIO?
  • How are you approaching sustainability of the project and business?
  • What do you have planned for the future of PlatformIO?
Keep In Touch
Picks
Closing Announcements
  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

APIs, Sustainable Open Source and The Async Web With Tom Christie | 18 Feb 2020 | 00:43:45
Summary

Tom Christie is probably best known as the creator of Django REST Framework, but his contributions to the state of the web in Python extend well beyond that. In this episode he shares his story of getting involved in web development, his work on various projects to power the asynchronous web in Python, and his efforts to make his open source contributions sustainable. This was an excellent conversation about the state of asynchronous frameworks for Python and the challenges of making a career out of open source.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Corinium Global Intelligence, ODSC, and Data Council. Upcoming events include the Software Architecture Conference in NYC, Strata Data in San Jose, and PyCon US in Pittsburgh. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today.
  • Your host as usual is Tobias Macey and today I’m interviewing Tom Christie about the Encode organization and the work he is doing to drive the state of the art in async for Python
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing what the Encode organization is and how it came to be?
    • What are some of the other approaches to funding and sustainability that you have tried in the past?
    • What are the benefits to the developers provided by an organization which you were unable to achieve through those other means?
    • What benefits are realized by your sponsors as compared to other funding arrangements?
  • What projects are part of the Encode organization?
  • How do you determine fund allocation for projects and participants in the organization?
  • What is the process for becoming a member of the Encode organization and what benefits and responsibilities does that entail?
  • A large number of the projects that are part of the organization are focused on various aspects of asynchronous programming in Python. Is that intentional, or just an accident of your own focus and network?
  • For those who are familiar with Python web programming in the context of WSGI, what are some of the practices that they need to unlearn in an async world, and what are some new capabilities that they should be aware of?
  • Beyond Encode and your recent work on projects such as Starlette you are also well known as the creator of Django Rest Framework. How has your experience building and growing that project influenced your current focus on a technical, community, and professional level?
  • Now that Python 2 is officially unsupported and asynchronous capabilities are part of the core language, what future directions do you foresee for the community and ecosystem?
    • What are some areas of potential focus that you think are worth more attention and energy?
  • What do you have planned for the future of Encode, your own projects, and your overall engagement with the Python ecosystem?
Keep In Touch
Picks
Closing Announcements
  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Learning To Program Python By Building Video Games With Arcade | 11 Feb 2020 | 00:41:43
Summary

Video games have been a vehicle for learning to program since the early days of computing. Continuing in that tradition, Paul Craven created the Arcade library as a modern alternative to PyGame for use in his classroom. In this episode he explains his motivations for starting a new framework for video game development, his view on the benefits of games in computer education, and how his students and the broader community are using it to build interesting and creative projects. If you are looking for a way to get new programmers engaged, or just want to experiment with building your own games, then this is the conversation for you. Give it a listen and then give Arcade a try for yourself.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Corinium Global Intelligence, ODSC, and Data Council. Upcoming events include the Software Architecture Conference in NYC, Strata Data in San Jose, and PyCon US in Pittsburgh. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today.
  • Your host as usual is Tobias Macey and today I’m interviewing Paul Craven about Arcade, an easy-to-learn Python library for creating 2D video games
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing what Arcade is?
    • What inspired you to begin working on it?
  • Who is your primary audience?
  • As an educator, what have you found to be most effective about using games as a vehicle for teaching programming?
    • What elements of programming or computer science do you have difficulty in addressing within the context of a video game?
    • For someone who wants to move on from working on games to something like web development or data analytics, what elements of software design and structure are easily translated to other domains?
  • Can you describe how Arcade is implemented and how the architecture has evolved since you first began working on it?
    • If you were to start over today, what would you do differently?
  • What have you found to be the most interesting/unexpected/challenging aspects of building and maintaining Arcade?
  • What are some of the most interesting/innovative/unexpected ways that you have seen Arcade used?
  • When is Arcade the wrong platform, or at what point does someone need to move on from Arcade?
  • What do you have planned for the future of Arcade?
Keep In Touch
Picks
  • Tobias
  • Paul
    • Fahrenheit 451 by Ray Bradbury
      • “Mistakes can be profited by. Man, when I was young I shoved my ignorance in people’s faces. They beat me with sticks. By the time I was forty my blunt instrument had been honed to a fine cutting point for me. If you hide your ignorance, no one will hit you and you’ll never learn.”
Closing Announcements
  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Remove Roadblocks And Let Your Developers Ship Faster With Self-Serve Infrastructure | 14 Aug 2022 | 01:01:49
Summary

The goal of every software team is to get their code into production without breaking anything. This requires establishing a repeatable process that doesn’t introduce unnecessary roadblocks and friction. In this episode Ronak Rahman discusses the challenges that development teams encounter when trying to build and maintain velocity in their work, the role that access to infrastructure plays in that process, and how to build automation and guardrails for everyone to take part in the delivery process.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. And now you can launch a managed MySQL, Postgres, or Mongo database cluster in minutes to keep your critical data safe with automated backups and failover. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • Your host as usual is Tobias Macey and today I’m interviewing Ronak Rahman about how automating the path to production helps to build and maintain development velocity
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you describe what Quali is and the story behind it?
  • What are the problems that you are trying to solve for software teams?
    • How does Quali help to address those challenges?
  • What are the bad habits that engineers fall into when they experience friction with getting their code into test and production environments?
    • How do those habits contribute to negative feedback loops?
  • What are signs that developers and managers need to watch for that signal the need for investment in developer experience improvements on the path to production?
  • Can you describe what you have built at Quali and how it is implemented?
    • How have the design and goals shifted/evolved from when you first started working on it?
  • What are the positive and negative impacts that you have seen from the evolving set of options for application deployments? (e.g. K8s, containers, VMs, PaaS, FaaS, etc.)
  • Can you describe how Quali fits into the workflow of software teams?
  • Once a team has established patterns for deploying their software, what are some of the disruptions to their flow that they should guard against?
  • What are the most interesting, innovative, or unexpected ways that you have seen Quali used?
  • What are the most interesting, unexpected, or challenging lessons that you have learned while working on Quali?
  • When is Quali the wrong choice?
  • What do you have planned for the future of Quali?
Keep In Touch
Picks
Closing Announcements
  • Thank you for listening! Don’t forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. The Machine Learning Podcast helps you go from idea to production with machine learning.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers
Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Build Your Own Personal Data Repository With Nostalgia | 04 Feb 2020 | 00:32:58
Summary

The companies that we entrust our personal data to are using that information to gain extensive insights into our lives and habits while not always making those findings accessible to us. Pascal van Kooten decided that he wanted to have the same capabilities to mine his personal data, so he created the Nostalgia project to integrate his various data sources and query across them. In this episode he shares his motivation for creating the project, how he is using it in his day-to-day, and how he is planning to evolve it in the future. If you’re interested in learning more about yourself and your habits using the personal data that you share with the various services you use then listen now to learn more.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Corinium Global Intelligence, ODSC, and Data Council. Upcoming events include the Software Architecture Conference in NYC, Strata Data in San Jose, and PyCon US in Pittsburgh. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today.
  • Your host as usual is Tobias Macey and today I’m interviewing Pascal van Kooten about his Nostalgia project, a nascent framework for taking control of your personal data.
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing your mission with the Nostalgia project?
    • How did the topic of personal data management come to be a focus for you?
  • What other options exist for users to be able to collect and manage their own data?
    • What capabilities were lacking in those options that made you feel the need to build Nostalgia?
  • What is your target audience for this set of projects?
  • How are you using Nostalgia in your own life?
    • What are some of the insights that you have been able to gain as a result of integrating your data with Nostalgia?
  • Can you describe the current architecture of the Nostalgia platform and how it has evolved since you began work on it?
    • What are some of the assumptions that you are using to direct the focus of your development and interaction design?
  • What is the minimum number of data sources needed to make this useful?
  • What are some of the challenges that you are facing in collating and integrating different data sources?
  • What are some of the drawbacks of using something like Nostalgia for managing your personal data?
  • What are some of the most interesting/challenging/unexpected aspects of your work on Nostalgia so far?
  • What do you have planned for the future of the project?
Keep In Touch
Picks
Closing Announcements
  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
Links

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA

Simplifying Social Login For Your Web Applications | 27 Jan 2020 | 00:34:06
Summary

A standard feature in most modern web applications is the ability to log in or register using accounts that you already own on other sites such as Google, Facebook, or Twitter. Building your own integrations for each service can be complex and time-consuming, distracting you from the features that you and your users actually care about. Fortunately, the Python social auth library makes it easy to support third-party authentication with a large and growing number of services with minimal effort. In this episode Matías Aguirre discusses his motivation for creating the library, how he has designed it to allow for flexibility and ease of use, and the benefits of delegating identity and authentication to third parties rather than managing passwords yourself.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Corinium Global Intelligence, ODSC, and Data Council. Upcoming events include the Software Architecture Conference in NYC, Strata Data in San Jose, and PyCon US in Pittsburgh. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today.
  • Your host as usual is Tobias Macey and today I’m interviewing Matías Aguirre about Python social auth and the complexities of third-party authentication.
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing what the Python social auth project is and your motivation for starting it?
  • Why might someone want to integrate with or rely on a third-party identity provider in their projects?
    • What are some of the tradeoffs or drawbacks of implementing
  • Can you describe the current architecture of the library and how it has evolved since you first began working on it?
  • There are a number of pre-built integrations with different web frameworks in the social auth GitHub organization, but Django is the only one that has seen any commits recently. What are the contributing factors for that state of affairs?
  • There are a number of authentication protocols that you support. What are the common capabilities that they each support and what are some of the more challenging differences between them?
    • How have you implemented the interface for plugging different authentication mechanisms to allow for the variation between them while keeping the library code maintainable?
    • What is involved in adding support for a new authentication provider or protocol?
  • Many times authorization and authentication are conflated or used interchangeably. How does Python social auth address those concerns and what are the limitations of different mechanisms for defining permissions?
  • For someone who is using Python social auth, what is the workflow for integrating it with their application as a consumer?
  • What are some of the most interesting/unexpected/innovative ways that you have seen Python social auth used?
  • What are some of the most interesting/useful/unexpected lessons that you have learned in the process of building and maintaining Python social auth?
  • When is Python social auth more effort than it’s worth?
  • What do you have planned for the future of the project?
Keep In Touch
Picks
Closing Announcements
  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
Links

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA

Building A Business On Building Data Driven Businesses | 20 Jan 2020 | 00:41:27
Summary

In order for an organization to be data driven, it needs easy access to its data and a simple way of sharing it. Arik Fraimovich built Redash to address that need by connecting to any data source and building attractive dashboards on top of it. In this episode he shares the origin story of the project, his experiences running a business based on open source, and the challenges of working with data effectively.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Corinium Global Intelligence, ODSC, and Data Council. Upcoming events include the Software Architecture Conference in NYC, Strata Data in San Jose, and PyCon US in Pittsburgh. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today.
  • Your host as usual is Tobias Macey and today I’m interviewing Arik Fraimovich about Redash, an open source business intelligence platform that helps you make sense of your data.
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing what Redash is and its origin story?
    • What are the primary ways that it is used?
    • The business intelligence market is quite mature and has many commercial and open source projects to choose from. What are the aspects of Redash that have allowed you to be successful?
    • What would you consider to be your closest competitors?
  • What was your background with data before starting on Redash?
    • What are some of the most notable lessons that you have learned about business intelligence since starting the project?
    • How has the landscape for business intelligence and data analysis changed since you began the project?
  • Beyond just accessing data, Redash focuses on enabling visualization of the results. What types of visualizations do you support and how do you support users in choosing the most effective ways to represent the information?
  • What are some of the common challenges that your users and customers encounter when communicating with data?
  • One of the critical aspects of enabling data access in an organization is the ability to collaborate on asking and answering questions. How do you approach that challenge in Redash?
  • How is Redash implemented and how has the overall design and architecture evolved since you first started working on it?
    • How do you manage the complexity of supporting so many different data sources?
    • If you were to start over today, what would you do differently?
  • Beyond the code of Redash, you also have a business around providing it as a hosted service. What are some of the most interesting, challenging, or unexpected lessons that you have learned in the process of building and growing that service?
  • How do you approach the direction and governance of the open source project and balance that against the wants and needs of the community?
  • What are some of the most interesting, innovative, or unexpected ways that you have seen Redash used?
  • When is Redash the wrong platform to use?
  • What do you have planned for the future of the Redash business and project?
Keep In Touch
Picks
Closing Announcements
  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
Links

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA
