Podcast Open||Source||Data by Charna Parkey Episodes

Explore every episode of the podcast Open||Source||Data

Dive into the complete episode list for Open||Source||Data. Each episode is cataloged with detailed descriptions, making it easy to find and explore specific topics. Keep track of all episodes from your favorite podcast and never miss a moment of insightful content.

	Title	Pub. Date	Duration
	Multi-Agent Systems and Human-Agent Collaboration | Rodrigo Nader	01 Jul 2025	00:58:55
In this episode, Charna Parkey welcomes Rodrigo Nader, the founder of Langflow, an open-source, low-code app builder for multi-agent AI systems. Rodrigo and Charna trace his path from his beginnings in a small Brazilian town to the future of AI and the emergence of multi-agent systems. Discover how these systems will enable human-agent collaboration, increase productivity, and solve complex problems across various industries. --- TIMESTAMPS 00:01:00 - Introduction to Rodrigo Nader, CEO and founder of Langflow, and an overview of Langflow's mission and recent developments. 00:03:00 - Rodrigo Nader's background and journey into open-source, data science, and machine learning, including his early experiences with MIT OpenCourseWare and Kaggle. 00:06:00 - Rodrigo's work at Bitvore Corp, focusing on structuring financial data using machine learning, and his introduction to the open-source AI ecosystem. 00:10:00 - The inspiration behind Langflow, including the idea of connecting multiple AI models to create a more powerful, trainable system. 00:15:00 - Discussion on the evolution of AI agents, their decision-making capabilities, and the future of multi-agent systems. 00:18:00 - The role of agents in AI development, the democratization of AI tools, and the potential for community-driven innovation. 00:22:00 - The importance of multi-agent collaboration and the future of human-AI interaction in productivity and task management. 00:26:00 - Common use cases for Langflow, including language model pipelines, RAG (Retrieval-Augmented Generation), and agentic systems. 00:30:00 - Challenges in AI development, particularly debugging and prompt engineering, and the need for better tools to visualize and monitor AI systems. 00:34:00 - Predictions for the future of AI in 2025, including the rise of specialized agents and the importance of human feedback in AI training.
00:38:00 - Rodrigo's personal interests outside of AI, particularly his fascination with physics, quantum mechanics, and the concept of time. 00:42:00 - Final thoughts on the democratization of AI tools, the importance of community contributions, and advice for aspiring developers and AI enthusiasts. 00:46:00 - Reflections with executive producer Leo Godoy, discussing the impact of Langflow, the differences between traditional and AI development, and the rapid pace of AI evolution. Quotes Charna Parkey "For any developer who has sort of avoided the soft skills, the managerial skills, et cetera, you should go listen to some of those courses. You are now going to be managing this AI workforce that you really do need to treat like a team of interns that you're delegating work to, that you're giving feedback on, and all of those skills of sort of like more senior-level engineering of design reviews, code reviews, feedback, like that's gonna be more central than actually writing a line of code yourself." Rodrigo Nader "We're going to see millions and millions more agents than humans very soon, right? So we don't think that these agents are going to emerge from, one, only developers, meaning like hard-code developers, neither from big companies creating solutions that will suddenly solve all the problems."
	Why AI Can’t Scale Without Infrastructure Fixes | Darrick Horton	17 Jun 2025	00:50:55
From energy bottlenecks to proprietary GPU ecosystems, the CEO of TensorWave, Darrick Horton explains why today’s AI scale is unsustainable—and how open-source hardware, smarter networking, and nuclear power could be the fix. QUOTES Darrick Horton “The energy crisis is getting worse every day. It’s very hard to find data center capacity—especially capacity that can scale. Five years ago, 10 or 20 megawatts was considered state-of-the-art. Now, 20 is nothing. The real hyperscale AI players are looking at 100 megawatts minimum, going into the gigawatt territory. That’s more than many cities combined just to power one cluster.” Charna Parkey “We’re still training models in a very brute-force way—throwing the biggest datasets possible at the problem and hoping something useful emerges. That’s not sustainable. At some point, we have to shift toward smarter, more intentional training methods. We can’t afford to be wasteful at this scale.” TIMESTAMPS [00:00:00] Introduction [00:01:00] Founding TensorWave [00:04:00] AMD as a Viable Alternative [00:08:00] Open Source as a Startup Enabler [00:09:30] Launching ScalarLM [00:12:00] ScalarLM Impact and Reception [00:14:30] Roadmap for 2025 [00:16:00] Technical Advantages of AMD [00:18:00] Emerging Open Source Infrastructure [00:20:00] Broader Societal Issues AI Must Address [00:22:00] AI’s Impact on Global Energy [00:26:00] Fundamental Hardware vs. Human Efficiency [00:30:00] Data Center Density Evolution [00:34:00] Advice to Founders and Tech Trends [00:38:00] AI Energy Challenges [00:44:00] AI’s Rapid Impact vs. Internet [00:46:00] Monopoly vs. Democratization in AI [00:50:00] Close to Season Wrap Discussion and Predictions
	Open Source AI and Copyright: Building Ethical Models | Kent Keirsey	11 Feb 2025	01:10:19
Kicking off Open Source Data Season 7, Charna Parkey welcomes the CEO and Founder of Invoke, Kent Keirsey, to discuss his thoughts on licensing, copyright in generative AI, and the role of communities in building ethical, free-to-use technologies that can democratize technology and inspire global innovation. Quotes Kent Keirsey "When we look at open source models, if you just release the weights, and you don't really release information on how the data set was captioned, for example, or how you construct the data set, if you don't really know how it got to the artifact that was released, as a user, you do not understand how it works." Charna Parkey "But there's still a lot of claims by big tech right now about how anything on the internet should be fair use for training, even if, you know, it might have its own kind of copyright." Timestamps [00:02:00] - Kent Keirsey on his journey to open source [00:06:00] - Kent Keirsey on the Open Model Initiative (OMI) [00:08:00] - What makes a model truly open source [00:12:00] - The legal landscape of AI and copyright [00:14:00] - Kent Keirsey on the ethical implications of AI training data, fair use, and AI development [00:26:00] - Creativity, AI tools, personal AI models and recommendation algorithms [00:32:00] - Kent Keirsey on TikTok and cultural clash [00:38:00] - AI, self-reflection and a decision-making tool [00:42:00] - The Bria AI partnership [00:52:00] - The future of creativity, AI and Robotics [01:00:00] - Final thoughts with producer Leo Godoy Connect with Kent Keirsey Connect with Charna Parkey
	Building Trust in AI: From Open Source to Global Impact with host, Charna Parkey	08 Oct 2024	00:44:03
Join Charna Parkey as she recaps a transformative year in AI, exploring the delicate balance between innovation and ethics. From open source communities to global regulations, discover how trust, diversity, and collaboration are shaping the future of technology.
	AI Regulations in Financial Services with Vinay Kumar	24 Sep 2024	00:54:05
Vinay Kumar discusses the transformation of AI in banking and financial services, covering the challenges and solutions around regulatory compliance and model explainability under the financial industry's stringent requirements. Episode Quotes Vinay Kumar "I always believe in this: you don't need to solve a very large problem. Maybe it will take a lot of time to do that. A lot of resources to do that but something small, which you can have an opportunity to solve that could be very big or a fundamental for quite a bit is fantastic. Think of a scenario where your small fundamental idea is a base for another small fundamental idea for someone else." Charna Parkey "We also want to ground it a little bit in impact we've been seeing. And I think in the financial, banking, insurance industries it's not, I would say, an even distribution of advancement. Different countries have different regulations and different appetites for risk." Timestamps - [00:00:00] Introduction by Charna Parkey. - [00:01:57] Vinay Kumar begins talking about his journey. - [00:05:27] Discussion on building a search engine for STEM researchers. - [00:07:06] Challenges with early deep learning. - [00:09:55] Conversation shifts to ML observability. - [00:17:06] Discussion on simplifying verticalized AI. - [00:22:30] Impact of large language models (LLMs) on AI. - [00:30:58] Comparison of autonomous cars with AI regulation. - [00:37:58] Vinay mentions his science fiction novels. - [00:42:19] Conversation summary with Producer Leo Godoy.
	The importance and the Challenges & Solutions of AI Literacy with Brian Magerko	13 Aug 2024	00:54:19
Quotes Brian Magerko “We're really trying to show that we could co-create experiences with AI technology that augmented our experience rather than served as something to replace us in creative act”. “For every project like [LuminAI], there's a thousand companies out there just trying to do their best to get our money... That's an uncomfortable place to be in for someone who has worked in AI for decades”. “I had no idea what was going to happen kind of in the future. When we started EarSketch... we were advised by a couple of colleagues to not do it. And here we are, having engaged over a million and a half learners globally”. Charna Parkey "I remember the first robot that I built. It was part of the first robotic systems... and watching these machines work with each other was just crazy." “If you're building a product and your goal is to engage underrepresented groups, it is on you to make sure that you're educating the folks in a way that you're trying to reach.” Episode timestamps (01:11) Brian Magerko's Journey into AI and Robotics (05:00) LuminAI and Human-Machine Collaboration in Dance (09:00) Challenges of AI Literacy and Public Perception (17:32) Explainable AI and Accountability (20:00) The Future of AI and Its Impact on Human Interaction (22:10) EarSketch and learning: computing as a meaningful concept (27:18) The need for interdisciplinary collaboration to ensure AI developments are beneficial for society as a whole. (30:02) Brian Magerko's next reshape of the future, better understanding models of collaboration and improvisation between people and computers (35:51) Brian Magerko's advice to researchers based on his own identity and experiences (44:20) Projects and updates related to EarSketch and LuminAI’s improvisation model. (46:24) Backstage with Executive Producer Leo Godoy
	Demystifying AI Governance: A Practical Guide for Organizations with Heather Domin	30 Jul 2024	00:47:44
As AI becomes increasingly integrated into business operations, having robust governance structures in place is no longer optional. But what does effective AI governance look like in practice? In this episode, Dr. Heather Domin, a leading expert in AI ethics and governance, breaks down the key components of a successful AI governance framework. Heather guides us through the opportunities and challenges presented by this transformative technology. Learn about the importance of responsible adoption practices, the role of governance structures, the need for ongoing feedback loops and how to align AI initiatives with organizational values, establishing clear accountability, and creating a culture of responsible innovation. Timestamps 00:00:00 - 00:01:23 - Introduction 00:01:23 - 00:04:30 - Heather Domin's Journey 00:09:50 - 00:12:48 - Open Source and AI Ethics 00:12:48 - 00:15:25 - Generative AI and Governance 00:23:40 - 00:26:22 - Future of Responsible AI Practices 00:35:37 - 00:37:31 - Advice for the Audience 00:37:31 - 00:46:04 - Reflection on Risk and Hope in AI Quotes Heather Domin "I think that each of us individually can scan our environment and understand, you know, where can I make an impact? What problem can I help solve? What is the next thing that I can really contribute to?" "There are absolutely ways to automate, you know, the prompt testing and many of the routine tasks that you want to leverage automation in that way so that you can actually have the humans focus on other, other things so they can focus on the critical thinking and outside the box sort of thinking that we want the humans to be focused on." Charna Parkey "I think that it's hard for people getting into it for the first time to jump to hope if they've experienced something that they should fear in the past.
By that, I mean, groups that have been marginalized by other forms of technology are not going to start hopeful with this new one that is using their data without their permission." "If for some reason I came to understand in a month what that meant, I should be able to go back and revoke and be like, nope, I actually don't want you to have that anymore. So I think that that would help people feel better." Check Heather's paper: On the ROI of AI Ethics and Governance Investments Connect with Heather Connect with Charna
	Transforming Food Systems with Regenerative AI with Ethan Soloviev	16 Jul 2024	01:00:55
Ethan Soloviev, Chief Innovation Officer at HowGood, reveals how generative AI can revolutionize the food and agriculture industry. Discover the potential of AI to create a regenerative, sustainable, and net-positive food system that benefits the planet and all living beings. Timestamps 1. Introduction and Background (00:00:00 - 00:01:16) 2. Ethan's Journey (00:01:16 - 00:05:12) 3. The Role of Food and Agriculture (00:05:12 - 00:06:52) 4. Investment in Regenerative Agriculture and Generative AI (00:06:52 - 00:07:44) 5. Levels of AI Impact (00:07:44 - 00:12:42) 6. HowGood's Use of AI (00:12:42 - 00:13:20) 7. Consumer Impact and Corporate Responsibility (00:13:20 - 00:15:44) 8. Future of AI in Food Systems (00:15:44 - 00:20:30) 9. Innovative Perspectives on AI Training (00:20:30 - 00:21:10) 10. Action models in agriculture, optimizing water and soil use on a larger scale. (00:24:14 - 00:25:28) 11. Discussion on integrating human cultural geography into AI models. (00:27:37 - 00:30:00) 12. Charna and Ethan discuss procurement decisions and their impact on sustainability. (00:30:20 - 00:40:15) 13. The ethical implications of AI in corporate and government decision-making. (00:42:01 - 00:54:31) 14. Leo brings up the impact of AI on consumers, discussing how AI can change purchasing decisions by highlighting product sustainability. (00:54:40 - 00:55:30) 15. Charna elaborates on using AI to understand different business models and how generational changes affect consumer choices. (00:55:47 - 00:57:32) Quotes Ethan Soloviev "What if we're using ecological data? What if we're training on trees and insects and animals and whale song? What kind of questions would a gen AI trained on whale song and hummingbird language ask us?" Charna Parkey "If we have this great translator that is Gen AI, we already have text and language to code. We can do code generation. We can already interpret this code and tell me what it's going to do. Take that code to language.
Why can't we do that with some of these other senses and these other measurements?" Connect with Ethan Connect with Charna
	Redefining AI Ethics: The Key Role of Explainability with Beth Rudden	02 Jul 2024	00:53:18
Beth Rudden, recognized as one of the 100 most brilliant leaders in AI ethics, discusses the crucial role of explainability and traceability in building trustworthy AI systems. She shares how Bast AI is using ontologies and knowledge graphs to provide contextual relevance and understanding, enabling humans to fully trust artificial intelligence and allowing such systems to transform fields like education and healthcare. Timestamps 00:00:00 - Intro 00:02:00 - Beth’s Journey 00:19:33 - Ontologies in AI 00:21:44 - Data Lineage and Provenance 00:32:52 - Open Source Tools 00:38:38 - Explainable AI 00:44:58 - Inspiration from Nature Quotes Beth Rudden: "The best thing that I could tell you that I see is that it's going to shift from more pure mathematical and statistical to much more semantic, more qualitative. Instead of quantity, we're going to have quality." Charna Parkey: "I love that because I've been so mathematical for most of my life. I didn't have a lot of words for the feelings or expressions, right? And so I had sort of this lack of data and the Brené Brown reference you make, like I have many of her books on my shelf and I often pull, I don't even know where it is right now, but the Atlas of the Heart because I am having this feeling and I don't know what it is." Links Connect with Beth Connect with Charna
	Eliminating AI Bias Through Inclusive Data Annotation with Andrea Brown	18 Jun 2024	00:45:56
Learn how Andrea Brown, CEO of Reliabl, is revolutionizing AI by ensuring diverse communities are represented in data annotation. Discover how this approach not only reduces bias but also improves algorithmic performance. Andrea shares insights from her journey as an entrepreneur and AI researcher. Episode timestamps (02:22) Andrea's Career Journey and Experience with Open Source (Adobe, Macromedia, and Alteryx) (11:59) Origins of Alteryx's AI and ML Capabilities / Challenges of Data Annotation and Bias in AI (19:00) Data Transparency & Agency (26:05) Ethical Data Practices (31:00) Open Source Inclusion Algorithms (38:20) Translating AI Governance Policies into Technical Controls (39:00) Future Outlook for AI and ML (42:34) Impact of Diversity Data and Inclusion in Open Source Quotes Andrea Brown "If we get more of this with data transparency, if we're able to include more inputs from marginalized communities into open source data sets, into open source algorithms, then these smaller platforms that maybe can't pay for a custom algorithm can use an algorithm without having to sacrifice inclusion." Charna Parkey “I think if we lift every single platform up, then we'll advance all of the state of the art and I'm excited for that to happen." Connect with Andrea Connect with Charna
	Regulation's Role in Driving Responsible AI with Asa Whillock	04 Jun 2024	00:58:16
In this week’s episode, Charna welcomes Asa Whillock, the VP & GM Machine Learning and Artificial Intelligence at Alteryx. Asa shares a surprising perspective on AI regulation, explaining how it sets a baseline for responsible practices. Discover why he believes regulation is crucial in guiding the ethical development and deployment of AI and learn the importance of continuous learning and what the past can teach us about navigating the challenges and opportunities of AI today. Episode timestamps (01:47) Asa Whillock's career journey at market-leading companies and the role of open source in each (Adobe, Macromedia, Alteryx) (04:56) Feature Labs acquisition by Alteryx and its open source roots in democratizing machine learning capabilities (11:00) Survey findings on enterprise board members' perspectives on AI and the need to move beyond policy creation to implementation and governance. (27:00) Applying AI capabilities and decision-making related to AI (30:00) The future of AI predominance, including cost reduction, open source model advancements, and the push for demonstrating business value (43:33) Advice for navigating AI expertise and decision-making, including continuous learning, self-awareness of decision-making models, and acknowledging knowledge limits Quotes Asa Whillock "I love regulation. I think it's great. And people are like, what? Why would you say that? And the reason why I say that is because I think it puts a floor underneath all of us of what do we think good looks like?" Charna Parkey "I think we need to, as a community, focus on meeting them where they are if we really want the democratization that is promised. Yeah, I don't know any other way to do it."
	Transforming Client Experience with AI with Robbi Armstrong	21 May 2024	01:00:25
Join Charna Parkey as she interviews Robbi Armstrong, AI Products and Strategy Director at KeyBank. Discover how this $190 billion bank is navigating the rapidly evolving landscape of generative AI, balancing the need for innovation with the challenges of managing risk in a heavily regulated industry. Explore the impact of KeyBank's virtual assistant, MyKey, on client experience. With nearly 70% repeat usage, MyKey seamlessly transfers clients to contact center agents, providing a warm handoff that includes authentication and chat context. Episode Timestamps (02:11): Robbi Armstrong's role at KeyBank and intersection with open source and AI initiatives in the financial industry (04:06): Compliance and regulatory trends in AI for banking (12:10): Organizational Change Management with AI (28:00): Responsible and Ethical AI (37:00): Financial Literacy and AI Quotes Robbi Armstrong “I truly believe that if you are an organization and you are sitting back and you're not organizing a team and you're not organizing a program and you're not learning, you're not looking at education, you're not looking at change management around Gen AI, I don't think you'll be here in two years. I really truly believe that. Because you won't be able to compete." Charna Parkey “I think the democratization is real and I think it's incredibly important because that step in between the domain expert and the technology is very lossy. You know, oftentimes we say, well, if only I had the data to answer your question let me give you a different answer or let me answer it completely and now we can actually put it in the hands of the experts and say, well, oh, then let's go collect that data." Links Connect with Robbi Connect with Charna
	Building Open-Source LLMs with Philosophy | Anastasia Stasenko	03 Jun 2025	00:57:45
Join Charna Parkey as she welcomes Anastasia Stasenko, CEO and co-founder of pleias, who traces her unique journey from philosophy to building open-source, energy-efficient LLMs. Discover how pleias is revolutionizing the AI landscape by training models exclusively on open data and establishing a precedent for ethical and socially acceptable AI. Learn about the challenges and opportunities in creating multilingual models and contributing back to the open-source community. TIMESTAMPS [00:00:00] Introducing Anastasia and pleias [00:02:00] From Philosophy to AI [00:06:00] The Problem of Generic Models [00:10:00] Open Weights vs. Open Source vs. Open Science [00:14:00] Why Open Data Matters [00:18:00] High-Quality, Specialized Models [00:22:00] Multilingual Challenges [00:26:00] Global Inclusion Requires Small Models [00:30:00] Using and Contributing to Wikidata [00:38:00] The Future: Specialized Models [00:48:00] Advice for Newcomers [00:54:00] Cultural Sensitivity and Data Representation [00:50:00] Leo’s Takeaways [00:52:00] Charna on Ethical, Verifiable AI [00:54:00] Representation vs. Exclusion [00:56:00] Letting People Be More Human [00:57:30] Applied, Transformative AI QUOTES Charna: "If you didn’t make it represented in the data, then we’re leaving another culture behind... So which one are you wanting to do, misrepresent them or just completely leave them behind from this technical revolution?" Anastasia: "The real issue now is that the lack of diversity in the current AI labs leads to the situation where all LLMs look alike." Anastasia: "Being able to design, to find, and also to create the appropriate data mix for large language models is something that we shouldn't really forget about when we talk about the success of what large language models are."
	Navigating Open Source Talent, AI & Policy Challenges with Amanda Brock	07 May 2024	00:40:16
Amanda Brock's path began with picking potatoes at 8 years old. Now, she's the CEO of OpenUK, advocating for open source across the UK. In this insightful interview, Brock shares her journey into open source law and policy. She dives into OpenUK's latest research on the state of open technology in Britain, talent challenges, and the economic impact of open source contributions. Brock also unpacks key discussions from State of OpenCon 2024 on open data, generative AI, and balanced regulation. Episode timestamps (05:06): State of open source in the UK (07:22): Importance of open source community (15:19): Balancing openness and regulation in AI (21:19): Pace of technological development and regulation (28:21): Reliability and discernment with AI outputs (35:24): Universal advice Quotes Amanda Brock “I think the governments that are going to win, the governments that are going to have the best regulation that promotes most innovation are going to be the ones which are able to make their regulatory environment flow in the same way as the technology evolution and innovation flows." Charna Parkey "I think the expectation needs to change. Part of what has happened with, you know, literal text search or keyword search and just Google and things like that, is that the average person expects what comes back to be relatively factual. That it's been referenced and, you know, backlinked, etc. That's a deterministic system. These are not. These are based upon statistical likelihoods of what word should come next." Links Connect with Charna Connect with Amanda
	Using AI to Impact Performance Feedback Equity with Tacita Morway	23 Apr 2024	00:48:01
Dive into the world of purposeful AI with Tacita Morway, CTO of Textio. Learn how Textio ensures their AI is built responsibly and ethically to transform the way teams communicate, hire, and measure their health. Discover their rigorous testing processes and the importance of having a diverse team to catch potential risks and how that helps the company develop strategies for avoiding bias and maintaining data privacy. Episode timestamps (02:15): Tacita's unconventional career path to becoming a CTO (07:00): Textio's practices for building AI responsibly and ethically (14:00) The impact of Textio's AI on performance feedback (17:00) The importance of purpose-built vs generic AI models (28:00) Balancing open source and proprietary data/models (42:00) Advice for the AI industry moving forward Quotes Tacita Morway “When you've got a team with different backgrounds, educational, lived experiences, identity, careers, all of those things, we have those different perspectives in the room. And we're all working off of the same expectations. We can catch each other's gaps.” Charna Parkey “There's an interesting conversation happening, I think, in the community right now about these purpose-built LLMs. Are they as good as generic LLMs? Sure, certainly if you're not going to apply something purpose-built to something generic or outside of its domain, it is not as good. But I think some of this shows us that unless you have something purpose-built and unless you're leveraging the data in the right way, you may just be feeding noise back into the system.” Links Connect with Tacita Connect with Charna
	The Ethical Path to High-Quality AI Data with Fabiana Clemente	09 Apr 2024	00:50:12
How can we accelerate AI while protecting privacy? Fabiana Clemente discusses founding YData to enable high-quality synthetic data for machine learning. She covers open sourcing data profiling tools, the impact of generative AI on synthetic data, and maintaining work-life balance as an introvert leader. Timestamps (00:02:29) Fabiana's journey starting YData and becoming a public speaker (00:20:19) Misconceptions and hype around generative AI and AGI (00:32:46) Potential real-world impact and use cases of LLMs today (00:34:55) The role of synthetic data in making AI models more robust and fair (00:43:55) Advice for founders: value your time and learn to say no (00:48:24) The importance of technical leaders being able to communicate well Quotes Charna Parkey: "It's a balance. I think that's also what led us to some of the demographic based data science. Essentially, folks were making like event data into pre-aggregated data. And then they were trying to obscure it so much that you couldn't get back to the person. And so you're like, okay, what's their age and what's their gender? And you're like, that's not actually the most useful part of data science that can't predict behavior or intent or any of that. It throws out time as a component of the entire process, seasonality, everything. And so there just, there has to be a better way." Fabiana Clemente: "I have to say, that's a very beautiful way to put it. Hallucinations, I have to say. I never thought about that. And it makes a lot of sense. I do think, though, that in terms of LLMs, it's so language, it's so definitely, it sounds like we are getting very, very intelligent system, exactly, because language is very complex. And we know that was needed for the leap of humanity. I do think there are other, the sense of combining. Well, and here we enter in the multimodal kind of space. It's what's missing." Links Connect with Charna Connect with Fabiana
	Disrupting Data Analysis with Avi Press	26 Mar 2024	00:50:35
Join host Charna Parkey as she sits down with Scarf’s CEO and Founder Avi Press in a riveting exchange about his pioneering journey into the world of open source with Scarf. Learn how Avi challenges conventional data analytics and collection, aiming to reshape industry standards through the power of open source. A conversation that delves into altering analytics norms, innovative monetization strategies, and the exploration of alternative licenses like BSL. Avi’s insights offer a unique perspective on the transformative role of open source in driving data analytics forward, fostering community engagement, and encouraging transparent development. Episode timestamps (02:15): Challenges of collecting open source usage data (22:06): Driving impact with open source usage data (28:27): Avi's entrepreneurial journey (39:42) Persistence and vision in startups (44:03) Tracking outcomes to stay motivated Quotes Avi Press “I mean, one thing is, for any project that you might be thinking about doing or any initiative that you want to work on or goal that you have, I think there's a lot of power in just trying the thing. You may not have all the details figured out, but just try it anyway and see where it takes you. And I think a lot of projects that I've ever worked on that led anywhere, I didn't know all these details, but I just start trying and seeing what works anyway and being very open to it not working out, but attempting it anyway. And then the other thing, which is I think admittedly fitting into our agenda at Scarf, but it is something that I really believe, which is that for any of these things you're doing, tracking the outcomes of that thing is very, very important and will both be tactically helpful, but also I think, like you said, give you these inspirational moments that keep you going, whether that's awe or inspiration or fulfillment or whatever that feeling is that helps you keep going. 
I think that tracking the outputs of your work such that you can understand the impact that you have is both very strategic and the most rewarding way to do anything, I think”. Charna Parkey “Given the venture-backed nature of a lot of these startups, there's going to have to be some sort of monetization at some point. You're not gonna have 1 million, 10 million, 40 million dollars dumped into just giving software away for free. So sort of these misaligned motivations are certainly what raised my hackles where I'm like, oh, you're claiming forever or you're claiming that you're like a values-driven organization, but you're venture-backed and you need to make money. And so show me how those motivations align or misalign. Tell me what your monetization strategy is gonna be. I know you need one. That way I'm not wondering, should I use this? Should I not?” Links Connect with Charna Connect with Avi
	Tech, Trust, and Transformation with Paula Paul	12 Mar 2024	00:53:58
On today’s episode of Open Source Data, Charna Parkey chats with tech veteran Paula Paul, exploring her remarkable 40-year journey in the technology sector. Starting at 16, Paula navigated through pivotal tech revolutions and embraced the essence of open source and community. Delve into Paula's world of coding on tape, the evolution of technology, and how communities foster growth, innovation, and trust. Discover the impact of open source in shaping technology and professional paths. Paula also sheds light on personal growth, community's pivotal role in professional mobility, and offers invaluable advice to aspiring tech professionals. A captivating look at the intersections of technology, community, and open source through the lens of an industry pioneer. Timestamps 00:00 - Intro 05:10 - Paula’s Professional Journey 10:30 - What Inspired Paula to Go Through the Open Source Path 14:50 - What are some of the biggest challenges and impacts that Paula sees in companies trying to derive value? 23:30 - Is the Tech World a Meritocracy? 25:35 - A Shift Of What is a Tech Company? 27:30 - Kids Interacting with New Technologies 31:30 - What Does Open Source Data Mean to Paula? 42:50 - What is a Question that Paula has never been asked before? 47:00 - What Advice would you give to the audience? 51:50 - Backstage with Executive Producer Leo Godoy Quotes: Charna Parkey “I think from my side, as the applications we build change, then some of those backing technologies have to. Where databases used to be used by expert-like database administrators and you needed to have like data architects to your data model and you had to do all of these very, very specific things. And now we have this Gen AI moment and all of a sudden all of these specialized vector databases, NoSQL databases, etc., need to be used by an average developer. So they just want an API and it has to work and it has to be fast.
And so, over these different moments, different technologies came about or were evolved, but I think it might be the application that's actually driving the change instead of the technology itself opening”. Paula Paul “It still surprises people to hear that 90% of any given modern application is open source and then there's 10% custom code that, depending on your company, you own or not. And it just still amazes me that we have these open source projects like jQuery is a project of the OpenJS Foundation and it's in a tremendous amount of our ecommerce infrastructure. But it's a project that's maintained by a very small team of contributors. And, you know, if this were a commercial product, it would be like a $1,000,000,000 company. (...) The piece of work being done by the new foundation to help make sure that we have the healthy web and that it's secure is really important, because people, if I say Log4j, people that remember those days know how important it is to keep security vulnerabilities addressed. And that's a concern for me, that people don't pay more attention to this. I mean, if you had a commercial software product, you typically would pay 20% a year in maintenance fees. But as many of us know, sometimes you find a bug and you would just report the bug, but it might take years for that bug to get fixed in a commercial release. Whereas if it's open source, there are people out there who can jump on it. But it's really crazy that there's no funding for that or no public works through the government, given all the dependence and dependencies that we have on these open source assets.” Links LinkedIn - Connect with Charna LinkedIn - Connect with Paula
	An Innovative Approach to AI & NLP with Milos Rusic	27 Feb 2024	00:45:30
Starting the new season of Open Source Data, our new host Charna Parkey welcomes the CEO and Co-founder of deepset, Milos Rusic. With an impressive journey around NLP and AI, pioneering several areas in the Open Source field, Milos has revolutionized data search processes and brought about a new era of user-friendly and efficient enterprise search systems. Charna also shares some common ground with Milos when talking about joining an NLP Startup in 2015-16, predictive maintenance and more. Don’t miss it!
	New Beginnings: Open\|\|Source\|\|Data in Transition	20 Dec 2023	00:50:14
This episode features an interview with Charna Parkey, Real-Time AI Product and Strategy Leader at DataStax. Charna has been developing AI and ML products over the last 17 years and has worked with 90 of the Fortune 100 in her various roles. She is also a co-author and inventor on several patents. In this episode, Sam and Charna discuss handing over the role as host, Sam’s new startup journey, and how their thinking has evolved during the explosion of LLMs. ------------------- “Now, it seems like we have this opportunity where the conversation and the place that society is at is different. Where we want to contribute to the right set of data when we talk open source data. We want to make sure that we have the right data to train this model in order to get the right outcome. We want to provide a lens of, ‘All right, you are this persona. How would you say this thing?’ I do think that from a lot of what the LLMs have today, the outcome of those words are still missing. And we need to solve that. Like, ‘Is this piece of writing actually going to achieve the outcome I want versus am I following legal's guidelines? Am I technically correct? Is my CEO going to like it?’ That doesn't mean you're achieving impact in the world. There's an aspect there where we've given feedback loops, it seems, to be like, ‘Did I like the answer or not?’ But not, ‘Did I take an action?’ As we get to autonomousness, we're going to have to have an outcome or multiple outcomes associated with the reward of the system.” – Charna Parkey “I personally believe that all cognition is bias. My degree is in cognitive science. One of the things that we trained on is attention. And to pay attention, literally means to selectively choose what data is coming in from the world that you're going to pay attention to and what you're going to discard. Which is also, to me, the definition of bias. All cognition is bias, but what do we care about? Do you trust this thing? What does that mean? 
Well, do you trust it to do these particular actions to a level of consistency in this particular domain? It doesn't mean that you're going to trust it in all environments. There's a lot more nuance that hopefully will evolve in this strange age of nuanced destruction machines.” – Sam Ramji ------------------- Episode Timestamps: (01:04): Sam and Charna catch up (06:05): Sam explains his new company, Sailplane (14:21): How Charna’s thinking has evolved during the LLM explosion (25:45): Sam’s thoughts after 5 seasons of Open\|\|Source\|\|Data (38:52): What Charna is looking forward to in the next season of the podcast (40:44): A question Sam wishes to be asked (45:45): Backstage takeaways with executive producer, Audra Montenegro ------------------- Links: LinkedIn - Connect with Charna LinkedIn - Connect with Sam Learn more about Sailplane
	The Intersection of Open Source and AI with Stefano Maffulli & Stephen O’Grady	13 Dec 2023	00:55:40
This episode features a panel discussion with Stefano Maffulli, Executive Director of the Open Source Initiative (OSI); and Stephen O’Grady, Co-founder of RedMonk. Stefano has decades of experience in open source advocacy. He co-founded the Italian chapter of Free Software Foundation Europe, built the developer community of the OpenStack Foundation, and led open source marketing teams at several international companies. Stephen has been an industry analyst for several decades and is author of the developer playbook, The New Kingmakers: How Developers Conquered the World. In this episode, Sam, Stefano, and Stephen discuss the intersection of open source and AI, good data for everyone, and open data foundations. ------------------- “Internet Archive, Wikipedia, they have that mission to accumulate data. The OpenStreetMap is another big one with a lot of interesting data. It's a fascinating space, though. There are so many facets of the word ‘data.’ One of the reasons why open data is so hard to manage and hasn't had that same impact of open source is because, like Stephen, the stories that he was telling about the startups having a hard time assembling the mixing and matching, or modifying of data has a different connotation. It's completely different from being able to do the same with software.” – Stefano Maffulli “It's also not clear how said foundation would get buy-in. Because, as far as a lot of the model holders themselves, they've been able to do most of what they want already. What's the foundation really going to offer them? They've done what they wanted. Not having any inside information here, but just judging by the fact that they are willing to indemnify their users, they feel very confident legally in their stance. 
Therefore, it at least takes one of the major cards off the table for them.” – Stephen O’Grady ------------------- Episode Timestamps: (01:44): What open source in the context of AI means to each guest (16:21): Stefano explains OSI’s opportunity to shine a light on models and teams (21:22): The next step of open source AI according to Stephen (25:38): Creating better definitions in order to modify software (33:09): The case of funding an open data foundation (42:31): The future of open source data (51:54): Executive producer, Audra Montenegro's backstage takeaways ------------------- Links: LinkedIn - Connect with Stefano Visit Open Source Initiative LinkedIn - Connect with Stephen Visit RedMonk
	Throwback: The AI-Native Stack with Mikiko Bazeley, Zain Hasan, and Tuana Celik	15 Nov 2023	00:57:37
This episode features a panel discussion with Mikiko Bazeley, Head of MLOps at Featureform; Zain Hasan, Senior Developer Advocate at Weaviate; and Tuana Celik, Developer Advocate at deepset. In this episode, Mikiko, Zain, and Tuana discuss what open source data means to them, how their companies fit into the AI-first ecosystem, and how jobs will need to evolve with the AI-native stack. ------------------- “We're almost part of a fancy new AI robot kitchen that you'd find in Tokyo, in some ways. I see a virtual feature store as, yes, you can have a bunch of your ingredients tossed into a closet. Or, what you can do is you can essentially have a nice way to organize them. You can have a way to label them, to capture information.” – Mikiko Bazeley “I really like that analogy as well. I like how Mikiko put it where a vector search engine is really extracting value from what you've already got. [...] So where I see vector search engines, really, is if we think of these embedding providers as the translators to take all of our unstructured data and bring it into vector space into a common machine language, vector search engines are essentially the workhorses that allow us to compute and search over these objects in vectorized format. They're essentially the calculators of the AI stack.” – Zain Hasan “Haystack, I would really position as the kitchen. I need Mikiko to bring the apples. I need Zain to bring the pears. I need Hugging Face or OpenAI to bring the oranges to make a good fruit salad. 
But, Haystack will provide the spoons and the pans and the knives to make that into something that works together.” – Tuana Celik ------------------- Episode Timestamps: (02:58): What open source data means to the panelists (09:11): What interested the panelists about AI/ML (24:10): Mikiko explains Featureform (27:00): Zain explains Weaviate (30:23): Tuana explains deepset (36:00): The panelists discuss how their companies fit into the AI-first ecosystem (44:58): How jobs need to evolve with the AI-native stack (54:35): Executive producer, Audra Montenegro's backstage takeaways ------------------- Links: LinkedIn - Connect with Mikiko Visit Featureform LinkedIn - Connect with Zain Visit Weaviate LinkedIn - Connect with Tuana Visit deepset Visit Data-centric AI
	How We Should Think About Data Reliability for Our LLMs with Mona Rakibe	01 Nov 2023	00:38:17
This episode features an interview with Mona Rakibe, CEO and Co-founder of Telmai, an AI-based data observability platform built for open architecture. Mona is a veteran in the data infrastructure space and has held engineering and product leadership positions that drove product innovation and growth strategies for startups and enterprises. She has served companies like Reltio, EMC, Oracle, and BEA where AI-driven solutions have played a pivotal role. In this episode, Sam sits down with Mona to discuss the application of LLMs, cleaning up data pipelines, and how we should think about data reliability. ------------------- “When this push of large language model generative AI came in, the discussions shifted a little bit. People are more keen on, ‘How do I control the noise level in my data, in-stream, so that my model training is proper or is not very expensive, we have better precision?’ We had to shift a little bit that, ‘Can we separate this data in-stream for our users?’ Like good data, suspicious data, so they train it on little bit pre-processed data and they can optimize their costs. There's a lot that has changed from even people, their education level, but use cases also just within the last three years. Can we, as a tool, let users have some control and what they define as quality data reliability, and then monitor on those metrics was some of the things that we have done. That's how we think of data reliability. 
Full pipeline from ingestion to consumption, ability to have some human’s input in the system.” – Mona Rakibe ------------------- Episode Timestamps: (01:04): The journey of Telmai (05:30): How we should think about data reliability, quality, and observability (13:37): What open source data means to Mona (15:34): How Mona guides people on cleaning up their data pipelines (26:08): LLMs in real life (30:37): A question Mona wishes to be asked (33:22): Mona’s advice for the audience (36:02): Backstage takeaways with executive producer, Audra Montenegro ------------------- Links: LinkedIn - Connect with Mona Learn more about Telmai
	Democratizing Cloud Infrastructure \| Kevin Carter	20 May 2025	00:59:19
Discover how Rackspace Spot is democratizing cloud infrastructure with an open-market, transparent option for cloud servers. Kevin Carter, Product Director at Rackspace Technology, discusses Rackspace Spot's hypothesis and the impact of an open marketplace for cloud resources. Discover how this novel approach is transforming the industry. TIMESTAMPS [00:00:00] – Introduction & Kevin Carter’s Background [00:02:00] – Journey to Rackspace and Open Source [00:04:00] – Engineering Culture and Pushing Boundaries [00:06:00] – Rackspace Spot and Market-Based Compute [00:08:00] – Cognitive vs. Technical Barriers in Cloud Adoption [00:10:00] – Tying Spot to OpenStack and Resource Scheduling [00:12:00] – Product Roadmap and Expansion of Spot [00:16:00] – Hardware Constraints and Power Consumption [00:18:00] – Scrappy Startups and Emerging Hardware Solutions [00:20:00] – Programming Languages for Accelerators (e.g., Mojo) [00:22:00] – Evolving Role of Software Engineers [00:24:00] – Importance of Collaboration and Communication [00:28:00] – Building Personal Networks Through Open Source [00:30:00] – The Power of Asking and Offering Help [00:34:00] – A Question No One Asks: Mentors [00:38:00] – The Power of Educators and Mentorship [00:40:00] – Rackspace’s OpenStack and Spot Ecosystem Strategy [00:42:00] – Open Source Communities to Join [00:44:00] – Simplifying Complex Systems [00:46:00] – Getting Started with Rackspace Spot and GitHub [00:48:00] – Human Skills in the Age of GenAI - Post Interview Conversation [00:54:00] – Processing Feedback with Emotional Intelligence [00:56:00] – Encouraging Inclusive and Clear Collaboration QUOTES CHARNA PARKEY “If you can’t engage with this infrastructure in a way that’s going to help you, then I guarantee you it’s not up to par for the direction that we’re going. [...] This democratization — if you don’t know how to use it — it’s not doing its job.” KEVIN CARTER “Those scrappy startups are going to be the ones that solve it. 
They’re going to figure out new and interesting ways to leverage instructions. [...] You’re going to see a push from them into the hardware manufacturers to enhance workloads on FPGAs, leveraging AVX 512 instruction sets that are historically on CPU silicon, not on a GPU.”
	Throwback: Open Source Innovation, The GPL for Data, and The Data In to Data Out Ratio with Larry Augustin	18 Oct 2023	00:40:57
This episode features an interview with Larry Augustin, angel investor and advisor to early-stage technology companies. Larry previously served as the Vice President for Applications at AWS, where he was responsible for application services like Pinpoint, Chime, and WorkSpaces. Before joining AWS, Larry was the CEO of SugarCRM, an open source CRM vendor. He also was the founder and CEO of VA Linux, where he launched SourceForge. Among the group who coined the term “open source”, Larry has sat on the boards of several open source and Linux organizations. In this episode, Sam and Larry discuss who owns the rights to data, the data in to data out ratio, and why Larry is an open source titan. ------------------- "People are willing to give up so much of their personal information because they get an awful lot back. And privacy experts come along and say, ‘Well, you're taking all this personal information’. But then most people look at that and say, ‘But I get a lot of value back out of that.’ And it's this data ratio value question, which is: for a little in, I get a lot back. That becomes a key element in this. And I think there has to be some kind of similar thought process around open source data in general, which is if I contribute some data into this, I'm going to get a lot of value back. So this data in to data out ratio, I think it's an incredibly important one. And it gets everyone in the mindset of, ‘How do I provide more and more and take less and less?’ It's a principle of application development that I like a lot. And I think there's a similar concept here around open source data. Are there models or structures that we can come up with where people can contribute small amounts of data and as a result of that, they get back a lot of value.” – Larry Augustin ------------------- Episode Timestamps: (02:52): How Larry is spending his time now after AWS (06:25): What drove Larry to open source (18:41): What is the GPL for data? 
(24:28): Areas of progress in open source data (28:57): The data in to data out ratio (36:39): Larry’s advice for folks in open source ------------------- Links: LinkedIn - Connect with Larry Twitter - Follow Larry
	Reframing Machine Learning and AI-Assisted Development with Jorge Torres	27 Sep 2023	00:45:11
This episode features an interview with Jorge Torres, Co-founder and CEO of MindsDB. MindsDB is a virtual AI database that works with existing data to help developers build AI-centered apps. In 2008, Jorge began his work on scaling solutions using machine learning as the first full-time engineer at Couchsurfing, growing the company from a few thousand users to a few million. He has also served a number of data-intensive start-ups and was a visiting scholar at UC Berkeley researching machine learning automation and explainability. In this episode, Sam and Jorge discuss the inspiration and challenges behind MindsDB, classic data science AI versus applied AI, and time series transformers. ------------------- “So much data in the world is time series data, so much data. Even data that people don't know is time series, it's time series. So long as it’s moving over time, it is time series data. Whether you store it or not, that's a different thing. For having a pre-trained model on time series data, it even enabled the fact that you don't have to store all the historical data. You can just take the model and start passing data as it comes through, and then you get out the forecast. So you don't even have to have the historical data. All you need to have is the data at that given instance, and you can pass it to the model and you get an output. It's mind blowing.” – Jorge Torres ------------------- Episode Timestamps: (05:20): The inspiration behind MindsDB (10:20): Classic data science AI approach vs. applied AI (22:09): What open source data means to Jorge (28:51): What excites Jorge about Nixtla and time series transformers (37:07): A question Jorge wishes to be asked (40:20): Jorge’s advice for the audience (41:38): Backstage takeaways with executive producer, Audra Montenegro ------------------- Links: LinkedIn - Connect with Jorge Learn more about MindsDB open source code Learn more about MindsDB
	A Sam Ramji Feature: The Evolution of Open Source, Kubernetes, and AI's Forward Journey	06 Sep 2023	01:09:46
On this episode, we’ve partnered with the Future Rodeo podcast for a discussion between Sam and Matt Wallace. Matt is the Chief Technology Officer and EVP at Faction, a pioneer of multi-cloud data services, and host of Future Rodeo. In this episode, Sam and Matt discuss Microsoft’s transformation, the impact of Kubernetes on container orchestration, and the rapid acceleration of AI research and development. ------------------- Episode Timestamps: (01:38): Microsoft’s open source transformation (13:19): The impact of Kubernetes and how it defragmented the industry (22:06): The transformative power of AI and how it’s changing the value of reasoning (54:58): The concept of cognitive economy and its potential impact on AI and software development (01:03:25): Potential implications of advancements in robotics, AI, and clean energy (01:04:17): Sam’s advice for those entering the industry or choosing a career path ------------------- Links: LinkedIn - Connect with Matt Listen to the Future Rodeo podcast
	The Importance of Open Source Data for Generative AI, Now and in the Future with Abby Kearns	23 Aug 2023	00:46:14
This episode features an interview with Abby Kearns, technology executive, board director, and angel investor. Her career has spanned executive leadership, product marketing, product management, and consulting across Fortune 500 companies and startups, including Puppet, Cloud Foundry Foundation, and Verizon. Abby currently serves as a board director for Lightbend, Stackpath, and Invoke. In this episode, Sam sits down with Abby to discuss the betrayal source license, the role open source plays in AI, and empowering trust. ------------------- “There's so much happening so quickly that I think open source has the power to help harness a lot of that innovative conversation. In a way that I think it's going to be really, really hard to match in a proprietary way. I think open source and the ability, given the fact that we're talking about AI and data, the two are very interrelated at this point. AI is not super interesting without data. I think the power of open source right now and what's happening, I think it has to happen in open source and I think it really has to have that level of transparency and visibility. But, always the ability for everyone to step up and understand what's happening at this moment in time and shape it.” – Abby Kearns ------------------- Episode Timestamps: (00:50): Sam and Abby discuss the betrayal source license (14:12): What open source data means to Abby (23:30): Abby dives into the companies she’s investing in (34:30): How nonprofits can empower trust (38:32): A question Abby wishes to be asked (40:21): Abby’s advice for the audience (43:53): Backstage takeaways with executive producer, Audra Montenegro ------------------- Links: LinkedIn - Connect with Abby Twitter - Follow Abby Read Design the Life You Love
	The Value of Reproducibility and Ease of AI Deployment with Daniel Lenton	09 Aug 2023	00:33:58
This episode features an interview with Daniel Lenton, Founder and CEO of Ivy, where the team is on a mission to unify the fragmented AI stack. Prior to Ivy, Daniel was a Robotics Research Engineer at Dyson and a Deep Learning Research Scientist for Amazon Prime Air. During his PhD, Daniel explored the intersection between learning-based geometric representations, ego-centric perception, spatial memory, and visuomotor control for robotics. In this episode, Sam and Daniel discuss the inspiration behind Ivy, open source reproducibility, and democratizing AI. ------------------- "There's too much amazing stuff going on, from too many different parties. We just want to be the objective source of truth to show you the data and show you where your model will be doing best, and continue to do this as a service or something like this. This is high-level, some of the areas we see and going into, we really want to be a useful tool for anybody that wants to just kind of understand this fragmented complex space quickly and intuitively, and we are trying to be the tool that does that." – Daniel Lenton ------------------- Episode Timestamps: (01:00): What open source data means to Daniel (05:37): The challenges of building Ivy (15:37): The future of Ivy (25:19): Who should know about Ivy (28:46): Daniel’s advice for the audience (32:00): Backstage takeaways with executive producer, Audra Montenegro ------------------- Links: LinkedIn - Connect with Daniel Learn more about Ivy
	ML Engineering Teams and Niche Chat Bot Experiences with Demetrios Brinkmann	26 Jul 2023	00:50:17
This episode features an interview with Demetrios Brinkmann, Founder of the MLOps Community, an organization for people to share best practices around MLOps. Demetrios fell into the Machine Learning Operations world and has since interviewed leading names around MLOps, data science, and machine learning. In this episode, Sam sits down with Demetrios to discuss LLM in production use cases, ML engineering teams, and the LLM Survey Report from the MLOps Community. ------------------- "I think the most novel ones that I saw from the survey were when a chat bot would prompt a human as opposed to the human prompting the chat bot. It's almost like you have this LLM coach. And in that way, it's not necessarily like this isn't LLM in production that an end user is getting that's not outside the business or that is outside the business. It's more like internally, you can think about maybe it's an accountant and the accountant is filing my taxes for the year. As they're filing them, the LLM is prompting them on different tax laws that maybe they weren't thinking about or different ways that they could file things." – Demetrios Brinkmann ------------------- Episode Timestamps: (04:30): LLMs as the new standard (19:26): Key LLM in production use cases (31:18): What open source data means to Demetrios (34:36): What Demetrios is seeing in open source AI models (42:44): One question Demetrios wishes to be asked (44:41): Demetrios’s advice for the audience (47:19): Backstage takeaways with executive producer, Audra Montenegro ------------------- Links: LinkedIn - Connect with Demetrios Read the LLM Survey Report Listen to The MLOps Podcast
	Building With Trust, Inspiration, and Reputation with Jaya Gupta, Yuliia Tkachova, and Omoju Miller	12 Jul 2023	00:04:12
This bonus episode features conversations from season 5 of the Open\|\|Source\|\|Data podcast. In this episode, you’ll hear from Jaya Gupta, Partner at Foundation Capital; Yuliia Tkachova, Co-founder and CEO of Masthead Data; and Omoju Miller, Founder and CEO of Fimio. Sam sat down with each guest to discuss how they are building foundations for trust, inspiration, and reputation as we all race into the AI-centric future. You can listen to the full episodes from Jaya Gupta, Yuliia Tkachova, and Omoju Miller by clicking the links below. ------------------- Episode Timestamps: (00:49): Jaya Gupta (01:48): Yuliia Tkachova (03:03): Omoju Miller ------------------- Links: Listen to Jaya’s episode Listen to Yuliia’s episode Listen to Omoju’s episode
	FMOps and a Founders Automated Future with Jaya Gupta	28 Jun 2023	00:33:49
This episode features an interview with Jaya Gupta, Partner at Foundation Capital, where she leads early-stage investments across the enterprise software stack. Previously, Jaya was a Senior Business Analyst at McKinsey & Company focusing on software diligence and helping startups expand their go-to-market strategies. In this episode, Sam and Jaya discuss her journey to Foundation Model Ops, how software is becoming more accessible, and the democratization of AI tools. ------------------- "At the end of the day, FMOps isn't just about the new tools. It's actually more about the new builders, the new workflows, and a completely new market of customers. I was on the other day, looking at LangChain's page of integrations, I don't know if you've seen it, but it's like Anyscale, Databricks, all these other huge legendary companies are integrating with LangChain, and I think it's clear that there's a huge community that is building something real and valuable." – Jaya Gupta ------------------- Episode Timestamps: (01:05): What open source data means to Jaya (08:51): Jaya’s journey to Foundation Model Ops (15:58): How software is becoming more accessible (23:04): The democratization of AI tools (27:01): One question Jaya wishes to be asked (29:32): Jaya’s advice for the audience (31:51): Backstage takeaways with executive producer, Audra Montenegro ------------------- Links: LinkedIn - Connect with Jaya Follow Jaya on Twitter Learn more about FMOps
	Web3 and Putting Reputation on Code with ML with Omoju Miller	31 May 2023	01:02:01
This episode features an interview with Omoju Miller, Founder and CEO of Fimio, a web3 reputation company. Originally from Lagos, Nigeria, Omoju holds a doctoral degree in Computer Science Education from UC Berkeley. Her expertise in machine learning and computational intelligence led her to companies such as Google and GitHub. Omoju also served as a volunteer advisor to the Obama administration’s White House Presidential Innovation Fellows. In this episode, Sam sits down with Omoju to discuss how machine learning can make applications more secure, what the future of the internet looks like, and the fascinating story behind Fimio. ------------------- “So my first view is, in this future internet we have people, we also have bots, we have machines, we have code doing things. And bots sounds like such a horrible word now. [...] You need to have a level of trust on what that bot is. Everything from the humans to the machines collaborating in this decentralized world, we need to have some kind of reputation attached to each of those nodes. And the reason why we need that reputation is, as the thing scales, it becomes overwhelming to get value from it. You need something to help you filter, to find what you're looking for. Otherwise, you get stuck in that environment where you're just completely overwhelmed and you don't even know what to do. 
So I think of what I'm doing as just reputation to make this decentralized future slightly more attainable.” – Omoju Miller ------------------- Episode Timestamps: (00:59): Omoju’s inspiration for starting Fimio (10:27): The future of smart contracts (28:47): Using mathematics to guarantee the safety of algorithms (34:34): What led Omoju to building a mathematical product (51:27): What open source data means to Omoju (55:38): One question Omoju wishes to be asked (57:47): Omoju’s advice for the audience (01:00:08): Backstage takeaways with executive producer, Audra Montenegro ------------------- Links: LinkedIn - Connect with Omoju Visit Fimio
	The Human Right to Privacy and Caring About UX Design with Yuliia Tkachova	17 May 2023	00:46:34
This episode features an interview with Yuliia Tkachova, Co-founder and CEO of Masthead Data, an observability platform that catches anomalies in Google BigQuery in real-time. She holds degrees in Management Information Systems, Math, Statistics, and Marketing. Prior to Masthead, Yuliia designed complex BI products and solutions powered by ML and utilized by Fortune 500 companies. In this episode, Sam and Yuliia discuss how ML is shaping the future of data analytics, caring about users, and the fundamental human right to privacy. ------------------- “We map those errors and anomalies on lineage, helping to understand what upstreams and downstreams are affected, what business users are affected. And that actually speeds up all the troubleshooting from hours to minutes. And this is the ultimate goal where we deliver. Because again, my belief that if you don't have this lineage piece was mapped anomalous in errors, it's not observability. It's monitoring. [...] What is also very unique to us, because Masthead operates on logs, it's triggered by logs. So, we do support streaming data. Unlike SQL-first solutions, as you can guess. We don't have to run SQL queries to see if they're anomalous, we’re triggered by logs. And this is also what sets us apart.” – Yuliia Tkachova ------------------- Episode Timestamps: (01:14): What got Yuliia excited about math and statistics (11:31): The basic human right to privacy (18:21): What open source data means to Yuliia (28:00): Yuliia’s reason for building a solution focused on privacy and security (38:09): One question Yuliia wishes to be asked (42:21): Yuliia’s advice for the audience (44:46): Backstage takeaways with executive producer, Audra Montenegro ------------------- Links: LinkedIn - Connect with Yuliia Visit Masthead Data
	AI and the Future of Media Consumption \| Pete Pachal	06 May 2025	01:03:58
In this episode of Open Source Data, Charna Parkey interviews Pete Pachal, founder of The Media Copilot. With over two decades of experience covering technology, Pete shares his insights on how AI is transforming media, journalism and discusses how journalists can embrace AI as a tool to enhance their work to adapt and thrive in this new environment. QUOTES PETE PACHAL: AI is something that you control. I know, it feels like it's a wave that's coming over that it's unstoppable, inevitable. And that's true to a large extent. But at the same time, it's not, there's no there, right? There's no spark, there's no intent. (...) Never relinquish your role as the ultimate creator and person responsible for what's coming out of this thing. CHARNA PARKEY: I think that there was a point where I found myself shifting more away from media and towards individual curated newsletters because like subject matter experts in that area, I could be like maybe they're going to summarize it incorrectly, et cetera. But at least I know my theory of mind of that individual. And then when I expand that to media, I don't know who's writing what and who's shadow writing what for who. TIMESTAMPS 00:00:00 - Introduction of Pete Pachal and his background in journalism and AI. 00:02:00 - Pete’s career journey, including his work at CoinDesk and founding The Media Copilot. 00:04:00 - AI training for media professionals (journalists, PR, marketers). 00:06:00 - Evolution of AI in journalism: From skepticism to ethical frameworks. 00:08:00 - AI in content pipelines: Idea generation vs. post-production tasks. 00:10:00 - Open-source builders needing to cater to domain experts (e.g., journalists). 00:12:00 - Meta’s removal of fact-checking and its implications. 00:16:00 - Public tolerance for AI errors (e.g., Apple’s AI summaries). 00:18:00 - Consumer trust shifts away from platforms like Facebook/X. 00:22:00 - Ghostwriting vs. authenticity in AI-generated content. 
00:24:00 - Preference for human-curated newsletters over AI summaries. 00:26:00 - AI in news digests (e.g., Perplexity, Alexa). 00:28:00 - Publisher AI experiments (Washington Post chatbot, TIME summaries). 00:32:00 - AI’s impact on click-through rates and publisher economics. 00:34:00 - AI-written articles (e.g., ESPN’s use case) and copyright issues. 00:36:00 - Legal battles over AI training data (NYT vs. OpenAI). 00:38:00 - Copyright concerns with AI-generated outputs. 00:40:00 - AI search tools (Perplexity, ChatGPT) and publisher licensing deals. 00:46:00 - The unhealthy impact of social media trends on journalism. 00:48:00 - Post-interview discussion: Accountability in AI and media. 00:56:00 - Leo’s perspective as a journalist on AI adoption. 00:58:00 - Closing thoughts on balancing AI innovation with industry needs.
	Determinism in Complex Environments and Workflow Services with Maxim Fateev	03 May 2023	00:42:06
This episode features an interview with Maxim Fateev, Co-founder and CEO of Temporal, an open source, distributed, and scalable workflow orchestration engine capable of running millions of workflows. He has 20 years of experience architecting mission-critical systems at Uber, Google, Amazon, and Microsoft. In this episode, Sam sits down with Maxim to discuss workflow services, the power behind Temporal, and bringing determinism to highly complex environments. ------------------- “[Temporal] has this notion of workflows, which can run for a very long time and handle external events, you can treat them as a durable actor. And they're very good at implementing a lifecycle. For example, you can have an object per model and let this object handle all the events. Like, new data came in, notify this object, this object will go and retrain it. Or, it'll run an activity to superiorly check the status. So you can have end-to-end lifecycle implemented fully in Temporal.” – Maxim Fateev ------------------- Episode Timestamps: (01:03): What’s top of mind for Maxim in workflow services (04:09): What open source data means to Maxim (11:07): Maxim explains his time at AWS and building Cadence at Uber (23:09): Use cases and the community of Temporal (28:26): How Temporal is being used for ML workloads (32:28): One question Maxim wishes to be asked (36:38): Maxim’s advice for those working with complex distributed systems (39:11): Backstage takeaways with executive producer, Audra Montenegro ------------------- Links: LinkedIn - Connect with Maxim Temporal.io Watch Maxim’s talk “Designing a Workflow Engine from First Principles” Replay Conference 2023
	The AI-Native Stack in Practice with Charna Parkey and Sam Bean	15 Mar 2023	01:06:25
This episode features a panel discussion with Charna Parkey, a Real-Time AI Product and Strategy leader at DataStax; and Sam Bean, Staff Engineer at You.com. Charna is a co-author and inventor on several patents, including patent-pending work on ML/coordinated feature engine at the edge. Sam helped create the Spark connector to Weaviate, and is passionate about Big Data, Spark, NLP, Hugging Face, and large language models. In this episode, Charna and Sam discuss adapting to user expectations, what’s missing in the AI stack, and how to become an advanced citizen in open source. ------------------- "We've seen these companies start to better understand that these streaming technologies have a place, whether it's Kafka or Flink or Pulsar, but it's still incredibly difficult to use and we need a different level of abstraction. [...] We're starting to see the stack change so that it becomes more interchangeable of the components and try to sort of raise that layer of abstraction so that we can get these types of models and these types of capabilities to more people." – Charna Parkey "I think that a lot of what you need to adjust to are these, what you were discussing as I call interaction data, you were calling it event data. But these interactions that people have with the internet and trying to find ways to model that in a way that even if your models aren't real-time, having ways to featurize real-time data in a way that's interpretable by a model. [...] I think Spark and Kafka and Delta and all of those things, give you a lot more flexibility now to move in different directions and readjust and I think, pivot what you want to do with the system." 
– Sam Bean ------------------- Episode Timestamps: (01:29): Sam explains his background (03:36): Charna explains her background (18:13): Sam explains the problems You.com is solving for (28:21): Changes in user expectations in the AI-native stack (39:09): Advice for becoming an advanced citizen in open source (47:25): What’s missing in the AI stack (54:51): What open source data means to the panelists (58:22): How technologists should prepare for the future (01:03:10): Executive producer, Audra Montenegro's backstage takeaways ------------------- Links: LinkedIn - Connect with Charna Visit DataStax LinkedIn - Connect with Sam Visit You.com
	The AI-Native Stack with Mikiko Bazeley, Zain Hasan, and Tuana Celik	01 Mar 2023	00:56:48
This episode features a panel discussion with Mikiko Bazeley, Head of MLOps at Featureform; Zain Hasan, Senior Developer Advocate at Weaviate; and Tuana Celik, Developer Advocate at deepset. In this episode, Mikiko, Zain, and Tuana discuss what open source data means to them, how their companies fit into the AI-first ecosystem, and how jobs will need to evolve with the AI-native stack. ------------------- “We're almost part of a fancy new AI robot kitchen that you'd find in Tokyo, in some ways. I see a virtual feature store as, yes, you can have a bunch of your ingredients tossed into a closet. Or, what you can do is you can essentially have a nice way to organize them. You can have a way to label them, to capture information.” – Mikiko Bazeley “I really like that analogy as well. I like how Mikiko put it where a vector search engine is really extracting value from what you've already got. [...] So where I see vector search engines, really, is if we think of these embedding providers as the translators to take all of our unstructured data and bring it into vector space into a common machine language, vector search engines are essentially the workhorses that allow us to compute and search over these objects in vectorized format. They're essentially the calculators of the AI stack.” – Zain Hasan “Haystack, I would really position as the kitchen. I need Mikiko to bring the apples. I need Zain to bring the pears. I need Hugging Face or OpenAI to bring the oranges to make a good fruit salad. 
But, Haystack will provide the spoons and the pans and the knives to make that into something that works together.” – Tuana Celik ------------------- Episode Timestamps: (02:08): What open source data means to the panelists (08:22): What interested the panelists about AI/ML (23:20): Mikiko explains Featureform (26:11): Zain explains Weaviate (29:34): Tuana explains deepset (35:11): The panelists discuss how their companies fit into the AI-first ecosystem (44:12): How jobs need to evolve with the AI-native stack (53:45): Executive producer, Audra Montenegro's backstage takeaways ------------------- Links: LinkedIn - Connect with Mikiko Visit Featureform LinkedIn - Connect with Zain Visit Weaviate LinkedIn - Connect with Tuana Visit deepset Visit Data-centric AI
	Special Episode: Data on Kubernetes and Cassandra Forward with Patrick McFadin	22 Feb 2023	00:18:44
This special episode of Open||Source||Data features an interview with Patrick McFadin. Patrick has been a distributed systems hacker since he first plugged a modem into his Atari computer. Looking for adventure, he joined the US Navy, working on the Naval Tactical Data System (NTDS), which cemented his love of distributed systems. He is now an Apache Cassandra Committer, and is the Vice President of Developer Relations at DataStax. Sam catches up with Patrick at Data Day Texas to discuss his book Managing Cloud Native Data on Kubernetes, Cassandra Forward, and the future of Apache Cassandra. ------------------- “I can now use my Parquet file in Iceberg or DuckDB, and this is data that I created with Cassandra. And we're not getting to the point where we have to reinvent an entire database. We can just connect the Lego parts together and if they're open, then I don't have these encumbrances. I'm not like, ‘Well, I can connect that if I call a salesperson and get a license.’ [...] That's what's exciting to me about Cassandra, the way that the ecosystem is evolving around Cassandra. It's not, ‘Cassandra's at the center, it's just a player.’ It's at the party." – Patrick McFadin ------------------- Episode Timestamps: (01:06): What open source data means to Patrick (02:11): Patrick discusses his book Managing Cloud Native Data on Kubernetes (10:02): Patrick discusses Cassandra Forward (11:09): The future of Apache Cassandra ------------------- Links: LinkedIn - Connect with Patrick Cassandra Forward
	Making Graph Data Easier with Open Initiatives with Denise Gosnell	15 Feb 2023	00:40:10
This episode features an interview with Denise Gosnell, Principal Product Manager at Amazon Web Services. At AWS, Denise leads product and strategy for Amazon Neptune, a fully managed graph database service. Her career centers on her passion for examining, applying, and advocating for the applications of graph data. Denise has also authored, patented, and spoken on graph theory, algorithms, databases, and applications across all industry verticals. In this episode, Sam sits down with Denise to discuss graph initiatives, the future of developer models, and what Denise learned from hiking the Appalachian Trail. ------------------- “We just open sourced something called graph-explorer, which is something for the community by the community, Apache 2.0 license. graph-explorer is a low-code visualization tool. But, the best part about it is that it works for JanusGraph, it works for Blazegraph, it works for all of these graph models that we've talked about, because we've got this divided graph community, but it was written to work with all graphs. [...] Today it's all, ‘Here's your Lego blocks and build one on your own. 
If you want to go ahead and fork Jupyter Notebook and figure out a way to get that D3 force-directed graph way out to pop up, have fun.’ It's the first time that we've had a unified way across graph vendors and graph implementations to have a way to visualize your graph data in one tool that's open source.” – Denise Gosnell ------------------- Episode Timestamps: (01:17): What open source data means to Denise (04:27): How Denise got interested in computer science (08:39): Denise’s work on graph initiatives (14:30): How Denise’s work at LDBC relates to SQL standards (23:43): The future of developer models (29:43): One question Denise wishes to be asked (34:05): Denise’s advice for graph practitioners (37:37): Executive producer, Audra Montenegro's backstage takeaways ------------------- Links: LinkedIn - Connect with Denise The Practitioner’s Guide to Graph Data
	Advising Big Data and The Future of AI/ML with Ben Lorica	01 Feb 2023	00:48:13
This episode features an interview with Ben Lorica, Co-founder and Principal of Gradient Flow, a company that provides a wide range of content on data and technology. Ben is an industry expert on data, machine learning, and AI. He is a Technical Advisor for Databricks, a program chair for several data conferences, and he hosts The Data Exchange Podcast. In this episode, Sam and Ben discuss Big Data and the improvements and future opportunities of AI and machine learning. ------------------- “The reason I use the word decentralize is because when you try to explain it to someone, let's say you want to train a different model for each user, or region, or sensor, or device. So you can't use necessarily just personalized because recommenders can be personalized, but they're still centralized models.” – Ben Lorica ------------------- Episode Timestamps: (01:17): What open source data means to Ben (05:54): What intrigued Ben about Big Data (12:07): What brought Ben to working on Ray (16:15): Ben’s opinion on how far AI and ML have come in the last 5 years (26:38): What Ben sees happening in this space in the next 5 years (39:06): What challenges Ben sees in the next 5 years (43:51): One question Ben’s always wanted to be asked (44:55): Ben’s advice for those starting their open source data adventure (46:34): Executive producer, Audra Montenegro's backstage takeaways ------------------- Links: LinkedIn - Connect with Ben Gradient Flow’s Newsletter Gradient Flow’s 2023 Trends Report Visit Sky Labs
	Functional Programming and an Ideal Data Stack Building Experience with Holden Karau	18 Jan 2023	00:45:11
This episode features an interview with Holden Karau, an Open Source Engineer at Netflix. Holden is best known for her work on Apache Spark, her advocacy in the open source software movement, and her creation of a variety of related projects including spark-testing-base. Previously, Holden worked at Big Tech companies like Apple, IBM, and Google as a software engineer and developer advocate. In this episode, Sam sits down with Holden to discuss the data analysis stack, functional programming, and the future of open source software data tooling. ------------------- “These things are not one off. We may think that they're one off and they don't need testing, but that's not the reality. When you write something, it needs to be maintainable and as software people, the only real way that I think we know to make something vaguely maintainable is to at least have tests. And these tests need to cover common failure cases that we've experienced. And certainly, there's different approaches to this. There's property based testing, there's golden sets, all kinds of different options. I don't think necessarily any one approach is right or better here, but I think we need something. We need less untitled 5.IPython Notebook running in production, scheduled every hour. 
That is not a way to run a company.” – Holden Karau ------------------- Episode Timestamps: (02:27): What open source data means to Holden (04:37): What interested Holden in mathematical computer science (09:51): What drew Holden to Spark (12:49): What Holden has learned about cognitive systems (20:02): What we need to learn as developers and data specialists (25:28): The future of the data analysis stack (31:21): Improvements in data tooling over the next 5 years (34:25): A question Holden wishes to be asked (40:51): Holden’s advice for open source data project committers (43:18): Executive producer, Audra Montenegro's backstage takeaways ------------------- Links: LinkedIn - Connect with Holden Buy Holden’s books Visit Holden’s website
	Workflow Engines and Building a Domain Specific Language for Data Quality with Tom Baeyens	04 Jan 2023	00:33:47
This episode features an interview with Tom Baeyens, Co-founder and CTO of Soda, where he oversees the company's product development, software architecture, and technology strategy. He is passionate about open source and committed to building a community where data engineers can succeed using the Soda Data Monitoring Platform. Tom is the inventor of the widely used open source projects jBPM and Activiti. He also co-founded Effektif, a cloud process automation company. In this episode, Sam and Tom discuss the evolution of open source workflow engines, data contracts, and why data quality needs a language approach. ------------------- “Where we're heading is what I think is exactly the same as with software engineering in the testing. Test-driven development was a radical new thing back then. But then it turns out, you can much more reliably release software. And this is exactly the same here. If you don't inject data testing, data observability throughout your data stack, then how are you going to trust the data that you put into your machine learning model? This is something that people are realizing, but we're still figuring out the best practices, the dos, the don'ts. We've come a long way, but there's still a way to go before this is as common and as normal as in the test-driven development software engineering space.” – Tom Baeyens ------------------- Episode Timestamps: (01:23): What open source data means to Tom (04:34): Tom’s motivations for creating jBPM (09:39): What led Tom to building Soda (13:57): Why data quality needs a language approach (19:24): The community of Soda (22:47): The future of Soda as a technology (24:59): A question Tom wishes to be asked (30:24): Tom’s advice for engineers who want to leverage data observability tools ------------------- Links: LinkedIn - Connect with Tom Twitter - Follow Tom Visit SodaCL
	Enabling Edge Workers, AI & ML, and The Future of Data Science with Matthew Rocklin	14 Dec 2022	00:44:14
This episode features an interview with Matthew Rocklin, CEO of Coiled, the scalable Dask-based cloud platform. Prior to founding Coiled, Matthew worked on Dask at Anaconda and then NVIDIA where his teams focused on accelerating Dask through parallel computing and GPUs. Matthew is an industry speaker, author, and founding member of Pangeo, whose mission is to develop open source analysis tools for ocean, atmosphere, and climate science. In this episode, Sam sits down with Matthew to discuss enabling edge workers, the future of data science, and the revolution of AI and ML. ------------------- “There's all sorts of fun people using these tools and that's the most fun part of this job. You get to learn so much about so many different applications that are all so different and all so fascinating. You were thinking about all these different tools and technologies and I was talking to someone once, it's like, ‘Oh, it's like you're standing on the shoulders of giants.’ That's not quite right. There's lots of sort of normal size people all standing on each other's shoulders in like a massive pyramid. [...] Dask was designed to scale up an existing ecosystem. There's a legacy Python ecosystem that’ll provide a layer of parallel computing on top of it. You can do that either by rewriting the whole thing, which is not feasible, or you can do it by talking to lots of people and getting them to integrate in interesting, fun ways. That's actually been the fun parts of Dask. I think I've probably talked to every major maintainer group ever. I have worked with them to find out the ways to get everything to work smoothly together. And that's super fun. There's an interesting sort of technical and social hacking that occurs, which I think Python has done pretty well at, historically. 
Which is why it has success.” – Matthew Rocklin ------------------- Episode Timestamps: (00:58): What open source data means to Matthew (03:29): Matthew’s motivations behind Python (18:58): How Matthew is enabling edge workers (34:46): What the future of data Python space looks like (39:29): Matthew’s advice for the technical data audience (41:36): Executive producer, Audra Montenegro's backstage takeaways ------------------- Links: LinkedIn - Connect with Matthew Twitter - Follow Matthew Visit Matthew’s Website Visit Dask Dask Examples Visit Coiled SciPy Mission
	OSPOs, Measuring Community Success, and Self Knowledge with Nithya Ruff	07 Dec 2022	00:34:44
This episode features an interview with Nithya Ruff, Head of Open Source Program Office at Amazon. At Amazon, she drives open source culture and coordination and engagement with external communities. Prior to Amazon, Nithya spearheaded and grew Open Source Program Offices (OSPOs) for Comcast and Western Digital. She has also served as the Director-At-Large on the Linux Foundation Board since 2016, where she works to advance the mission of building sustainable ecosystems that are built on open collaboration. In this episode, Sam and Nithya discuss OSPOs, how to measure success, and the evolution of the data ecosystem. ------------------- “I think if we look at what matters to customers, which is innovation, trust, and being a force for change with open source, then we can really deliver on the metrics that the company cares about.” – Nithya Ruff ------------------- Episode Timestamps: (04:02): What open source data means to Nithya (06:29): What interested Nithya about open source software (12:34): What Nithya learned at Western Digital and Comcast that she uses now at Amazon (18:23): What Nithya teaches people in OSPO curriculum (22:06): How the open source data ecosystem has evolved in the last decade (27:44): One question Nithya wishes to be asked (30:37): Nithya’s advice for folks who want to create an OSPO ------------------- Links: LinkedIn - Connect with Nithya Twitter - Follow Nithya Open Source Law, Policy and Practice LinkedIn - Connect with Amazon Twitter - Follow Amazon Visit Amazon
	Your AI Roadmap: Building a Career, Revenue and a Future in AI \| Dr. Joan Bajorek	22 Apr 2025	00:55:36
In this episode, Dr. Joan Bajorek—AI entrepreneur, author of Your AI Roadmap, and founder of Clarity AI—joins Charna Parkey to talk about what it really takes to build a future in AI. From career pivots and layoff anxiety to financial transparency and finding joy in your work, Joan shares practical advice and personal stories navigating fear, burnout, and career uncertainty in tech, while staying grounded in purpose, community, and long-term resilience. TIMESTAMPS [00:00:00] — Introduction to Joan Bajorek & Her Work [00:02:00] — Transparency About Finances and Career [00:04:00] — The Taboo Around Talking About Money [00:06:00] — Resilience During Tech Layoffs [00:08:00] — How to Get Credit for Your Work [00:12:00] — Should You Chase an AI Job? [00:14:00] — Career Goals vs. Financial Security [00:16:00] — Translating Academic and Life Skills into Tech [00:18:00] — Defining and Finding Joy in Work [00:20:00] — Multiple Income Streams and Personal Freedom [00:24:00] — AI’s Near-Future Impact on Jobs and Industries [00:26:00] — Data and AI Opportunities in Underexplored Domains [00:34:00] — Creating Scalable, Alternative Income Models [00:36:00] — How Joan Maintains Long-Term Motivation [00:42:00] — Post-Interview Discussion QUOTES Joan Bajorek "Networking is how I've gotten the best opportunities and jobs of my life... LinkedIn has this research about how after COVID layoffs, 70% of people landed their next job based on an intro." Charna Parkey "I always try to strive for transparency, and I get such mixed results where at work with coworkers, it's absolutely valued. And then there seems to always be some sort of consequences in my personal life."
	IoT Databases, Digital Twins, and Real Holodecks with Jonathan Beri	23 Nov 2022	00:36:45
This episode features an interview with Jonathan Beri, Founder & CEO of Golioth, a commercial IoT development platform built for scale. Previously, Jonathan was a Product Manager at Particle, Google/Nest, Magneto, and Myspace where he spent his time building IoT solutions. In this episode, Sam sits down with Jonathan to discuss the concept of digital twins, the future of IoT databases, and how to build a real holodeck. ------------------- “I think about IoT when I started at Nest, we had some of the best engineers I've ever worked with. Starting from first principles, defining networking protocols, and introducing new specifications that became parts of the fabric of the internet. And fast forward 10 years later, a lot of that exists now as building blocks. Someone who's not a PhD with a lifetime achievement award from the IETF can go actually design systems that are highly productive, integrated, and enabling. And that's where I get excited. And the through line I think is enabling teams of developers to really create more with their own bare hands. And the technology around it, that is that enabler.” – Jonathan Beri ------------------- Episode Timestamps: (01:33): Jonathan’s motivation for starting Golioth (08:59): The role of data in IoT (11:01): What is a digital twin and why does it matter? (17:12): The classes of problems Jonathan is trying to solve (20:35): The future of IoT databases in the next five years (31:04): What open source data means to Jonathan (32:24): Jonathan explains how to build a real holodeck (33:42): Jonathan’s advice for those excited about industrial data ------------------- Links: LinkedIn - Connect with Jonathan Twitter - Follow Jonathan Visit Jonathan’s Website LinkedIn - Connect with Golioth Twitter - Follow Golioth Visit Golioth
	Healthcare Infrastructure, ALS Research and Reliable Data with Indu Navar	09 Nov 2022	00:46:00
This episode features an interview with Indu Navar, CEO and Founder of EverythingALS, a patient-driven non-profit, bringing technological innovations and data science to support efforts from care to cure, for people with ALS. Indu’s impressive career includes being an original member of the WebMD engineering team, where she was instrumental in using emerging technologies to achieve application scalability and performance. In this episode, Sam sits down with Indu to discuss healthcare infrastructure applications, her strategies for providing reliable patient data, and the future of ALS research. ------------------- “We said, ‘Okay, we're going to make this a citizen-driven research.’ That means patients are going to come and enroll because it's their project and it's patient-driven. So, it's a patient-driven, open innovation. So, once you do open patient-driven, open innovation, now we are the custodians of the data. Patients own the data, so all the data is shared with the patient. That was not done before in any of the research. And so, we give all the data back to the patients. And of course, we give them metrics as well. What was the rate of their speed of their speech? And if they don't want to see it, it's fine, at least they have it. And that data, we are the custodians and as custodians we share the data. So, once we did this model, we got almost close to one thousand people enrolled, consented, within 16 months. 
As opposed to about 25 people in one year or 50 people in one to two years.” – Indu Navar ------------------- Episode Timestamps: (01:19): What’s changed for Indu in the last year (05:46): What data infrastructure was like 25 years ago to solve for health outcomes (13:00): Indu’s personal experience with healthcare data (16:47): What Indu is looking forward to in ALS research (20:43): How regulatory establishments have shifted in healthcare (30:31): Where Indu wants to see EverythingALS go in the next year (36:28): One question Indu wishes to be asked (38:28): Indu’s advice for people inspired by EverythingALS ------------------- Links: LinkedIn - Connect with Indu Twitter - Follow Indu Twitter - Follow EverythingALS Visit EverythingALS
	Shifting Left on Data with DeVaris Brown, Tomer Shiran, and Erica Brescia	02 Nov 2022	00:03:10
This bonus episode features conversations from season 3 of the Open||Source||Data podcast. In this episode, you’ll hear from DeVaris Brown, CEO & Co-founder of Meroxa; Tomer Shiran, Founder & CPO of Dremio; and Erica Brescia, Managing Director at Redpoint Ventures. Sam sat down with each guest to discuss how they’re making data more programmable by shifting left. You can listen to the full episodes from DeVaris Brown, Tomer Shiran, and Erica Brescia by clicking the links below. ------------------- Episode Timestamps: (00:12): DeVaris Brown (00:42): Tomer Shiran (01:32): Erica Brescia ------------------- Links: Listen to DeVaris’ episode Listen to Tomer’s episode Listen to Erica’s episode
	Serial Entrepreneurship, Metadata Capture Systems, and Osquery with Tony Gauda	26 Oct 2022	00:33:33
This episode features an interview with Tony Gauda, Head of Customer Engineering at Fleet Device Management, an open core company powered by Osquery. Tony is a serial entrepreneur and inventor with a profound history in fraud, security, and SaaS business. He holds several issued patents and his companies have raised over $40 million in venture funding. Tony is also the founder of ThinAir, a Y-Combinator backed SaaS service that tackles the insider threat problem for enterprises and government agencies. In this episode, Sam and Tony discuss calculating data usage at scale, the creativity of attackers, and how to evolve as threats increase. ------------------- “The great thing about Osquery is that since it is a sensor-based system that is queryable, it literally gives you the ability to discover new indicators of compromise and then use those when doing security investigations. And Osquery allows you to create these extremely interesting queries that would find things that you would never be able to find with a traditionally static functionality agent. And, that to me, is extremely exciting. The fact that you have this agent that is extendable and it's configurable and it's deployable across multiple different platforms, at the end of the day, it feels like it's almost a superpower for visibility.” – Tony Gauda ------------------- Episode Timestamps: (01:17): What Tony is curious about these days (04:39): What problems Tony is trying to solve (05:47): How Tony got into the tech world (11:09): Tony’s inspiration behind ThinAir (15:25): What open source data means to Tony (17:06): What led Tony to being an early adopter of Osquery (20:31): What’s ahead for building next level applications with open and secure data (25:37): One question Tony’s always wanted to be asked (29:24): Tony’s advice for inventors ------------------- Links: LinkedIn - Connect with Tony Twitter - Follow Tony Twitter - Follow Fleetdm Fleetdm Fleetdm GitHub Platform
