AI Evals and Analytics Podcast – Details, Episodes, and Analysis

Podcast details

Technical and general information from the podcast's RSS feed.

AI Evals and Analytics Podcast

Stella and Amy

News
Education
Business & Entrepreneurship

Frequency: 1 episode every 13 days. Total episodes: 2

Firstory

Build trustworthy AI products through evaluation-driven development.

Each episode covers practical evaluation strategies, industry trends, and best practices for building safe, reliable AI systems. From dataset generation and evals metrics design to cross-functional collaboration and post-launch analytics, we talk about how to build trustworthy and lasting AI products with a good AI evals and analytics framework.

Subscribe for practical techniques, industry insights, and guest interviews on AI evaluation and analytics.

More about AI Evals and Analytics -- https://ai-evals.org/

We (Stella & Amy) created the AI Evaluation & Analytics Playbook, a practical framework that helps teams ship production-ready, trustworthy AI systems.



Powered by Firstory Hosting

Recent rankings

Latest positions in the Apple Podcasts and Spotify charts.

Apple Podcasts

  • 🇺🇸 United States - techNews: 14/02/2026, #85
  • 🇺🇸 United States - techNews: 13/02/2026, #74
  • 🇺🇸 United States - techNews: 12/02/2026, #70
  • 🇺🇸 United States - techNews: 11/02/2026, #58
  • 🇺🇸 United States - techNews: 10/02/2026, #84
  • 🇺🇸 United States - techNews: 09/02/2026, #74
  • 🇺🇸 United States - techNews: 08/02/2026, #92
  • 🇺🇸 United States - techNews: 07/02/2026, #99
  • 🇺🇸 United States - techNews: 06/02/2026, #72
  • 🇺🇸 United States - techNews: 05/02/2026, #52

Spotify

    No recent rankings available



RSS feed quality and score

Technical assessment of the quality and structure of the RSS feed.

RSS feed quality
Needs improvement

Overall score: 38%


Publication history

Monthly distribution of episode publications over the years.

Latest published episodes

List of recent episodes, with titles, durations, and descriptions.

Build AI Evals from Scratch: When and How?

Saturday, 7 February 2026 · Duration 17:48

What is evaluation-driven development? When should you start building evals for your product? How do you build them from scratch?

Using a real-world example of a customer chatbot for a medical insurance company, we walk through the process of setting up evals from scratch: translating product requirements into quantifiable metrics, curating quality test datasets (hint: you need fewer examples than you think), and making go/no-go decisions based on eval scores.
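As a rough illustration of a go/no-go decision driven by eval scores, the sketch below compares each metric with a minimum threshold and blocks the release if any metric falls short. The metric names, scores, and thresholds are hypothetical and are not taken from the episode.

```python
from dataclasses import dataclass


@dataclass
class MetricResult:
    name: str         # hypothetical metric name
    score: float      # observed eval score in [0, 1]
    threshold: float  # hypothetical minimum score required to ship


def release_decision(results: list[MetricResult]) -> tuple[str, list[str]]:
    """Return ("go" or "no-go", names of metrics below their threshold)."""
    failing = [r.name for r in results if r.score < r.threshold]
    return ("go" if not failing else "no-go", failing)


# Illustrative numbers only: accuracy clears its bar, the PII check does not.
results = [
    MetricResult("answer_accuracy", score=0.94, threshold=0.90),
    MetricResult("pii_leakage_free", score=0.99, threshold=1.00),
]
decision, failing = release_decision(results)
print(decision, failing)  # no-go ['pii_leakage_free']
```

Setting a threshold of 1.00 is one simple way to express a requirement that tolerates no failures, while other metrics keep some slack.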

You'll learn why accuracy and safety require different approaches, how to avoid the trap of AI-generated test data, and why 94% vs 95% accuracy matters less than you'd expect—but safety guardrails are non-negotiable. This is the practical blueprint for anyone building AI products who wants to catch problems before users do.
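One way to see why a one-point accuracy gap can matter little is to look at the sampling uncertainty on a small test set. The sketch below computes Wilson score intervals for 94% and 95% accuracy, assuming a hypothetical test set of 200 examples; the test-set size is an assumption for illustration, not a figure from the episode.

```python
import math


def wilson_interval(correct: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z=1.96 gives roughly 95% coverage)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = correct / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half, center + half


# Hypothetical test set of 200 examples: 188 correct (94%) vs 190 correct (95%).
low_a, high_a = wilson_interval(188, 200)
low_b, high_b = wilson_interval(190, 200)
overlap = low_b <= high_a and low_a <= high_b
print(f"94%: [{low_a:.3f}, {high_a:.3f}]")
print(f"95%: [{low_b:.3f}, {high_b:.3f}]")
print("intervals overlap:", overlap)  # intervals overlap: True
```

Overlapping intervals are not a formal significance test, but they suggest that at this test-set size the two scores are hard to tell apart from sampling noise.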

00:00 – Introduction: Why We Need to Talk About Evals Now
00:39 – When to Start AI Evals?
03:20 – Example Setup: Medical Insurance Customer Chatbot
04:30 – Defining Evals in Product Requirements
07:19 – What Is Evaluation-Driven Development?
08:27 – Breaking Down "Accuracy": What Does It Really Mean?
09:42 – Dataset Curation: Quality Over Quantity
11:24 – How Big Should Your Test Set Be?
12:25 – Safety Guardrails: Knowledge Boundary and PII Leakage
15:29 – Making Release Decisions with Eval Metrics
17:33 – Start with What's Critical to Your Use Case

Stella Liu: https://www.linkedin.com/in/wenxingl/
Amy Chen: https://www.linkedin.com/in/amy17519/

AI Evals Skills: Why Data Scientists Have a Natural Advantage

Monday, 26 January 2026 · Duration 22:10

What are the skills required for AI evals? Why do data scientists have a natural advantage in AI evals?

Evaluating AI isn’t just about "vibe coding" with an AI assistant. It requires a solid foundation in statistics, for picking sample sizes, and in coding, for building your own testing frameworks. Data scientists have a huge head start here because they are already pros at designing metrics and communicating risks.
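As a small example of the statistics involved in picking a sample size, the sketch below uses the standard normal-approximation formula n = z² · p(1 − p) / e² to estimate how many test examples are needed to measure a pass rate within a given margin of error. The function name and the example margins are hypothetical; the episode does not prescribe a specific formula.

```python
import math


def sample_size_for_proportion(margin: float, expected_rate: float = 0.5, z: float = 1.96) -> int:
    """Examples needed to estimate a pass rate within +/- margin (normal approximation)."""
    if not 0 < margin < 1:
        raise ValueError("margin must be between 0 and 1")
    if not 0 <= expected_rate <= 1:
        raise ValueError("expected_rate must be between 0 and 1")
    return math.ceil(z**2 * expected_rate * (1 - expected_rate) / margin**2)


# Worst case (expected_rate=0.5) at a +/-5 point margin, roughly 95% confidence.
print(sample_size_for_proportion(0.05))                     # 385
# If the pass rate is expected to be near 90%, fewer examples are needed.
print(sample_size_for_proportion(0.05, expected_rate=0.9))  # 139
```

The approximation is less reliable for very small samples or for rates close to 0 or 1, so treat the result as a starting point, not a guarantee.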

In this inaugural episode, we also explain why Evals (pre-launch testing) and Analytics (post-launch user feedback) are two sides of the same coin: one makes sure the AI works, and the other makes sure people actually love using it.

00:00 – Introduction to AI Evals & Analytics 
01:31 – Why Data Scientists Have a Natural Advantage
01:59 – Technical Pillar: Statistics 
02:48 – Technical Pillar: Coding & Prompt Engineering 
05:03 – Technical Pillar: Dataset Generation 
08:35 – Soft Skills & Stakeholder Collaboration 
11:17 – Domain Expertise in Regulated Industries 
15:50 – New Skills for the GenAI Era 
19:25 – Why Evals and Analytics Must Come Together 

Stella Liu: https://www.linkedin.com/in/wenxingl/
Amy Chen: https://www.linkedin.com/in/amy17519/


Similar Podcasts Based on Content

Discover podcasts related to AI Evals and Analytics Podcast. Explore podcasts with similar themes, topics, and formats. These similarities are computed from concrete data, not extrapolation.
冏冏電台 文字與資本主義
Peggy Fo Show
步非烟asmr
小熱NOW
博音
BrosBond
霸軒廣播電台
茫DAY不錄
失婚婦女Chill High High
度菇13咦的茶水間
© My Podcast Data