TalkRL: The Reinforcement Learning Podcast – Details, Episodes, and Analysis

Podcast Details

Technical and general information from the podcast's RSS feed.

TalkRL: The Reinforcement Learning Podcast

Host: Robin Ranjit Singh Chauhan
Category: Technology
Frequency: 1 episode every 31 days. Total episodes: 66
Hosting platform: Transistor

TalkRL podcast is All Reinforcement Learning, All the Time. In-depth interviews with brilliant people at the forefront of RL research and practice. Guests from places like MILA, OpenAI, MIT, DeepMind, Berkeley, Amii, Oxford, Google Research, Brown, Waymo, Caltech, and Vector Institute. Hosted by Robin Ranjit Singh Chauhan.

Links: Website · RSS · Apple

Recent Rankings

Latest positions in the Apple Podcasts and Spotify charts.

Apple Podcasts

  • 🇫🇷 France - Technology: #99 (22/06/2025)
  • 🇨🇦 Canada - Technology: #88 (26/12/2024)

Spotify

  No recent rankings available



RSS Feed Quality and Score

Technical evaluation of the quality and structure of the RSS feed.

RSS feed quality: Needs improvement

Overall score: 62%


Publication History

Monthly breakdown of episode publications over the years.


Latest Published Episodes

List of recent episodes, with titles, durations, and descriptions.


NeurIPS 2024 RL meetup Hot takes: What sucks about RL?

Episode 61

Monday, December 23, 2024 · Duration 17:45

What do RL researchers complain about after hours at the bar?  In this "Hot takes" episode, we find out!  

Recorded at The Pearl in downtown Vancouver, during the RL meetup after a day of NeurIPS 2024.

Special thanks to "David Beckham" for the inspiration :)  

RLC 2024 - Posters and Hallways 5

Episode 60

Friday, September 20, 2024 · Duration 13:17

Posters and Hallway episodes are short interviews and poster summaries. Recorded at RLC 2024 in Amherst, MA.

Featuring:  

  • 0:01 David Radke of the Chicago Blackhawks (NHL) on RL for professional sports
  • 0:56 Abhishek Naik from the National Research Council on Continuing RL and Average Reward
  • 2:42 Daphne Cornelisse from NYU on Autonomous Driving and Multi-Agent RL
  • 8:58 Shray Bansal from Georgia Tech on Cognitive Bias for Human-AI Ad Hoc Teamwork
  • 10:21 Claas Voelcker from the University of Toronto on "Can we hop in general?"
  • 11:23 Brent Venable from the Institute for Human & Machine Cognition on Cooperative Information Dissemination


Arash Ahmadian on Rethinking RLHF

Episode 51

Monday, March 25, 2024 · Duration 33:30

Arash Ahmadian is a researcher at Cohere and Cohere For AI, focused on preference training of large language models. He is also a researcher at the Vector Institute.

Featured Reference

Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs

Arash Ahmadian, Chris Cremer, Matthias Gallé, Marzieh Fadaee, Julia Kreutzer, Olivier Pietquin, Ahmet Üstün, Sara Hooker



Glen Berseth on RL Conference

Episode 50

Monday, March 11, 2024 · Duration 21:38

Glen Berseth is an assistant professor at the Université de Montréal, a core academic member of Mila - Quebec AI Institute, a Canada CIFAR AI Chair, a member of the Institut Courtois, and co-director of the Robotics and Embodied AI Lab (REAL).

Featured Links 

Reinforcement Learning Conference 

Closing the Gap between TD Learning and Supervised Learning – A Generalisation Point of View
Raj Ghugare, Matthieu Geist, Glen Berseth, Benjamin Eysenbach

Ian Osband

Episode 49

Thursday, March 7, 2024 · Duration 01:08:26

Ian Osband is a research scientist at OpenAI (previously DeepMind and Stanford), working on decision making under uncertainty.

We spoke about: 

- Information theory and RL 

- Exploration, epistemic uncertainty and joint predictions 

- Epistemic Neural Networks and scaling to LLMs 


Featured References 

Reinforcement Learning, Bit by Bit 
Xiuyuan Lu, Benjamin Van Roy, Vikranth Dwaracherla, Morteza Ibrahimi, Ian Osband, Zheng Wen 

From Predictions to Decisions: The Importance of Joint Predictive Distributions 

Zheng Wen, Ian Osband, Chao Qin, Xiuyuan Lu, Morteza Ibrahimi, Vikranth Dwaracherla, Mohammad Asghari, Benjamin Van Roy  

 

Epistemic Neural Networks 

Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Morteza Ibrahimi, Xiuyuan Lu, Benjamin Van Roy  


Approximate Thompson Sampling via Epistemic Neural Networks 

Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Morteza Ibrahimi, Xiuyuan Lu, Benjamin Van Roy 

  



Sharath Chandra Raparthy

Episode 48

Monday, February 12, 2024 · Duration 40:41

Sharath Chandra Raparthy on In-Context Learning for Sequential Decision Tasks, GFlowNets, and more!  

Sharath Chandra Raparthy is an AI Resident at FAIR at Meta, and did his Master's at Mila.  


Featured Reference 

Generalization to New Sequential Decision Making Tasks with In-Context Learning   
Sharath Chandra Raparthy, Eric Hambro, Robert Kirk, Mikael Henaff, Roberta Raileanu



Pierluca D'Oro and Martin Klissarov

Episode 47

Monday, November 13, 2023 · Duration 57:24

Pierluca D'Oro and Martin Klissarov on Motif and RLAIF, Noisy Neighborhoods and Return Landscapes, and more!  

Pierluca D'Oro is a PhD student at Mila and a visiting researcher at Meta.


Martin Klissarov is a PhD student at Mila and McGill, and a research scientist intern at Meta.


Featured References 

Motif: Intrinsic Motivation from Artificial Intelligence Feedback 
Martin Klissarov*, Pierluca D'Oro*, Shagun Sodhani, Roberta Raileanu, Pierre-Luc Bacon, Pascal Vincent, Amy Zhang, Mikael Henaff 

Policy Optimization in a Noisy Neighborhood: On Return Landscapes in Continuous Control 
Nate Rahn*, Pierluca D'Oro*, Harley Wiltzer, Pierre-Luc Bacon, Marc G. Bellemare 

To keep doing RL research, stop calling yourself an RL researcher
Pierluca D'Oro 

Martin Riedmiller

Episode 46

Tuesday, August 22, 2023 · Duration 01:13:56

Martin Riedmiller of Google DeepMind on controlling nuclear fusion plasma in a tokamak with RL, the original Deep Q-Network, Neural Fitted Q-Iteration, Collect and Infer, AGI for control systems, and tons more!  


Martin Riedmiller is a research scientist and team lead at DeepMind.   


Featured References   


Magnetic control of tokamak plasmas through deep reinforcement learning 
Jonas Degrave, Federico Felici, Jonas Buchli, Michael Neunert, Brendan Tracey, Francesco Carpanese, Timo Ewalds, Roland Hafner, Abbas Abdolmaleki, Diego de las Casas, Craig Donner, Leslie Fritz, Cristian Galperti, Andrea Huber, James Keeling, Maria Tsimpoukelli, Jackie Kay, Antoine Merle, Jean-Marc Moret, Seb Noury, Federico Pesamosca, David Pfau, Olivier Sauter, Cristian Sommariva, Stefano Coda, Basil Duval, Ambrogio Fasoli, Pushmeet Kohli, Koray Kavukcuoglu, Demis Hassabis & Martin Riedmiller


Human-level control through deep reinforcement learning
Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A Rusu, Joel Veness, Marc G Bellemare, Alex Graves, Martin Riedmiller, Andreas K Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, Demis Hassabis 

Neural fitted Q iteration – first experiences with a data efficient neural reinforcement learning method
Martin Riedmiller  

Max Schwarzer

Episode 45

Tuesday, August 8, 2023 · Duration 01:10:18

Max Schwarzer is a PhD student at Mila, working with Aaron Courville and Marc Bellemare, interested in RL scaling, representation learning for RL, and RL for science. Max spent the last 1.5 years at Google Brain/DeepMind and is now at Apple Machine Learning Research.

Featured References

Bigger, Better, Faster: Human-level Atari with human-level efficiency 
Max Schwarzer, Johan Obando-Ceron, Aaron Courville, Marc Bellemare, Rishabh Agarwal, Pablo Samuel Castro 

Sample-Efficient Reinforcement Learning by Breaking the Replay Ratio Barrier
Pierluca D'Oro, Max Schwarzer, Evgenii Nikishin, Pierre-Luc Bacon, Marc G Bellemare, Aaron Courville 

The Primacy Bias in Deep Reinforcement Learning
Evgenii Nikishin, Max Schwarzer, Pierluca D'Oro, Pierre-Luc Bacon, Aaron Courville 





Julian Togelius

Episode 44

Tuesday, July 25, 2023 · Duration 40:04

Julian Togelius is an Associate Professor of Computer Science and Engineering at NYU, and co-founder and research director at modl.ai.


  

Featured References  
Choose Your Weapon: Survival Strategies for Depressed AI Academics

Julian Togelius, Georgios N. Yannakakis


Learning Controllable 3D Level Generators

Zehua Jiang, Sam Earle, Michael Cerny Green, Julian Togelius


PCGRL: Procedural Content Generation via Reinforcement Learning

Ahmed Khalifa, Philip Bontrager, Sam Earle, Julian Togelius


Illuminating Generalization in Deep Reinforcement Learning through Procedural Level Generation

Niels Justesen, Ruben Rodriguez Torrado, Philip Bontrager, Ahmed Khalifa, Julian Togelius, Sebastian Risi



Similar Podcasts Based on Content

Discover podcasts related to TalkRL: The Reinforcement Learning Podcast. Explore podcasts with similar themes, topics, and formats. These similarities are computed from tangible data, not extrapolation!
Jérémy Coron
My First Million
The Crypto Podcast
Programming Throwdown
Maintenance Phase
The One You Feed
Nutrition with Judy | Carnivore Diet
Sex and Psychology Podcast
The Worn & Wound Podcast
Lex Fridman Podcast