Explore every episode of the podcast TalkRL: The Reinforcement Learning Podcast
Dive into the complete episode list for TalkRL: The Reinforcement Learning Podcast. Each episode is cataloged with detailed descriptions, making it easy to find and explore specific topics. Keep track of all episodes from your favorite podcast and never miss a moment of insightful content.
Rows per page:
50
1–50 of 66
Title
Pub. Date
Duration
Neurips 2024 RL meetup Hot takes: What sucks about RL?
23 Dec 2024
00:17:45
What do RL researchers complain about after hours at the bar? In this "Hot takes" episode, we find out!
Recorded at The Pearl in downtown Vancouver, during the RL meetup after a day of Neurips 2024.
Special thanks to "David Beckham" for the inspiration :)
RLC 2024 - Posters and Hallways 5
20 Sep 2024
00:13:17
Posters and Hallway episodes are short interviews and poster summaries. Recorded at RLC 2024 in Amherst MA.
Featuring:
0:01 David Radke of the Chicago Blackhawks NHL on RL for professional sports
0:56 Abhishek Naik from the National Research Council on Continuing RL and Average Reward
2:42 Daphne Cornelisse from NYU on Autonomous Driving and Multi-Agent RL
08:58 Shray Bansal from Georgia Tech on Cognitive Bias for Human AI Ad hoc Teamwork
10:21 Claas Voelcker from University of Toronto on Can we hop in general?
11:23 Brent Venable from The Institute for Human & Machine Cognition on Cooperative information dissemination
Arash Ahmadian on Rethinking RLHF
25 Mar 2024
00:33:30
Arash Ahmadian is a Researcher at Cohere and Cohere For AI focussed on Preference Training of large language models. He’s also a researcher at the Vector Institute of AI.
Glen Berseth is an assistant professor at the Université de Montréal, a core academic member of the Mila - Quebec AI Institute, a Canada CIFAR AI chair, member l'Institute Courtios, and co-director of the Robotics and Embodied AI Lab (REAL).
Martin Riedmiller of Google DeepMind on controlling nuclear fusion plasma in a tokamak with RL, the original Deep Q-Network, Neural Fitted Q-Iteration, Collect and Infer, AGI for control systems, and tons more!
Martin Riedmiller is a research scientist and team lead at DeepMind.
Featured References
Magnetic control of tokamak plasmas through deep reinforcement learning Jonas Degrave, Federico Felici, Jonas Buchli, Michael Neunert, Brendan Tracey, Francesco Carpanese, Timo Ewalds, Roland Hafner, Abbas Abdolmaleki, Diego de las Casas, Craig Donner, Leslie Fritz, Cristian Galperti, Andrea Huber, James Keeling, Maria Tsimpoukelli, Jackie Kay, Antoine Merle, Jean-Marc Moret, Seb Noury, Federico Pesamosca, David Pfau, Olivier Sauter, Cristian Sommariva, Stefano Coda, Basil Duval, Ambrogio Fasoli, Pushmeet Kohli, Koray Kavukcuoglu, Demis Hassabis & Martin Riedmiller
Human-level control through deep reinforcement learning Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A Rusu, Joel Veness, Marc G Bellemare, Alex Graves, Martin Riedmiller, Andreas K Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, Demis Hassabis
Max Schwarzer is a PhD student at Mila, with Aaron Courville and Marc Bellemare, interested in RL scaling, representation learning for RL, and RL for science. Max spent the last 1.5 years at Google Brain/DeepMind, and is now at Apple Machine Learning Research.
Niels Justesen, Ruben Rodriguez Torrado, Philip Bontrager, Ahmed Khalifa, Julian Togelius, Sebastian Risi
Jakob Foerster
08 May 2023
01:03:45
Jakob Foerster on Multi-Agent learning, Cooperation vs Competition, Emergent Communication, Zero-shot coordination, Opponent Shaping, agents for Hanabi and Prisoner's Dilemma, and more.
Jakob Foerster is an Associate Professor at University of Oxford.
Danijar Hafner on the DreamerV3 agent and world models, the Director agent and heirarchical RL, realtime RL on robots with DayDreamer, and his framework for unsupervised agent design!
Danijar Hafner is a PhD candidate at the University of Toronto with Jimmy Ba, a visiting student at UC Berkeley with Pieter Abbeel, and an intern at DeepMind. He has been our guest before back on episode 11.
AI Generating Algos, Learning to play Minecraft with Video PreTraining (VPT), Go-Explore for hard exploration, POET and Open Endedness, AI-GAs and ChatGPT, AGI predictions, and lots more!
Professor Jeff Clune is Associate Professor of Computer Science at University of British Columbia, a Canada CIFAR AI Chair and Faculty Member at Vector Institute, and Senior Research Advisor at DeepMind.
Hear about why OpenAI cites her work in RLHF and dialog models, approaches to rewards in RLHF, ChatGPT, Industry vs Academia, PsiPhi-Learning, AGI and more!
Dr Natasha Jaques is a Senior Research Scientist at Google Brain.
Jacob Beck and Risto Vuorio on their recent Survey of Meta-Reinforcement Learning. Jacob and Risto are Ph.D. students at Whiteson Research Lab at University of Oxford.
John Schulman is a cofounder of OpenAI, and currently a researcher and engineer at OpenAI.
Featured References
WebGPT: Browser-assisted question-answering with human feedback Reiichiro Nakano, Jacob Hilton, Suchir Balaji, Jeff Wu, Long Ouyang, Christina Kim, Christopher Hesse, Shantanu Jain, Vineet Kosaraju, William Saunders, Xu Jiang, Karl Cobbe, Tyna Eloundou, Gretchen Krueger, Kevin Button, Matthew Knight, Benjamin Chess, John Schulman
Training language models to follow instructions with human feedback Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, John Schulman, Jacob Hilton, Fraser Kelton, Luke Miller, Maddie Simens, Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan Lowe
Sven Mika is the Reinforcement Learning Team Lead at Anyscale, and lead committer of RLlib. He holds a PhD in biomathematics, bioinformatics, and computational biology from Witten/Herdecke University.
Ray Summit 2022 is coming to San Francisco on August 23-24. Hear how teams at Dow, Verizon, Riot Games, and more are solving their RL challenges with Ray's RLlib.
Register at raysummit.org and use code RAYSUMMIT22RL for a further 25% off the already reduced prices.
Karol Hausman and Fei Xia
16 Aug 2022
01:03:09
Karol Hausman is a Senior Research Scientist at Google Brain and an Adjunct Professor at Stanford working on robotics and machine learning. Karol is interested in enabling robots to acquire general-purpose skills with minimal supervision in real-world environments.
Fei Xia is a Research Scientist with Google Research. Fei Xia is mostly interested in robot learning in complex and unstructured environments. Previously he has been approaching this problem by learning in realistic and scalable simulation environments (GibsonEnv, iGibson). Most recently, he has been exploring using foundation models for those challenges.
Featured References
Do As I Can, Not As I Say: Grounding Language in Robotic Affordances [ website ] Michael Ahn, Anthony Brohan, Noah Brown, Yevgen Chebotar, Omar Cortes, Byron David, Chelsea Finn, Keerthana Gopalakrishnan, Karol Hausman, Alex Herzog, Daniel Ho, Jasmine Hsu, Julian Ibarz, Brian Ichter, Alex Irpan, Eric Jang, Rosario Jauregui Ruano, Kyle Jeffrey, Sally Jesmonth, Nikhil J Joshi, Ryan Julian, Dmitry Kalashnikov, Yuheng Kuang, Kuang-Huei Lee, Sergey Levine, Yao Lu, Linda Luu, Carolina Parada, Peter Pastor, Jornell Quiambao, Kanishka Rao, Jarek Rettinghouse, Diego Reyes, Pierre Sermanet, Nicolas Sievers, Clayton Tan, Alexander Toshev, Vincent Vanhoucke, Fei Xia, Ted Xiao, Peng Xu, Sichun Xu, Mengyuan Yan
Inner Monologue: Embodied Reasoning through Planning with Language Models Wenlong Huang, Fei Xia, Ted Xiao, Harris Chan, Jacky Liang, Pete Florence, Andy Zeng, Jonathan Tompson, Igor Mordatch, Yevgen Chebotar, Pierre Sermanet, Noah Brown, Tomas Jackson, Linda Luu, Sergey Levine, Karol Hausman, Brian Ichter
Ray Summit 2022 is coming to San Francisco on August 23-24. Hear how teams at Dow, Verizon, Riot Games, and more are solving their RL challenges with Ray's RLlib.
Register at raysummit.org and use code RAYSUMMIT22RL for a further 25% off the already reduced prices.
Sai Krishna Gottipati
01 Aug 2022
01:08:11
Saikrishna Gottipati is an RL Researcher at AI Redefined, working on RL, MARL, human in the loop learning.
Ray Summit 2022 is coming to San Francisco on August 23-24. Hear how teams at Dow, Verizon, Riot Games, and more are solving their RL challenges with Ray's RLlib.
Register at raysummit.org and use code RAYSUMMIT22RL for a further 25% off the already reduced prices.
Aravind Srinivas 2
09 May 2022
00:58:33
Aravind Srinivas is back! He is now a research Scientist at OpenAI.
Dr. Rohin Shah is a Research Scientist at DeepMind, and the editor and main contributor of the Alignment Newsletter.
Featured References
The MineRL BASALT Competition on Learning from Human Feedback Rohin Shah, Cody Wild, Steven H. Wang, Neel Alex, Brandon Houghton, William Guss, Sharada Mohanty, Anssi Kanervisto, Stephanie Milani, Nicholay Topin, Pieter Abbeel, Stuart Russell, Anca Dragan
Benefits of Assistance over Reward Learning Rohin Shah, Pedro Freire, Neel Alex, Rachel Freedman, Dmitrii Krasheninnikov, Lawrence Chan, Michael D Dennis, Pieter Abbeel, Anca Dragan, Stuart Russell
Jordan Terry is a PhD candidate at University of Maryland, the maintainer of Gym, the maintainer and creator of PettingZoo and the founder of Swarm Labs.
Featured References
PettingZoo: Gym for Multi-Agent Reinforcement Learning J. K. Terry, Benjamin Black, Nathaniel Grammel, Mario Jayakumar, Ananth Hari, Ryan Sullivan, Luis Santos, Rodrigo Perez, Caroline Horsch, Clemens Dieffendahl, Niall L. Williams, Yashas Lokesh, Praveen Ravi
Amy Zhang is a postdoctoral scholar at UC Berkeley and a research scientist at Facebook AI Research. She will be starting as an assistant professor at UT Austin in Spring 2023.
Xianyuan Zhan is currently a research assistant professor at the Institute for AI Industry Research (AIR), Tsinghua University. He received his Ph.D. degree at Purdue University. Before joining Tsinghua University, Dr. Zhan worked as a researcher at Microsoft Research Asia (MSRA) and a data scientist at JD Technology. At JD Technology, he led the research that uses offline RL to optimize real-world industrial systems.
Dr. Jess Whittlestone is a Senior Research Fellow at the Centre for the Study of Existential Risk and the Leverhulme Centre for the Future of Intelligence, both at the University of Cambridge.
Sam Ritter is a Research Scientist on the neuroscience team at DeepMind.
Featured References
Unsupervised Predictive Memory in a Goal-Directed Agent (MERLIN) Greg Wayne, Chia-Chun Hung, David Amos, Mehdi Mirza, Arun Ahuja, Agnieszka Grabska-Barwinska, Jack Rae, Piotr Mirowski, Joel Z. Leibo, Adam Santoro, Mevlana Gemici, Malcolm Reynolds, Tim Harley, Josh Abramson, Shakir Mohamed, Danilo Rezende, David Saxton, Adam Cain, Chloe Hillier, David Silver, Koray Kavukcuoglu, Matt Botvinick, Demis Hassabis, Timothy Lillicrap
Professor Marc G. Bellemare is a Research Scientist at Google Research (Brain team), An Adjunct Professor at McGill University, and a Canada CIFAR AI Chair.
14:54 Jonathan Li from Rensselaer Polytechnic Institute
Robert Osazuwa Ness
08 May 2021
01:18:43
Robert Osazuwa Ness is an adjunct professor of computer science at Northeastern University, an ML Research Engineer at Gamalon, and the founder of AltDeep School of AI. He holds a PhD in statistics. He studied at Johns Hopkins SAIS and then Purdue University.
Dr. Marlos C. Machado is a research scientist at DeepMind and an adjunct professor at the University of Alberta. He holds a PhD from the University of Alberta and a MSc and BSc from UFMG, in Brazil.
Arulkumaran, K., Dilokthanakul, N., Shanahan, M. & Bharath, A. A. (2016). Classifying Options for Deep Reinforcement Learning. International Joint Conference on Artificial Intelligence, Deep Reinforcement Learning Workshop.
Garnelo, M., Arulkumaran, K. & Shanahan, M. (2016). Towards Deep Symbolic Reinforcement Learning. Annual Conference on Neural Information Processing Systems, Deep Reinforcement Learning Workshop.
I'm interested in robustness in RL and multi-agent RL, specifically as it applies to making the interaction between AI systems and society at large to be more beneficial.
Shimon Whiteson is a Professor of Computer Science at Oxford University, the head of WhiRL, the Whiteson Research Lab at Oxford, and Head of Research at Waymo UK.
Nan Jiang is an Assistant Professor of Computer Science at University of Illinois. He was a Postdoc Microsoft Research, and did his PhD at University of Michigan under Professor Satinder Singh.
[Robin] I misspoke when I said in domain randomization we want the agent to "ignore" domain parameters. What I should have said is, we want the agent to perform well within some range of domain parameters, it should be robust with respect to domain parameters.
RLC 2024 - Posters and Hallways 1
10 Sep 2024
00:05:46
Posters and Hallway episodes are short interviews and poster summaries. Recorded at RLC 2024 in Amherst MA.
Danijar Hafner is a PhD student at the University of Toronto, and a student researcher at Google Research, Brain Team and the Vector Institute. He holds a Masters of Research from University College London.
Featured References
A deep learning framework for neuroscience Blake A. Richards, Timothy P. Lillicrap , Philippe Beaudoin, Yoshua Bengio, Rafal Bogacz, Amelia Christensen, Claudia Clopath, Rui Ponte Costa, Archy de Berker, Surya Ganguli, Colleen J. Gillon , Danijar Hafner, Adam Kepecs, Nikolaus Kriegeskorte, Peter Latham , Grace W. Lindsay, Kenneth D. Miller , Richard Naud , Christopher C. Pack, Panayiota Poirazi , Pieter Roelfsema , João Sacramento, Andrew Saxe, Benjamin Scellier, Anna C. Schapiro , Walter Senn, Greg Wayne, Daniel Yamins, Friedemann Zenke, Joel Zylberberg, Denis Therien, Konrad P. Kording
The Predictron: End-To-End Learning and Planning, David Silver, Hado van Hasselt, Matteo Hessel, Tom Schaul, Arthur Guez, Tim Harley, Gabriel Dulac-Arnold, David Reichert, Neil Rabinowitz, Andre Barreto, Thomas Degris
Solving Rubik’s Cube with a Robot Hand ; Paper, OpenAI, Ilge Akkaya, Marcin Andrychowicz, Maciek Chociej, Mateusz Litwin, Bob McGrew, Arthur Petron, Alex Paino, Matthias Plappert, Glenn Powell, Raphael Ribas, Jonas Schneider, Nikolas Tezak, Jerry Tworek, Peter Welinder, Lilian Weng, Qiming Yuan, Wojciech Zaremba, Lei Zhang
Tackling climate change with Machine Learning David Rolnick, Priya L. Donti, Lynn H. Kaack, Kelly Kochanski, Alexandre Lacoste, Kris Sankaran, Andrew Slavin Ross, Nikola Milojevic-Dupont, Natasha Jaques, Anna Waldman-Brown, Alexandra Luccioni, Tegan Maharaj, Evan D. Sherwin, S. Karthik Mukkavilli, Konrad P. Kording, Carla Gomes, Andrew Y. Ng, Demis Hassabis, John C. Platt, Felix Creutzig, Jennifer Chayes, Yoshua Bengio
Behaviour Suite for Reinforcement Learning, Ian Osband, Yotam Doron, Matteo Hessel, John Aslanides, Eren Sezener, Andre Saraiva, Katrina McKinney, Tor Lattimore, Csaba Szepesvari, Satinder Singh, Benjamin Van Roy, Richard Sutton, David Silver, Hado Van Hasselt
Learning Latent Plans from Play, Corey Lynch, Mohi Khansari, Ted Xiao, Vikash Kumar, Jonathan Tompson, Sergey Levine, Pierre Sermanet
Thank you to all the presenters that participated. I covered as many as I could given the time and crowds, if you were not included and wish to be, please email talkrl@pathwayi.com
5:38 On the Convergence of Episodic Reinforcement Learning Algorithms at the Example of RUDDER; Markus Holzleitner (LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Austria); José Arjona-Medina (LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Austria); Marius-Constantin Dinu (LIT AI Lab / University Linz ); Sepp Hochreiter (LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Austria).
33:05 Bottom-Up Meta-Policy Search; Luckeciano Melo (Aeronautics Institute of Technology); Marcos Máximo (Aeronautics Institute of Technology); Adilson Cunha (Aeronautics Institute of Technology) [external pdf link].