
Explore every episode of the podcast Argmax

Dive into the complete episode list for Argmax. Each episode is cataloged with detailed descriptions, making it easy to find and explore specific topics. Keep track of all episodes from your favorite podcast and never miss a moment of insightful content.

Title | Pub. Date | Duration

LoRA | 02 Sep 2023 | 01:02:56

We talk about Low-Rank Adaptation (LoRA) for fine-tuning Transformers. We are also on YouTube now! Check out the video here: https://youtu.be/lLzHr0VFi3Y

15: InstructGPT | 28 Mar 2023 | 00:57:27

In this episode we discuss the paper "Training language models to follow instructions with human feedback" by Ouyang et al. (2022). We discuss the RLHF paradigm and how important RL is to tuning GPT.

6: Deep Reinforcement Learning at the Edge of the Statistical Precipice | 06 Jun 2022 | 01:01:08

We discuss this NeurIPS Outstanding Paper Award winner and the issues it raises around metrics and reproducibility.

5: QMIX | 26 Apr 2022 | 00:42:06

We talk about QMIX (https://arxiv.org/abs/1803.11485) as an example of deep multi-agent RL.

4: Can Neural Nets Learn the Same Model Twice? | 06 Apr 2022 | 00:55:23

Today's paper: Can Neural Nets Learn the Same Model Twice? Investigating Reproducibility and Double Descent from the Decision Boundary Perspective (https://arxiv.org/pdf/2203.08124.pdf)

Summary:
A discussion of reproducibility and double descent through visualizations of decision boundaries.

Highlights of the discussion:

  • Relationship between model performance and reproducibility
  • Which models are robust and reproducible
  • How they calculate the various scores



3: VICReg | 21 Mar 2022 | 00:44:46

Today's paper: VICReg (https://arxiv.org/abs/2105.04906)

Summary of the paper
VICReg prevents representation collapse with a loss that combines variance, invariance, and covariance terms. It does not require negative samples and achieves strong performance on downstream tasks.

Highlights of discussion

  • The VICReg architecture (Figure 1)
  • Sensitivity to hyperparameters (Table 7)
  • Usefulness of the top-5 metric
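
As a rough illustration of the summary above, the three terms can be sketched in NumPy. This is a minimal sketch, not the authors' implementation: the function name `vicreg_loss`, the argument names, and the normalization choices (mean over all entries for the invariance term, covariance penalty divided by the embedding dimension) are assumptions, and the default weights 25/25/1 are assumed to follow the values reported in the paper.

```python
import numpy as np


def vicreg_loss(z_a, z_b, sim_w=25.0, var_w=25.0, cov_w=1.0, gamma=1.0, eps=1e-4):
    """Sketch of a VICReg-style loss for two batches of embeddings, shape (n, d).

    z_a and z_b are embeddings of two augmented views of the same inputs.
    All names and default values here are illustrative assumptions.
    """
    n, d = z_a.shape

    # Invariance: the two views of the same input should map to nearby points.
    invariance = np.mean((z_a - z_b) ** 2)

    def variance_term(z):
        # Hinge on the per-dimension standard deviation: penalize any
        # dimension whose spread across the batch falls below gamma.
        std = np.sqrt(z.var(axis=0, ddof=1) + eps)
        return np.mean(np.maximum(0.0, gamma - std))

    def covariance_term(z):
        # Penalize off-diagonal covariance so dimensions stay decorrelated.
        centered = z - z.mean(axis=0)
        cov = centered.T @ centered / (n - 1)
        off_diag = cov - np.diag(np.diag(cov))
        return np.sum(off_diag ** 2) / d

    variance = variance_term(z_a) + variance_term(z_b)
    covariance = covariance_term(z_a) + covariance_term(z_b)
    return sim_w * invariance + var_w * variance + cov_w * covariance
```

For a fully collapsed batch (every embedding identical), the invariance and covariance terms are zero and only the variance hinge contributes, which is how the loss discourages collapse without needing negative samples.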

2: data2vec | 07 Mar 2022 | 00:53:23

Today's paper: data2vec (https://arxiv.org/abs/2202.03555)

Summary of the paper
A multimodal self-supervised learning (SSL) algorithm that predicts latent representations of different types of input.

Highlights of discussion

  • What are the motivations for SSL and multimodal learning?
  • How does the student-teacher learning work?
  • What are the similarities and differences between ViT, BYOL, and reinforcement learning algorithms?

1: Reward is Enough | 21 Feb 2022 | 00:54:36

This is the first episode of Argmax! We talk about our motivations for doing a podcast, and what we hope listeners will get out of it.

Today's paper: Reward is Enough

Summary of the paper
The authors present the Reward is Enough hypothesis: Intelligence, and its associated abilities, can be understood as subserving the maximisation of reward by an agent acting in its environment.

Highlights of discussion

  • High-level overview of reinforcement learning
  • How evolution can be encoded as a reward maximization problem
  • What is the one reward signal we are trying to optimize?

14: Whisper | 17 Mar 2023 | 00:49:14

This week we talk about Whisper, a weakly supervised speech recognition model.



13: AlphaTensor | 11 Mar 2023 | 00:49:05

We talk about AlphaTensor, and how researchers were able to find a new algorithm for matrix multiplication.

12: SIRENs | 25 Oct 2022 | 00:54:17

In this episode we talked about "Implicit Neural Representations with Periodic Activation Functions" and the strength of periodic non-linearities.

11: CVPR Workshop on Autonomous Driving Keynote by Ashok Elluswamy, a Tesla engineer | 30 Sep 2022 | 00:48:51

In this episode we discuss this video: https://youtu.be/jPCV4GKX9Dw

The talk covers how Tesla approaches collision detection with novel methods.

10: Outracing champion Gran Turismo drivers with deep reinforcement learning | 23 Aug 2022 | 00:54:50

We discuss Sony AI's accomplishment of creating a novel AI agent that can beat professional racers in Gran Turismo. Some topics include:
- The crafting of rewards to make the agent behave nicely
- What is QR-SAC?
- How to deal with "rare" experiences in the replay buffer

Link to paper: https://www.nature.com/articles/s41586-021-04357-7

9: Heads-Up Limit Hold'em Poker Is Solved | 29 Jul 2022 | 00:47:55

Today we talk about recent AI advances in poker, specifically the use of counterfactual regret minimization to solve two-player Limit Texas Hold'em.

8: GATO (A Generalist Agent) | 29 Jul 2022 | 00:44:51

Today we talk about GATO, a multi-modal, multi-task, multi-embodiment generalist agent.

7: Deep Unsupervised Learning Using Nonequilibrium Thermodynamics (Diffusion Models) | 14 Jun 2022 | 00:30:55

We start talking about diffusion models as a technique for generative deep learning.

Mixture of Experts | 08 Oct 2024 | 00:54:46

In this episode we talk about the paper "Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer, Azalia Mirhoseini, Krzysztof Maziarz, Andy Davis, Quoc Le, Geoffrey Hinton, Jeff Dean.
