
Explore all episodes of the Joe Carlsmith Audio podcast

Browse the complete list of Joe Carlsmith Audio episodes. Each episode is catalogued with a detailed description, which makes it easier to search for and explore specific topics. Follow every episode of your favorite podcast and don't miss any relevant content.


Title | Date | Duration
Building AIs that do human-like philosophy | 29 Jan 2026 | 00:33:33

AIs will face philosophical questions humans can't answer for them. Text version here: https://joecarlsmith.com/2026/01/29/building-ais-that-do-human-like-philosophy/

How human-like do safe AI motivations need to be? | 12 Nov 2025 | 01:23:32

AIs with alien motivations can still follow instructions safely on the inputs that matter. Text version here: https://joecarlsmith.com/2025/11/12/how-human-like-do-safe-ai-motivations-need-to-be/

What is it to solve the alignment problem? | 13 Feb 2025 | 00:40:13

Also: to avoid it? Handle it? Solve it forever? Solve it completely?

Text version here: https://joecarlsmith.substack.com/p/what-is-it-to-solve-the-alignment

How do we solve the alignment problem? | 13 Feb 2025 | 00:08:43

Introduction to a series of essays about paths to safe and useful superintelligence. 

Text version here: https://joecarlsmith.substack.com/p/how-do-we-solve-the-alignment-problem

Fake thinking and real thinking | 28 Jan 2025 | 01:18:47

When the line pulls at your hand. 

Text version here: https://joecarlsmith.com/2025/01/28/fake-thinking-and-real-thinking/


Takes on "Alignment Faking in Large Language Models" | 18 Dec 2024 | 01:27:54

What can we learn from recent empirical demonstrations of scheming in frontier models? Text version here: https://joecarlsmith.com/2024/12/18/takes-on-alignment-faking-in-large-language-models/

(Part 2, AI takeover) Extended audio from my conversation with Dwarkesh Patel | 30 Sep 2024 | 02:07:33

Extended audio from my conversation with Dwarkesh Patel. This part focuses on the basic story about AI takeover. Transcript available on my website here: https://joecarlsmith.com/2024/09/30/part-2-ai-takeover-extended-audio-transcript-from-my-conversation-with-dwarkesh-patel




(Part 1, Otherness) Extended audio from my conversation with Dwarkesh Patel | 30 Sep 2024 | 03:58:38

Extended audio from my conversation with Dwarkesh Patel. This part focuses on my series "Otherness and control in the age of AGI." Transcript available on my website here: https://joecarlsmith.com/2024/09/30/part-1-otherness-extended-audio-transcript-from-my-conversation-with-dwarkesh-patel/

Introduction and summary for "Otherness and control in the age of AGI" | 21 Jun 2024 | 00:12:23

This is the introduction and summary for my series "Otherness and control in the age of AGI." 

Text version here: https://joecarlsmith.com/2024/01/02/otherness-and-control-in-the-age-of-agi

Second half of full audio for "Otherness and control in the age of AGI" | 18 Jun 2024 | 04:11:02

Second half of the full audio for my series on how agents with different values should relate to one another, and on the ethics of seeking and sharing power. 

First half here: https://joecarlsmithaudio.buzzsprout.com/2034731/15266490-first-half-of-full-audio-for-otherness-and-control-in-the-age-of-agi

PDF of the full series here: https://jc.gatspress.com/pdf/otherness_full.pdf
Summary of the series here: https://joecarlsmith.com/2024/01/02/otherness-and-control-in-the-age-of-agi

First half of full audio for "Otherness and control in the age of AGI" | 17 Jun 2024 | 03:07:29

First half of the full audio for my series on how agents with different values should relate to one another, and on the ethics of seeking and sharing power. 

Second half here: https://joecarlsmithaudio.buzzsprout.com/2034731/15272132-second-half-of-full-audio-for-otherness-and-control-in-the-age-of-agi

PDF of the full series here: https://jc.gatspress.com/pdf/otherness_full.pdf
Summary of the series here: https://joecarlsmith.com/2024/01/02/otherness-and-control-in-the-age-of-agi

Loving a world you don't trust | 17 Jun 2024 | 01:03:54

Garden, campfire, healing water.

Text version here: https://joecarlsmith.com/2024/06/18/loving-a-world-you-dont-trust

This essay is part of a series I'm calling "Otherness and control in the age of AGI." I'm hoping that individual essays can be read fairly well on their own, but see here for brief text summaries of the essays that have been released thus far: https://joecarlsmith.com/2024/01/02/otherness-and-control-in-the-age-of-agi

Leaving Open Philanthropy, going to Anthropic | 03 Nov 2025 | 00:32:09

On a career move, and on AI-safety-focused people working at AI companies. Text version here: https://joecarlsmith.com/2025/11/03/leaving-open-philanthropy-going-to-anthropic/

On attunement | 25 Mar 2024 | 00:44:14

Examining a certain kind of meaning-laden receptivity to the world.

Text version here: https://joecarlsmith.com/2024/03/25/on-attunement

This essay is part of a series I'm calling "Otherness and control in the age of AGI." I'm hoping that individual essays can be read fairly well on their own, but see here for brief text summaries of the essays that have been released thus far: https://joecarlsmith.com/2024/01/02/otherness-and-control-in-the-age-of-agi

(Though: note that I haven't put the summary post on the podcast yet.)

On green | 21 Mar 2024 | 01:15:13

Examining a philosophical vibe that I think contrasts in interesting ways with "deep atheism."

Text version here: https://joecarlsmith.com/2024/03/21/on-green

This essay is part of a series I'm calling "Otherness and control in the age of AGI." I'm hoping that individual essays can be read fairly well on their own, but see here for brief text summaries of the essays that have been released thus far: https://joecarlsmith.com/2024/01/02/otherness-and-control-in-the-age-of-agi

(Though: note that I haven't put the summary post on the podcast yet.)

On the abolition of man | 18 Jan 2024 | 01:09:22

What does it take to avoid tyranny towards the future?

Text version here: https://joecarlsmith.com/2024/01/18/on-the-abolition-of-man

This essay is part of a series I'm calling "Otherness and control in the age of AGI." I'm hoping that individual essays can be read fairly well on their own, but see here for brief text summaries of the essays that have been released thus far: https://joecarlsmith.com/2024/01/02/otherness-and-control-in-the-age-of-agi

(Though: note that I haven't put the summary post on the podcast yet.)

Being nicer than Clippy | 16 Jan 2024 | 00:47:30

Let's be the sort of species that aliens wouldn't fear the way we fear paperclippers.

Text version here: https://joecarlsmith.com/2024/01/16/being-nicer-than-clippy/

This essay is part of a series I'm calling "Otherness and control in the age of AGI." I'm hoping that individual essays can be read fairly well on their own, but see here for brief text summaries of the essays that have been released thus far: https://joecarlsmith.com/2024/01/02/otherness-and-control-in-the-age-of-agi

(Though: note that I haven't put the summary post on the podcast yet.)

An even deeper atheism | 11 Jan 2024 | 00:25:12

Who isn't a paperclipper?

Text version here: https://joecarlsmith.com/2024/01/11/an-even-deeper-atheism

This essay is part of a series I'm calling "Otherness and control in the age of AGI." I'm hoping that individual essays can be read fairly well on their own, but see here for brief summaries of the essays that have been released thus far: https://joecarlsmith.com/2024/01/02/otherness-and-control-in-the-age-of-agi


Does AI risk "other" the AIs? | 09 Jan 2024 | 00:13:15

Examining Robin Hanson's critique of the AI risk discourse.

Text version here: https://joecarlsmith.com/2024/01/09/does-ai-risk-other-the-ais

This essay is part of a series of essays called "Otherness and control in the age of AGI." I'm hoping the individual essays can be read fairly well on their own, but see here for brief summaries of the essays that have been released thus far: https://joecarlsmith.com/2024/01/02/otherness-and-control-in-the-age-of-agi

When "yang" goes wrong | 08 Jan 2024 | 00:21:32

On the connection between deep atheism and seeking control. 

Text version here: https://joecarlsmith.com/2024/01/08/when-yang-goes-wrong

This essay is part of a series of essays called "Otherness and control in the age of AGI." I'm hoping the individual essays can be read fairly well on their own, but see here for brief summaries of the essays that have been released thus far: https://joecarlsmith.com/2024/01/02/otherness-and-control-in-the-age-of-agi

Deep atheism and AI risk | 04 Jan 2024 | 00:46:59

On a certain kind of fundamental mistrust towards Nature. 

Text version here: https://joecarlsmith.com/2024/01/04/deep-atheism-and-ai-risk

This is the second essay in my series “Otherness and control in the age of AGI.” I’m hoping that the individual essays can be read fairly well on their own, but see here for brief summaries of the essays released thus far: https://joecarlsmith.com/2024/01/02/otherness-and-control-in-the-age-of-agi

Gentleness and the artificial Other | 02 Jan 2024 | 00:22:39

AIs as fellow creatures. And on getting eaten. 

Link: https://joecarlsmith.com/2024/01/02/gentleness-and-the-artificial-other

This is the first essay in a series of essays that I’m calling “Otherness and control in the age of AGI.” See here for more about the series as a whole: https://joecarlsmith.com/2024/01/02/otherness-and-control-in-the-age-of-agi.

In search of benevolence (or: what should you get Clippy for Christmas?) | 27 Dec 2023 | 00:52:52

What is altruism towards a paperclipper? Can you paint with all the colors of the wind at once? 

(This is a recording of an essay originally published in 2021. Text here: https://joecarlsmith.com/2021/07/19/in-search-of-benevolence-or-what-should-you-get-clippy-for-christmas)

Controlling the options AIs can pursue | 29 Sep 2025 | 00:55:34

On boxing AIs, and on making deals with them. Text version here: https://joecarlsmith.com/2025/09/29/controlling-the-options-ais-can-pursue

Empirical work that might shed light on scheming (Section 6 of "Scheming AIs") | 16 Nov 2023 | 00:28:00

This is section 6 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” 

Text of the report here: https://arxiv.org/abs/2311.08379
 
Summary of the report here: https://joecarlsmith.com/2023/11/15/new-report-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power
 
Audio summary here: https://joecarlsmithaudio.buzzsprout.com/2034731/13969977-introduction-and-summary-of-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power

Summing up "Scheming AIs" (Section 5) | 16 Nov 2023 | 00:15:46

This is section 5 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” 

Text of the report here: https://arxiv.org/abs/2311.08379
 
Summary of the report here: https://joecarlsmith.com/2023/11/15/new-report-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power
 
Audio summary here: https://joecarlsmithaudio.buzzsprout.com/2034731/13969977-introduction-and-summary-of-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power

Speed arguments against scheming (Section 4.4-4.7 of "Scheming AIs") | 16 Nov 2023 | 00:15:19

This is sections 4.4 through 4.7 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?”

Text of the report here: https://arxiv.org/abs/2311.08379
 
Summary of the report here: https://joecarlsmith.com/2023/11/15/new-report-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power
 
Audio summary here: https://joecarlsmithaudio.buzzsprout.com/2034731/13969977-introduction-and-summary-of-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power

Simplicity arguments for scheming (Section 4.3 of "Scheming AIs") | 16 Nov 2023 | 00:19:37

This is section 4.3 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” 

Text of the report here: https://arxiv.org/abs/2311.08379
 
Summary of the report here: https://joecarlsmith.com/2023/11/15/new-report-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power
 
Audio summary here: https://joecarlsmithaudio.buzzsprout.com/2034731/13969977-introduction-and-summary-of-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power

The counting argument for scheming (Sections 4.1 and 4.2 of "Scheming AIs") | 16 Nov 2023 | 00:10:40

This is sections 4.1 and 4.2 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” 

Text of the report here: https://arxiv.org/abs/2311.08379
 
Summary of the report here: https://joecarlsmith.com/2023/11/15/new-report-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power
 
Audio summary here: https://joecarlsmithaudio.buzzsprout.com/2034731/13969977-introduction-and-summary-of-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power

Arguments for/against scheming that focus on the path SGD takes (Section 3 of "Scheming AIs") | 16 Nov 2023 | 00:29:03

This is section 3 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” 

Text of the report here: https://arxiv.org/abs/2311.08379
 
Summary of the report here: https://joecarlsmith.com/2023/11/15/new-report-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power
 
Audio summary here: https://joecarlsmithaudio.buzzsprout.com/2034731/13969977-introduction-and-summary-of-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power

Non-classic stories about scheming (Section 2.3.2 of "Scheming AIs") | 16 Nov 2023 | 00:24:34

This is section 2.3.2 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” 

Text of the report here: https://arxiv.org/abs/2311.08379
 
Summary of the report here: https://joecarlsmith.com/2023/11/15/new-report-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power
 
Audio summary here: https://joecarlsmithaudio.buzzsprout.com/2034731/13969977-introduction-and-summary-of-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power

Does scheming lead to adequate future empowerment? (Section 2.3.1.2 of "Scheming AIs") | 16 Nov 2023 | 00:22:54

This is section 2.3.1.2 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” 

Text of the report here: https://arxiv.org/abs/2311.08379
 
Summary of the report here: https://joecarlsmith.com/2023/11/15/new-report-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power
 
Audio summary here: https://joecarlsmithaudio.buzzsprout.com/2034731/13969977-introduction-and-summary-of-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power

The goal-guarding hypothesis (Section 2.3.1.1 of "Scheming AIs") | 16 Nov 2023 | 00:19:11

This is section 2.3.1.1 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” 

Text of the report here: https://arxiv.org/abs/2311.08379
 
Summary of the report here: https://joecarlsmith.com/2023/11/15/new-report-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power
 
Audio summary here: https://joecarlsmithaudio.buzzsprout.com/2034731/13969977-introduction-and-summary-of-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power

How useful for alignment-relevant work are AIs with short-term goals? (Section 2.2.4.3 of "Scheming AIs") | 16 Nov 2023 | 00:09:21

This is section 2.2.4.3 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” 

Text of the report here: https://arxiv.org/abs/2311.08379
 
Summary of the report here: https://joecarlsmith.com/2023/11/15/new-report-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power
 
Audio summary here: https://joecarlsmithaudio.buzzsprout.com/2034731/13969977-introduction-and-summary-of-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power

Giving AIs safe motivations | 18 Aug 2025 | 01:23:25

A four-step picture. Text version here: https://joecarlsmith.com/2025/08/18/giving-ais-safe-motivations

Is scheming more likely if you train models to have long-term goals? (Sections 2.2.4.1-2.2.4.2 of "Scheming AIs") | 16 Nov 2023 | 00:09:01

This is sections 2.2.4.1-2.2.4.2 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” 

Text of the report here: https://arxiv.org/abs/2311.08379
 
Summary of the report here: https://joecarlsmith.com/2023/11/15/new-report-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power
 
Audio summary here: https://joecarlsmithaudio.buzzsprout.com/2034731/13969977-introduction-and-summary-of-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power

"Clean" vs. "messy" goal-directedness (Section 2.2.3 of "Scheming AIs") | 16 Nov 2023 | 00:16:44

This is section 2.2.3 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” 

Text of the report here: https://arxiv.org/abs/2311.08379
 
Summary of the report here: https://joecarlsmith.com/2023/11/15/new-report-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power
 
Audio summary here: https://joecarlsmithaudio.buzzsprout.com/2034731/13969977-introduction-and-summary-of-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power

Two sources of beyond-episode goals (Section 2.2.2 of "Scheming AIs") | 16 Nov 2023 | 00:21:25

This is section 2.2.2 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” 

Text of the report here: https://arxiv.org/abs/2311.08379
 
Summary of the report here: https://joecarlsmith.com/2023/11/15/new-report-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power
 
Audio summary here: https://joecarlsmithaudio.buzzsprout.com/2034731/13969977-introduction-and-summary-of-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power

Two concepts of an "episode" (Section 2.2.1 of "Scheming AIs") | 16 Nov 2023 | 00:12:08

This is section 2.2.1 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” 

Text of the report here: https://arxiv.org/abs/2311.08379
 
Summary of the report here: https://joecarlsmith.com/2023/11/15/new-report-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power
 
Audio summary here: https://joecarlsmithaudio.buzzsprout.com/2034731/13969977-introduction-and-summary-of-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power

Situational awareness (Section 2.1 of "Scheming AIs") | 16 Nov 2023 | 00:09:27

This is section 2.1 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” 

Text of the report here: https://arxiv.org/abs/2311.08379
 
Summary of the report here: https://joecarlsmith.com/2023/11/15/new-report-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power
 
Audio summary here: https://joecarlsmithaudio.buzzsprout.com/2034731/13969977-introduction-and-summary-of-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power

On "slack" in training (Section 1.5 of "Scheming AIs") | 16 Nov 2023 | 00:07:12

This is section 1.5 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” 

Text of the report here: https://arxiv.org/abs/2311.08379
 
Summary of the report here: https://joecarlsmith.com/2023/11/15/new-report-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power
 
Audio summary here: https://joecarlsmithaudio.buzzsprout.com/2034731/13969977-introduction-and-summary-of-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power

Why focus on schemers in particular? (Sections 1.3-1.4 of "Scheming AIs") | 16 Nov 2023 | 00:31:17

This is sections 1.3-1.4 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” 

Text of the report here: https://arxiv.org/abs/2311.08379
 
Summary of the report here: https://joecarlsmith.com/2023/11/15/new-report-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power
 
Audio summary here: https://joecarlsmithaudio.buzzsprout.com/2034731/13969977-introduction-and-summary-of-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power

A taxonomy of non-schemer models (Section 1.2 of "Scheming AIs") | 16 Nov 2023 | 00:11:20

This is section 1.2 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” 

Text of the report here: https://arxiv.org/abs/2311.08379
 
Summary of the report here: https://joecarlsmith.com/2023/11/15/new-report-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power
 
Audio summary here: https://joecarlsmithaudio.buzzsprout.com/2034731/13969977-introduction-and-summary-of-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power

Varieties of fake alignment (Section 1.1 of "Scheming AIs") | 16 Nov 2023 | 00:17:54

This is section 1.1 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” 

Text of the report here: https://arxiv.org/abs/2311.08379
 
Summary of the report here: https://joecarlsmith.com/2023/11/15/new-report-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power
 
Audio summary here: https://joecarlsmithaudio.buzzsprout.com/2034731/13969977-introduction-and-summary-of-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power

Full audio for "Scheming AIs: Will AIs fake alignment during training in order to get power?" | 15 Nov 2023 | 06:13:17

This is the full audio for my report "Scheming AIs: Will AIs fake alignment during training in order to get power?"

(I’m also posting audio for individual sections of the report on this podcast, but the ordering was getting messed up on various podcast apps, and I think some people might want one big audio file regardless, so here it is. I’m going to be posting the individual sections one by one, in the right order, over the coming days.)

Full text of the report here: https://arxiv.org/abs/2311.08379
Summary here: https://joecarlsmith.com/2023/11/15/new-report-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power

The stakes of AI moral status | 21 May 2025 | 00:37:29

On seeing and not seeing souls. Text version here: https://joecarlsmith.com/2025/05/21/the-stakes-of-ai-moral-status/

Introduction and summary of "Scheming AIs: Will AIs fake alignment during training in order to get power?" | 14 Nov 2023 | 00:56:32

This is a recording of the introductory section of my report "Scheming AIs: Will AIs fake alignment during training in order to get power?" This section includes a summary of the full report. The summary covers most of the main points and technical terminology, and I'm hoping that it will provide much of the context necessary to understand individual sections of the report on their own. (Note: the text of the report itself may not be public by the time this episode goes live.)

In memory of Louise Glück | 15 Oct 2023 | 00:21:22

"It was, she said, a great discovery, albeit my real life."

On the limits of idealized values | 12 May 2023 | 01:00:14

Contra some meta-ethical views, you can't forever aim to approximate the self you would become in idealized conditions. You have to actively create yourself, often in the here and now. 

Originally published in 2021. Text version here: https://joecarlsmith.com/2021/06/21/on-the-limits-of-idealized-values

Predictable updating about AI risk | 08 May 2023 | 01:03:14

How worried about AI risk will we feel in the future, when we can see advanced machine intelligence up close? We should worry accordingly now. Text version here: https://joecarlsmith.com/2023/05/08/predictable-updating-about-ai-risk 
