Explore all episodes of the AI Safety Newsletter podcast
| Title | Date | Duration | |
|---|---|---|---|
| AISN #60: The AI Action Plan | 31 Jul 2025 | 00:15:41 | |
Also: ChatGPT Agent and IMO Gold. In this edition: The Trump Administration publishes its AI Action Plan; OpenAI released ChatGPT Agent and announced that an experimental model achieved gold medal-level performance on the 2025 International Mathematical Olympiad. Listen to the AI Safety Newsletter for free on Spotify or Apple Podcasts. The AI Action Plan On the 23rd, the White House released its AI Action Plan. The document is the outcome of a January executive order that required the President's Science Advisor, ‘AI and Crypto Czar’, and National Security Advisor (currently Michael Kratsios, David Sacks, and Marco Rubio) to submit a plan to “sustain and enhance America's global AI dominance in order to promote human flourishing, economic competitiveness, and national security.” President Trump also delivered an hour-long speech on the plan, and signed three executive orders beginning to implement some of its policies. Trump displaying an executive order at the [...] --- Outline: (00:34) The AI Action Plan (07:36) ChatGPT Agent and IMO Gold (12:48) In Other News --- First published: Source: --- Want more? Check out our ML Safety Newsletter for technical safety research. Narrated by TYPE III AUDIO. | |||
| AISN #59: EU Publishes General-Purpose AI Code of Practice | 15 Jul 2025 | 00:09:23 | |
Plus: Meta Superintelligence Labs. Welcome to the AI Safety Newsletter by the Center for AI Safety. We discuss developments in AI and AI safety. No technical background required. In this edition: The EU published a General-Purpose AI Code of Practice for AI providers, and Meta is spending billions revamping its superintelligence development efforts. EU Publishes General-Purpose AI Code of Practice In June 2024, the EU adopted the AI Act, which remains the world's most significant law regulating AI systems. The Act bans some uses of AI like social scoring and predictive policing and limits other “high risk” uses such as generating credit scores or evaluating educational outcomes. It also regulates general-purpose AI (GPAI) systems, imposing transparency requirements, copyright protection policies, and safety and security standards for models that pose systemic risk (defined as those trained [...] --- Outline: (00:31) EU Publishes General-Purpose AI Code of Practice (04:50) Meta Superintelligence Labs (06:17) In Other News | |||
| AISN #50: AI Action Plan Responses | 31 Mar 2025 | 00:12:25 | |
Plus, Detecting Misbehavior in Reasoning Models. In this newsletter, we cover AI companies’ responses to the federal government's request for information on the development of an AI Action Plan. We also discuss an OpenAI paper on detecting misbehavior in reasoning models by monitoring their chains of thought. On January 23, President Trump signed an executive order giving his administration 180 days to develop an “AI Action Plan” to “enhance America's global AI dominance in order to promote human flourishing, economic competitiveness, and national security.” Despite the rhetoric of the order, the Trump administration has yet to articulate many policy positions with respect to AI development and safety. In a recent interview, Ben Buchanan—Biden's AI advisor—interpreted the executive order as giving the administration time to develop its AI policies. The AI Action Plan will therefore likely [...] | |||
| AISN | 06 Mar 2025 | 00:11:31 | |
Plus, Measuring AI Honesty. In this newsletter, we discuss two recent papers: a policy paper on national security strategy, and a technical paper on measuring honesty in AI systems. Superintelligence Strategy CAIS director Dan Hendrycks, former Google CEO Eric Schmidt, and Scale AI CEO Alexandr Wang have authored a new paper, Superintelligence Strategy. The paper (and an in-depth expert version) argues that the development of superintelligence—AI systems that surpass humans in nearly every domain—is inescapably a matter of national security. In this story, we introduce the paper's three-pronged strategy for national security in the age of advanced AI: deterrence, nonproliferation, and competitiveness. Deterrence The simultaneous power and danger of superintelligence presents [...] --- Outline: (00:20) Superintelligence Strategy (01:09) Deterrence (02:41) Nonproliferation (04:04) Competitiveness (05:33) Measuring AI Honesty (09:24) Links | |||
| Superintelligence Strategy: Expert Version | 05 Mar 2025 | ||
Superintelligence is destabilizing since it threatens other states’ survival—it could be weaponized, or states may lose control of it. Attempts to build superintelligence may face threats by rival states—creating a deterrence regime called Mutual Assured AI Malfunction (MAIM). In this paper, Dan Hendrycks, Eric Schmidt, and Alexandr Wang detail a strategy—focused on deterrence, nonproliferation, and competitiveness—for nations to navigate the risks of superintelligence. | |||
| Superintelligence Strategy: Standard Version | 05 Mar 2025 | ||
Superintelligence is destabilizing since it threatens other states’ survival—it could be weaponized, or states may lose control of it. Attempts to build superintelligence may face threats by rival states—creating a deterrence regime called Mutual Assured AI Malfunction (MAIM). In this paper, Dan Hendrycks, Eric Schmidt, and Alexandr Wang detail a strategy—focused on deterrence, nonproliferation, and competitiveness—for nations to navigate the risks of superintelligence. | |||
| AISN #48: Utility Engineering and EnigmaEval | 18 Feb 2025 | 00:08:56 | |
In this newsletter, we explore two recent papers from CAIS. We’d also like to highlight that CAIS is hiring for editorial and writing roles, including for a new online platform for journalism and analysis regarding AI's impacts on national security, politics, and economics. Utility Engineering A common view is that large language models (LLMs) are highly capable but fundamentally passive tools, shaping their responses based on training data without intrinsic goals or values. However, a new paper from the Center for AI Safety challenges this assumption, showing that LLMs exhibit coherent and structured value systems. Structured preferences emerge with scale. The paper introduces Utility Engineering, a framework for analyzing and controlling AI [...] --- Outline: (00:26) Utility Engineering (04:48) EnigmaEval | |||
| AISN #47: Reasoning Models | 06 Feb 2025 | 00:09:00 | |
Plus, State-Sponsored AI Cyberattacks. Reasoning Models DeepSeek-R1 has been one of the most significant model releases since ChatGPT. After its release, DeepSeek's app quickly rose to the top of Apple's most downloaded chart and NVIDIA saw a 17% stock decline. In this story, we cover DeepSeek-R1, OpenAI's o3-mini and Deep Research, and the policy implications of reasoning models. DeepSeek-R1 is a frontier reasoning model. DeepSeek-R1 builds on the company's previous model, DeepSeek-V3, by adding reasoning capabilities through reinforcement learning training. R1 exhibits frontier-level capabilities in mathematics, coding, and scientific reasoning—comparable to OpenAI's o1. DeepSeek-R1 also scored 9.4% on Humanity's Last Exam—at the time of its release, the highest of any publicly available system. DeepSeek reports spending only about $6 million on the computing power needed to train V3—however, that number doesn’t include the full [...] --- Outline: (00:13) Reasoning Models (04:58) State-Sponsored AI Cyberattacks (06:51) Links | |||
| AISN #46: The Transition | 23 Jan 2025 | 00:11:20 | |
Plus, Humanity's Last Exam, and the AI Safety, Ethics, and Society Course. The Transition The transition from the Biden to Trump administrations saw a flurry of executive activity on AI policy, with Biden signing several last-minute executive orders and Trump revoking Biden's 2023 executive order on AI risk. In this story, we review the state of play. Trump signing first-day executive orders. Source. The AI Diffusion Framework. The final weeks of the Biden Administration saw three major actions related to AI policy. First, the Bureau of Industry and Security released its Framework for Artificial Intelligence Diffusion, which updates the US’ AI-related export controls. The rule establishes three tiers of countries: 1) US allies, 2) most other countries, and 3) arms-embargoed countries.
--- Outline: (00:16) The Transition (04:38) CAIS and Scale AI Introduce Humanity's Last Exam (08:03) AI Safety, Ethics, and Society Course (09:21) Links | |||
| AISN #45: Center for AI Safety 2024 Year in Review | 19 Dec 2024 | 00:11:31 | |
As 2024 draws to a close, we want to thank you for your continued support for AI safety and review what we’ve been able to accomplish. In this special-edition newsletter, we highlight some of our most important projects from the year. The mission of the Center for AI Safety is to reduce societal-scale risks from AI. We focus on three pillars of work: research, field-building, and advocacy. Research CAIS conducts both technical and conceptual research on AI safety. Here are some highlights from our research in 2024: Circuit Breakers. We published breakthrough research showing how circuit breakers can prevent AI models from behaving dangerously by interrupting crime-enabling outputs. In a jailbreaking competition with a prize pool of tens of thousands of dollars, it took twenty thousand attempts to jailbreak a model trained with circuit breakers. The paper was accepted to NeurIPS 2024. The WMDP Benchmark. We developed the Weapons [...] --- Outline: (00:34) Research (04:25) Advocacy (06:44) Field-Building (10:38) Looking Ahead | |||
| AISN #44: The Trump Circle on AI Safety | 19 Nov 2024 | 00:11:22 | |
Plus, Chinese researchers used Llama to create a military tool for the PLA, a Google AI system discovered a zero-day cybersecurity vulnerability, and Complex Systems. The Trump Circle on AI Safety The incoming Trump administration is likely to significantly alter the US government's approach to AI safety. For example, Trump is likely to immediately repeal Biden's Executive Order on AI. However, some of Trump's circle appear to take AI safety seriously. The most prominent AI safety advocate close to Trump is Elon Musk, who earlier this year supported SB 1047. However, he is not alone. Below, we’ve gathered some promising perspectives from other members of Trump's circle and incoming administration. Trump and Musk at UFC 309. Photo Source.
--- Outline: (00:24) The Trump Circle on AI Safety (02:41) Chinese Researchers Used Llama to Create a Military Tool for the PLA (04:14) A Google AI System Discovered a Zero-Day Cybersecurity Vulnerability (05:27) Complex Systems (08:54) Links | |||
| AISN #58: Senate Removes State AI Regulation Moratorium | 03 Jul 2025 | 00:09:04 | |
Plus: Judges Split on Whether Training AI on Copyrighted Material is Fair Use. In this edition: The Senate removes a provision from Republicans' “Big Beautiful Bill” aimed at restricting states from regulating AI; two federal judges split on whether training AI on copyrighted books is fair use. Senate Removes State AI Regulation Moratorium The Senate removed a provision from Republicans' “Big Beautiful Bill” aimed at restricting states from regulating AI. The moratorium would have prohibited states from receiving federal broadband expansion funds if they regulated AI—however, it faced procedural and political challenges in the Senate, and was ultimately removed in a vote of 99-1. Here's what happened. A watered-down moratorium cleared the Byrd Rule. In an attempt to bypass the Byrd Rule, which prohibits policy provisions in budget bills, the Senate Commerce Committee revised the [...] --- Outline: (00:35) Senate Removes State AI Regulation Moratorium (03:04) Judges Split on Whether Training AI on Copyrighted Material is Fair Use (07:19) In Other News | |||
| AISN #43: White House Issues First National Security Memo on AI | 28 Oct 2024 | 00:14:55 | |
Plus, AI and Job Displacement, and AI Takes Over the Nobels. White House Issues First National Security Memo on AI On October 24, 2024, the White House issued the first National Security Memorandum (NSM) on Artificial Intelligence, accompanied by a Framework to Advance AI Governance and Risk Management in National Security. The NSM identifies AI leadership as a national security priority. The memorandum states that competitors have employed economic and technological espionage to steal U.S. AI technology. To maintain a U.S. advantage in AI, the memorandum directs the National Economic Council to assess the U.S.'s competitive position in:
The Intelligence Community must make gathering intelligence on competitors' operations against the [...] --- Outline: (00:18) White House Issues First National Security Memo on AI (03:22) AI and Job Displacement (09:13) AI Takes Over the Nobels | |||
| AISN #42: Newsom Vetoes SB 1047 | 01 Oct 2024 | 00:13:11 | |
Plus, OpenAI's o1, and AI Governance Summary. Newsom Vetoes SB 1047 On Sunday, Governor Newsom vetoed California's Senate Bill 1047 (SB 1047), the most ambitious legislation to date aimed at regulating frontier AI models. The bill, introduced by Senator Scott Wiener and covered in a previous newsletter, would have required AI developers to test frontier models for hazardous capabilities and take steps to mitigate catastrophic risks. (CAIS Action Fund was a co-sponsor of SB 1047.) Newsom states that SB 1047 is not comprehensive enough. In his letter to the California Senate, the governor argued that “SB 1047 does not take into account whether an AI system is deployed in high-risk environments, involves [...] --- Outline: (00:18) Newsom Vetoes SB 1047 (01:55) OpenAI's o1 (06:44) AI Governance (10:32) Links | |||
| AISN #41: The Next Generation of Compute Scale | 11 Sep 2024 | 00:11:59 | |
Plus, Ranking Models by Susceptibility to Jailbreaking, and Machine Ethics. The Next Generation of Compute Scale AI development is on the cusp of a dramatic expansion in compute scale. Recent developments across multiple fronts—from chip manufacturing to power infrastructure—point to a future where AI models may dwarf today's largest systems. In this story, we examine key developments and their implications for the future of AI compute. xAI and Tesla are building massive AI clusters. Elon Musk's xAI has brought its Memphis supercluster—“Colossus”—online. According to Musk, the cluster has 100k Nvidia H100s, making it the largest supercomputer in the world. Moreover, xAI plans to add 50k H200s in the next few months. For comparison, Meta's Llama 3 was trained on 16k H100s. Meanwhile, Tesla's “Gigafactory Texas” is expanding to house an AI supercluster. Tesla's Gigafactory supercomputer [...] --- Outline: (00:18) The Next Generation of Compute Scale (04:36) Ranking Models by Susceptibility to Jailbreaking (06:07) Machine Ethics | |||
| AISN #40: California AI Legislation | 21 Aug 2024 | 00:14:00 | |
Plus, NVIDIA Delays Chip Production, and Do AI Safety Benchmarks Actually Measure Safety? SB 1047, the Most-Discussed California AI Legislation California's Senate Bill 1047 has sparked discussion over AI regulation. While state bills often fly under the radar, SB 1047 has garnered attention due to California's unique position in the tech landscape. If passed, SB 1047 would apply to all companies doing business in the state, potentially setting a precedent for AI governance more broadly. This newsletter examines the current state of the bill, which has been amended several times in response to stakeholder feedback. We'll cover recent debates surrounding the bill, support from AI experts, opposition from the tech industry, and public opinion based on polling. The bill mandates safety protocols, testing procedures, and reporting requirements for covered AI models. The bill was [...] --- Outline: (00:18) SB 1047, the Most-Discussed California AI Legislation (04:38) NVIDIA Delays Chip Production (06:49) Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress? (10:22) Links | |||
| AISN #39: Implications of a Trump Administration for AI Policy | 29 Jul 2024 | 00:12:00 | |
Plus, Safety Engineering Overview. Implications of a Trump administration for AI policy Trump named Ohio Senator J.D. Vance—an AI regulation skeptic—as his pick for vice president. This choice sheds light on the AI policy landscape under a future Trump administration. In this story, we cover: (1) Vance's views on AI policy, (2) views of key players in the administration, such as Trump's party, donors, and allies, and (3) why AI safety should remain bipartisan. Vance has pushed for reducing AI regulations and making AI weights open. At a recent Senate hearing, Vance accused Big Tech companies of overstating risks from AI in order to justify regulations to stifle competition. This led tech policy experts to expect that Vance would favor looser AI regulations. However, Vance has also praised Lina Khan, Chair of the Federal Trade [...] --- Outline: (00:18) Implications of a Trump administration for AI policy (04:57) Safety Engineering (08:49) Links | |||
| AISN #38: Supreme Court Decision Could Limit Federal Ability to Regulate AI | 09 Jul 2024 | 00:10:31 | |
Plus, “Circuit Breakers” for AI systems, and updates on China's AI industry. Listen to the AI Safety Newsletter for free on Spotify or Apple Podcasts. Supreme Court Decision Could Limit Federal Ability to Regulate AI In a recent decision, the Supreme Court overruled the 1984 precedent Chevron v. Natural Resources Defense Council. In this story, we discuss the decision's implications for regulating AI. Chevron allowed agencies to flexibly apply expertise when regulating. The “Chevron doctrine” had required courts to defer to a federal agency's interpretation of a statute when that statute was ambiguous and the agency's interpretation was reasonable. Its elimination curtails federal agencies’ ability to regulate, including, as this article from LawAI explains, their ability to regulate AI. The Chevron doctrine expanded federal agencies’ ability to regulate in at least two ways. First, agencies could draw on their technical expertise to interpret ambiguous statutes [...] --- First published: Source: --- Want more? Check out our ML Safety Newsletter for technical safety research. Narrated by TYPE III AUDIO. | |||
| AISN #37: US Launches Antitrust Investigations | 18 Jun 2024 | 00:11:02 | |
US Launches Antitrust Investigations The U.S. Government has launched antitrust investigations into Nvidia, OpenAI, and Microsoft. The U.S. Department of Justice (DOJ) and Federal Trade Commission (FTC) have agreed to investigate potential antitrust violations by the three companies, the New York Times reported. The DOJ will lead the investigation into Nvidia while the FTC will focus on OpenAI and Microsoft. Antitrust investigations are conducted by government agencies to determine whether companies are engaging in anticompetitive practices that may harm consumers and stifle competition. Nvidia investigated for GPU dominance. The New York Times reports that concerns have been raised about Nvidia's dominance in the GPU market, “including how the company's software locks [...] --- Outline: (00:10) US Launches Antitrust Investigations (02:58) Recent Criticisms of OpenAI and Anthropic (05:40) Situational Awareness (09:14) Links --- First published: Source: --- Want more? Check out our ML Safety Newsletter for technical safety research. Narrated by TYPE III AUDIO. | |||
| AISN #36: Voluntary Commitments are Insufficient | 30 May 2024 | 00:10:09 | |
Voluntary Commitments are Insufficient AI companies agree to RSPs in Seoul. Following the second AI Global Summit held in Seoul, the UK and Republic of Korea governments announced that 16 major technology organizations, including Amazon, Google, Meta, Microsoft, OpenAI, and xAI have agreed to a new set of Frontier AI Safety Commitments. Some commitments from the agreement include:
These commitments [...] --- Outline: (00:03) Voluntary Commitments are Insufficient (02:45) Senate AI Policy Roadmap (05:18) Chapter 1: Overview of Catastrophic Risks (07:56) Links --- First published: Source: --- Want more? Check out our ML Safety Newsletter for technical safety research. Narrated by TYPE III AUDIO. | |||
| AISN #35: Lobbying on AI Regulation | 16 May 2024 | 00:12:09 | |
OpenAI and Google Announce New Multimodal Models In the current paradigm of AI development, there are long delays between the release of successive models. Progress is largely driven by increases in computing power, and training models with more computing power requires building large new data centers. More than a year after the release of GPT-4, OpenAI has yet to release GPT-4.5 or GPT-5, which would presumably be trained on 10x or 100x more compute than GPT-4, respectively. These models might be released over the next year or two, and could represent large spikes in AI capabilities. But OpenAI did announce a new model last week, called GPT-4o. The “o” stands for “omni,” referring to the fact that the model can use text, images, videos [...] --- Outline: (00:03) OpenAI and Google Announce New Multimodal Models (02:36) The Surge in AI Lobbying (05:29) How Should Copyright Law Apply to AI Training Data? (10:10) Links --- First published: Source: --- Want more? Check out our ML Safety Newsletter for technical safety research. Narrated by TYPE III AUDIO. | |||
| AISN #34: New Military AI Systems | 01 May 2024 | 00:17:02 | |
Welcome to the AI Safety Newsletter by the Center for AI Safety. We discuss developments in AI and AI safety. No technical background required. AI Labs Fail to Uphold Safety Commitments to UK AI Safety Institute In November, leading AI labs committed to sharing their models before deployment to be tested by the UK AI Safety Institute. But reporting from Politico shows that these commitments have fallen through. OpenAI, Anthropic, and Meta have all failed to share their models with the UK AISI before deployment. Only Google DeepMind, headquartered in London, has given pre-deployment access to UK AISI. Anthropic released the most powerful publicly available language model, Claude 3, without any window for pre-release testing by the UK AISI. When asked for comment, Anthropic co-founder Jack Clark said, “Pre-deployment testing is a nice idea but very difficult to implement.” When asked about their concerns with pre-deployment testing [...] --- Outline: (00:03) AI Labs Fail to Uphold Safety Commitments to UK AI Safety Institute (02:17) New Bipartisan AI Policy Proposals in the US Senate (06:35) Military AI in Israel and the US (11:44) New Online Course on AI Safety from CAIS (12:38) Links --- First published: Source: --- Want more? Check out our ML Safety Newsletter for technical safety research. Narrated by TYPE III AUDIO. | |||
| AISN #57: The RAISE Act | 17 Jun 2025 | 00:07:12 | |
In this edition: The New York Legislature passes an act regulating frontier AI—but it may not be signed into law for some time. Listen to the AI Safety Newsletter for free on Spotify or Apple Podcasts. The RAISE Act New York may soon become the first state to regulate frontier AI systems. On June 12, the state's legislature passed the Responsible AI Safety and Education (RAISE) Act. If New York Governor Kathy Hochul signs it into law, the RAISE Act will be the most significant state AI legislation in the U.S. New York's RAISE Act imposes four guardrails on frontier labs: developers must publish a safety plan, hold back unreasonably risky models, disclose major incidents, and face penalties for non-compliance.
--- Outline: (00:21) The RAISE Act (04:43) In Other News --- First published: Source: --- Want more? Check out our ML Safety Newsletter for technical safety research. Narrated by TYPE III AUDIO. | |||
| AISN #33: Reassessing AI and Biorisk | 11 Apr 2024 | 00:20:27 | |
Welcome to the AI Safety Newsletter by the Center for AI Safety. We discuss developments in AI and AI safety. No technical background required. This week, we cover:
AI Startups Seek Support From Large Financial Backers As AI development demands ever-increasing compute resources, only well-resourced developers can compete at the frontier. In practice, this means that AI startups must either partner with the world's [...] --- Outline: (00:45) AI Startups Seek Support From Large Financial Backers (03:47) National AI Investments (05:16) Federal Spending on AI (08:35) An Updated Assessment of AI and Biorisk (15:35) $250K in Prizes: SafeBench Competition Announcement (16:08) Links --- First published: Source: --- Want more? Check out our ML Safety Newsletter for technical safety research. Narrated by TYPE III AUDIO. | |||
| AISN #32: Measuring and Reducing Hazardous Knowledge in LLMs | 07 Mar 2024 | 00:17:56 | |
Welcome to the AI Safety Newsletter by the Center for AI Safety. We discuss developments in AI and AI safety. No technical background required. Measuring and Reducing Hazardous Knowledge The recent White House Executive Order on Artificial Intelligence highlights risks of LLMs in facilitating the development of bioweapons, chemical weapons, and cyberweapons. To help measure these dangerous capabilities, CAIS has partnered with Scale AI to create WMDP: the Weapons of Mass Destruction Proxy, an open source benchmark with more than 4,000 multiple choice questions that serve as proxies for hazardous knowledge across biology, chemistry, and cyber. This benchmark not only helps the world understand the relative dual-use capabilities of different LLMs, but it also creates a path forward for model builders to remove harmful information from their models through machine unlearning techniques. Measuring hazardous knowledge in bio, chem, and cyber. Current evaluations of dangerous AI capabilities have [...] --- Outline: (00:03) Measuring and Reducing Hazardous Knowledge (04:35) Language models are getting better at forecasting (07:51) Proposals for Private Regulatory Markets (14:25) Links --- First published: Source: --- Want more? Check out our ML Safety Newsletter for technical safety research. Narrated by TYPE III AUDIO. | |||
| AISN #31: A New AI Policy Bill in California | 21 Feb 2024 | 00:13:24 | |
Welcome to the AI Safety Newsletter by the Center for AI Safety. We discuss developments in AI and AI safety. No technical background required. This week, we’ll discuss:
A New Bill on AI Policy in California Several leading AI companies have public plans for how they’ll invest in safety and security as they develop more dangerous AI systems. A new bill in California's state legislature would codify this practice as a legal requirement, and clarify the legal liability faced by developers [...] --- Outline: (00:33) A New Bill on AI Policy in California (04:38) Precedents for AI Policy: Healthcare and Biosecurity (07:56) Enforcing the EU AI Act (08:55) Links --- First published: Source: --- Want more? Check out our ML Safety Newsletter for technical safety research. Narrated by TYPE III AUDIO. | |||
| AISN #30: Investments in Compute and Military AI | 24 Jan 2024 | 00:11:25 | |
Welcome to the AI Safety Newsletter by the Center for AI Safety. We discuss developments in AI and AI safety. No technical background required. Compute Investments Continue To Grow Pausing AI development has been proposed as a policy for ensuring safety. For example, an open letter last year from the Future of Life Institute called for a six-month pause on training AI systems more powerful than GPT-4. But one interesting fact about frontier AI development is that it comes with natural pauses that can last many months or years. After releasing a frontier model, it takes time for AI developers to construct new compute clusters with larger numbers of more advanced computer chips. The supply of compute is currently unable to keep up with demand, meaning some AI developers cannot buy enough chips for their needs. This explains why OpenAI was reportedly limited by GPUs last year. [...] --- Outline: (00:06) Compute Investments Continue To Grow (03:48) Developments in Military AI (07:19) Japan and Singapore Support AI Safety (08:57) Links --- First published: Source: --- Want more? Check out our ML Safety Newsletter for technical safety research. Narrated by TYPE III AUDIO. | |||
| AISN #29: Progress on the EU AI Act | 04 Jan 2024 | 00:12:14 | |
Welcome to the AI Safety Newsletter by the Center for AI Safety. We discuss developments in AI and AI safety. No technical background required. A Provisional Agreement on the EU AI Act On December 8th, the EU Parliament, Council, and Commission reached a provisional agreement on the EU AI Act. The agreement regulates the deployment of AI in high-risk applications such as hiring and credit pricing, and it bans private companies from building and deploying AI for unacceptable applications such as social credit scoring and individualized predictive policing. Despite lobbying by some AI startups against regulation of foundation models, the agreement contains risk assessment and mitigation requirements for all general purpose AI systems. Specific requirements apply to AI systems trained with >10^25 FLOP such as Google's Gemini and OpenAI's GPT-4. Minimum basic transparency requirements for all GPAI. The provisional agreement regulates foundation models, using the [...] --- Outline: (00:06) A Provisional Agreement on the EU AI Act (04:55) Questions about Research Standards in AI Safety (06:48) The New York Times sues OpenAI and Microsoft for Copyright Infringement (10:34) Links --- First published: Source: --- Want more? Check out our ML Safety Newsletter for technical safety research. Narrated by TYPE III AUDIO. | |||
| The Landscape of US AI Legislation | 29 Dec 2023 | 00:09:57 | |
Welcome to the AI Safety Newsletter by the Center for AI Safety. We discuss developments in AI and AI safety. No technical background required. This week we’re looking closely at AI legislative efforts in the United States, including:
Senator Schumer's AI Insight Forum The CEOs of more than a dozen major AI companies gathered in Washington on Wednesday for a hearing with the Senate. Organized by Democratic Majority Leader Chuck Schumer and a bipartisan group of Senators, this was the first of many hearings in their AI Insight Forum. After the hearing, Senator Schumer said, “I asked everyone in the room, ‘Is government needed to play a role in regulating AI?’ and [...] --- Outline: (00:30) Senator Schumer's AI Insight Forum (01:20) The Blumenthal-Hawley Framework (03:09) Agencies Proposed to Govern Digital Platforms (04:46) Deepfakes and Watermarking Legislation (06:12) State and Local Laws Against AI Surveillance (06:52) National AI Research Resource (NAIRR) (08:18) Links --- First published: Source: --- Want more? Check out our ML Safety Newsletter for technical safety research. Narrated by TYPE III AUDIO. | |||
| AISN #28: Center for AI Safety 2023 Year in Review | 21 Dec 2023 | 00:11:08 | |
As 2023 comes to a close, we want to thank you for your continued support for AI safety. This has been a big year for AI and for the Center for AI Safety. In this special-edition newsletter, we highlight some of our most important projects from the year. Thank you for being part of our community and our work. Center for AI Safety's 2023 Year in Review The Center for AI Safety (CAIS) is on a mission to reduce societal-scale risks from AI. We believe this requires research and regulation. These both need to happen quickly (due to unknown timelines on AI progress) and in tandem (because either one is insufficient on its own). To achieve this, we pursue three pillars of work: research, field-building, and advocacy. Research CAIS conducts both technical and conceptual research on AI safety. We pursue multiple overlapping strategies which can be layered together [...] --- Outline: (00:27) Center for AI Safety's 2023 Year in Review (00:56) Research (03:37) Field-Building (07:35) Advocacy (10:04) Looking Ahead (10:23) Support Our Work --- First published: Source: --- Want more? Check out our ML Safety Newsletter for technical safety research. Narrated by TYPE III AUDIO. | |||
| AISN #27: Defensive Accelerationism | 07 Dec 2023 | 00:12:10 | |
Welcome to the AI Safety Newsletter by the Center for AI Safety. We discuss developments in AI and AI safety. No technical background required. Defensive Accelerationism Vitalik Buterin, the creator of Ethereum, recently wrote an essay on the risks and opportunities of AI and other technologies. He responds to Marc Andreessen's manifesto on techno-optimism and the growth of the effective accelerationism (e/acc) movement, and offers a more nuanced perspective. Technology is often great for humanity, the essay argues, but AI could be an exception to that rule. Rather than giving governments control of AI so they can protect us, Buterin argues that we should build defensive technologies that provide security against catastrophic risks in a decentralized society. Cybersecurity, biosecurity, resilient physical infrastructure, and a robust information ecosystem are some of the technologies Buterin believes we should build to protect ourselves from AI risks. Technology has risks, but [...] --- Outline: (00:06) Defensive Accelerationism (03:55) Retrospective on the OpenAI Board Saga (07:58) Klobuchar and Thune's “light-touch” Senate bill (10:23) Links --- First published: Source: --- Want more? Check out our ML Safety Newsletter for technical safety research. Narrated by TYPE III AUDIO. | |||
| AISN #26: National Institutions for AI Safety | 15 Nov 2023 | 00:12:29 | |
Also, Results From the UK Summit, and New Releases From OpenAI and xAI. Welcome to the AI Safety Newsletter by the Center for AI Safety. We discuss developments in AI and AI safety. No technical background required. This week's key stories include:
UK, US, and Singapore Establish National AI Safety Institutions Before regulating a new technology, governments often need time to gather information and consider their policy options. But during that time, the technology may diffuse through society, making it more difficult for governments to intervene. This process, termed the Collingridge Dilemma, is a fundamental challenge in technology policy. But recently [...] --- Outline: (00:36) UK, US, and Singapore Establish National AI Safety Institutions (03:53) UK Summit Ends with Consensus Statement and Future Commitments (05:39) New Models From xAI, OpenAI, and a New Chinese Startup (09:28) Links --- First published: Source: --- Want more? Check out our ML Safety Newsletter for technical safety research. Narrated by TYPE III AUDIO. | |||
| AISN #25: White House Executive Order on AI, UK AI Safety Summit, and Progress on Voluntary Evaluations of AI Risks. | 31 Oct 2023 | 00:11:37 | |
Welcome to the AI Safety Newsletter by the Center for AI Safety. We discuss developments in AI and AI safety. No technical background required. White House Executive Order on AI While Congress has not voted on significant AI legislation this year, the White House has left their mark on AI policy. In June, they secured voluntary commitments on safety from leading AI companies. Now, the White House has released a new executive order on AI. It addresses a wide range of issues, and specifically targets catastrophic AI risks such as cyberattacks and biological weapons. Companies must disclose large training runs. Under the executive order, companies that intend to train “dual-use foundation models” using significantly more computing power than GPT-4 must take several precautions. First, they must notify the White House before training begins. Then [...] --- Outline: (00:13) White House Executive Order on AI (03:56) Kicking Off The UK AI Safety Summit (06:18) Progress on Voluntary Evaluations of AI Risks (08:52) Links --- First published: Source: --- Want more? Check out our ML Safety Newsletter for technical safety research. Narrated by TYPE III AUDIO. | |||
| AISN #56: Google Releases Veo 3 | 28 May 2025 | 00:08:37 | |
Plus, Opus 4 Demonstrates the Fragility of Voluntary Governance. In this edition: Google released a frontier video generation model at its annual developer conference; Anthropic's Claude Opus 4 demonstrates the danger of relying on voluntary governance. Listen to the AI Safety Newsletter for free on Spotify or Apple Podcasts. Google Releases Veo 3 Last week, Google made several AI announcements at I/O 2025, its annual developer conference. An announcement of particular note is Veo 3, Google's newest video generation model. Frontier video and audio generation. Veo 3 outperforms other models on human preference benchmarks, and generates both audio and video. Google showcasing a video generated with Veo 3. (Source) If you just look at benchmarks, Veo 3 is a substantial improvement over other systems. But relative benchmark improvement only tells part of the story; the absolute capabilities of systems ultimately determine their usefulness. Veo 3 looks like a marked qualitative [...] --- Outline: (00:33) Google Releases Veo 3 (03:25) Opus 4 Demonstrates the Fragility of Voluntary Governance --- First published: Source: --- Want more? Check out our ML Safety Newsletter for technical safety research. Narrated by TYPE III AUDIO. | |||
| AISN #24: Kissinger Urges US-China Cooperation on AI, China’s New AI Law, US Export Controls, International Institutions, and Open Source AI. | 18 Oct 2023 | 00:13:00 | |
Welcome to the AI Safety Newsletter by the Center for AI Safety. We discuss developments in AI and AI safety. No technical background required. China's New AI Law, US Export Controls, and Calls for Bilateral Cooperation China details how AI providers can fulfill their legal obligations. The Chinese government has passed several laws on AI. They’ve regulated recommendation algorithms and taken steps to mitigate the risk of deepfakes. Most recently, they issued a new law governing generative AI. It's less stringent than an earlier draft version, but the law remains more comprehensive in AI regulation than any laws passed in the US, UK, or European Union. The law creates legal obligations for AI providers to respect intellectual property rights, avoid discrimination, and uphold socialist values. But as with many AI policy proposals, these are [...] --- Outline: (00:15) China's New AI Law, US Export Controls, and Calls for Bilateral Cooperation (04:58) Proposed International Institutions for AI (08:15) Open Source AI: Risks and Opportunities (11:25) Links --- First published: Source: --- Want more? Check out our ML Safety Newsletter for technical safety research. Narrated by TYPE III AUDIO. | |||
| AISN #23: New OpenAI Models, News from Anthropic, and Representation Engineering. | 04 Oct 2023 | 00:09:35 | |
Welcome to the AI Safety Newsletter by the Center for AI Safety. We discuss developments in AI and AI safety. No technical background required. OpenAI releases GPT-4 with Vision and DALL·E-3, announces Red Teaming Network GPT-4 with vision and voice. When GPT-4 was initially announced in March, OpenAI demonstrated its ability to process and discuss images such as diagrams or photographs. This feature has now been integrated into GPT-4V. Users can now input images in addition to text, and the model will respond to both. Users can also speak to GPT-4V, and the model will respond verbally. GPT-4V may be more vulnerable to misuse via jailbreaks and adversarial attacks. Previous research has shown that multimodal models, which can process multiple forms of input such as both text and images, are more vulnerable to adversarial attacks than text-only models. GPT-4V's System Card includes some experiments [...] --- Outline: (00:11) OpenAI releases GPT-4 with Vision and DALL·E-3, announces Red Teaming Network (02:39) Writers Guild of America Receives Protections Against AI Automation (03:42) Anthropic receives $1.25B investment from Amazon, and announces several new policies (06:21) Representation Engineering: A Top-Down Approach to AI Transparency (07:57) Links --- First published: Source: --- Want more? Check out our ML Safety Newsletter for technical safety research. Narrated by TYPE III AUDIO. | |||
| AISN #21: Google DeepMind’s GPT-4 Competitor, Military Investments in Autonomous Drones, The UK AI Safety Summit, and Case Studies in AI Policy. | 05 Sep 2023 | 00:09:52 | |
Welcome to the AI Safety Newsletter by the Center for AI Safety. We discuss developments in AI and AI safety. No technical background required. Google DeepMind’s GPT-4 Competitor Computational power is a key driver of AI progress, and a new report suggests that Google’s upcoming GPT-4 competitor will be trained on unprecedented amounts of compute. The model, currently named Gemini, may be trained by the end of this year with 5x more computational power than GPT-4. By the end of next year, the report projects that Google will have the ability to train a model with 20x more compute than GPT-4. For reference, the compute difference between GPT-3 and GPT-4 was 100x. If these projections are true, Google’s new models could create a meaningful spike relative to current AI capabilities. Google’s position [...] --- Outline: (00:14) Google DeepMind’s GPT-4 Competitor (02:41) US Military Invests in Thousands of Autonomous Drones (04:37) United Kingdom Prepares for Global AI Safety Summit (06:15) Case Studies in AI Policy (08:55) Links --- First published: Source: --- Want more? Check out our ML Safety Newsletter for technical safety research. Narrated by TYPE III AUDIO. | |||
| AISN #20: LLM Proliferation, AI Deception, and Continuing Drivers of AI Capabilities. | 29 Aug 2023 | 00:15:37 | |
AI Deception: Examples, Risks, Solutions AI deception is the topic of a new paper from researchers at and affiliated with the Center for AI Safety. It surveys empirical examples of AI deception, then explores societal risks and potential solutions. The paper defines deception as “the systematic production of false beliefs in others as a means to accomplish some outcome other than the truth.” Importantly, this definition doesn't necessarily imply that AIs have beliefs or intentions. Instead, it focuses on patterns of behavior that regularly cause false beliefs and would be considered deceptive if exhibited by humans. Deception by Meta’s CICERO AI. Meta developed the AI system CICERO to play Diplomacy, a game where players build and betray alliances in [...] --- Outline: (00:11) AI Deception: Examples, Risks, Solutions (04:35) Proliferation of Large Language Models (09:25) Continuing Drivers of AI Capabilities (14:30) Links --- First published: Source: --- Want more? Check out our ML Safety Newsletter for technical safety research. Narrated by TYPE III AUDIO. | |||
| [Paper] “An Overview of Catastrophic AI Risks” by Dan Hendrycks, Mantas Mazeika and Thomas Woodside | 21 Aug 2023 | 03:03:29 | |
Rapid advancements in artificial intelligence (AI) have sparked growing concerns among experts, policymakers, and world leaders regarding the potential for increasingly advanced AI systems to pose catastrophic risks. Although numerous risks have been detailed separately, there is a pressing need for a systematic discussion and illustration of the potential dangers to better inform efforts to mitigate them. This paper provides an overview of the main sources of catastrophic AI risks, which we organize into four categories: malicious use, in which individuals or groups intentionally use AIs to cause harm; AI race, in which competitive environments compel actors to deploy unsafe AIs or cede control to AIs; organizational risks, highlighting how human factors and complex systems can increase the chances of catastrophic accidents; and rogue AIs, describing the inherent difficulty [...] --- First published: Source: --- Want more? Check out our ML Safety Newsletter for technical safety research. Narrated by TYPE III AUDIO. | |||
| [Paper] “X-Risk Analysis for AI Research” by Dan Hendrycks and Mantas Mazeika | 21 Aug 2023 | 00:39:44 | |
Artificial intelligence (AI) has the potential to greatly improve society, but as with any powerful technology, it comes with heightened risks and responsibilities. Current AI research lacks a systematic discussion of how to manage long-tail risks from AI systems, including speculative long-term risks. Keeping in mind the potential benefits of AI, there is some concern that building ever more intelligent and powerful AI systems could eventually result in systems that are more powerful than us; some say this is like playing with fire and speculate that this could create existential risks (x-risks). To add precision and ground these discussions, we provide a guide for how to analyze AI x-risk, which consists of three parts: First, we review how systems can be made safer today, drawing on time-tested concepts from hazard analysis and systems safety that have been designed to steer large processes in safer directions. Next, we discuss strategies [...] --- First published: Source: --- Want more? Check out our ML Safety Newsletter for technical safety research. Narrated by TYPE III AUDIO. | |||
| [Paper] “Unsolved Problems in ML Safety” by Dan Hendrycks, Nicholas Carlini, John Schulman and Jacob Steinhardt | 21 Aug 2023 | 00:53:14 | |
Machine learning (ML) systems are rapidly increasing in size, are acquiring new capabilities, and are increasingly deployed in high-stakes settings. As with other powerful technologies, safety for ML should be a leading research priority. In response to emerging safety challenges in ML, such as those introduced by recent large-scale models, we provide a new roadmap for ML Safety and refine the technical problems that the field needs to address. We present four problems ready for research, namely withstanding hazards (“Robustness”), identifying hazards (“Monitoring”), steering ML systems (“Alignment”), and reducing deployment hazards (“Systemic Safety”). Throughout, we clarify each problem’s motivation and provide concrete research directions. --- First published: Source: --- Want more? Check out our ML Safety Newsletter for technical safety research. Narrated by TYPE III AUDIO. | |||
| AISN #19: US-China Competition on AI Chips, Measuring Language Agent Developments, Economic Analysis of Language Model Propaganda, and White House AI Cyber Challenge. | 15 Aug 2023 | 00:10:51 | |
US-China Competition on AI Chips Modern AI systems are trained on advanced computer chips which are designed and fabricated by only a handful of companies in the world. The US and China have been competing for access to these chips for years. Last October, the Biden administration partnered with international allies to severely limit China’s access to leading AI chips. Recently, there have been several interesting developments on AI chips. China has made several efforts to preserve their chip access, including smuggling, buying chips that are just under the legal limit of performance, and investing in their domestic chip industry. Meanwhile, the United States has struggled [...] --- Outline: (00:15) US-China Competition on AI Chips (04:09) Measuring Language Agents Developments (06:07) An Economic Analysis of Language Model Propaganda (08:11) White House Competition Applying AI to Cybersecurity (09:40) Links --- First published: Source: --- Want more? Check out our ML Safety Newsletter for technical safety research. Narrated by TYPE III AUDIO. | |||
| AISN #18: Challenges of Reinforcement Learning from Human Feedback, Microsoft’s Security Breach, and Conceptual Research on AI Safety. | 08 Aug 2023 | 00:11:02 | |
Challenges of Reinforcement Learning from Human Feedback If you’ve used ChatGPT, you might’ve noticed the “thumbs up” and “thumbs down” buttons next to each of its answers. Pressing these buttons provides data that OpenAI uses to improve their models through a technique called reinforcement learning from human feedback (RLHF). RLHF is popular for teaching models about human preferences, but it faces fundamental limitations. Different people have different preferences, but instead of modeling the diversity of human values, RLHF trains models to earn the approval of whoever happens to give feedback. Furthermore, as AI systems become more capable, they can learn to deceive human evaluators into giving undue approval. Here we discuss a new [...] --- Outline: (00:13) Challenges of Reinforcement Learning from Human Feedback (05:26) Microsoft’s Security Breach (06:59) Conceptual Research on AI Safety (09:25) Links --- First published: Source: --- Want more? Check out our ML Safety Newsletter for technical safety research. Narrated by TYPE III AUDIO. | |||
| AISN #17: Automatically Circumventing LLM Guardrails, the Frontier Model Forum, and Senate Hearing on AI Oversight. | 01 Aug 2023 | 00:15:44 | |
Automatically Circumventing LLM Guardrails Large language models (LLMs) can generate hazardous information, such as step-by-step instructions on how to create a pandemic pathogen. To combat the risk of malicious use, companies typically build safety guardrails intended to prevent LLMs from misbehaving. But these safety controls are almost useless against a new attack developed by researchers at Carnegie Mellon University and the Center for AI Safety. By studying the vulnerabilities in open source models such as Meta’s LLaMA 2, the researchers can automatically generate a nearly unlimited supply of “adversarial suffixes,” which are words and characters that cause any model’s safety controls to fail. This discovery calls into question the fundamental limits of safety [...] --- Outline: (00:12) Automatically Circumventing LLM Guardrails (05:40) AI Labs Announce the Frontier Model Forum (07:54) Senate Hearing on AI Oversight (14:42) Links | |||
| AISN #55: Trump Administration Rescinds AI Diffusion Rule, Allows Chip Sales to Gulf States | 20 May 2025 | 00:09:18 | |
Plus, Bills on Whistleblower Protections, Chip Location Verification, and State Preemption. In this edition: The Trump Administration rescinds the Biden-era AI diffusion rule and sells AI chips to the UAE and Saudi Arabia; Federal lawmakers propose legislation on AI whistleblowers, location verification for AI chips, and prohibiting states from regulating AI. Listen to the AI Safety Newsletter for free on Spotify or Apple Podcasts. The Center for AI Safety is also excited to announce the Summer session of our AI Safety, Ethics, and Society course, running from June 23 to September 14. The course, based on our recently published textbook, is open to participants from all disciplines and countries, and is designed to accommodate full-time work or study. Applications for the Summer 2025 course are now open. The final application deadline is May 30th. Visit the course website to learn more and apply. Trump Administration Rescinds AI Diffusion [...] --- Outline: (01:12) Trump Administration Rescinds AI Diffusion Rule, Allows Chip Sales to Gulf States (04:14) Bills on Whistleblower Protections, Chip Location Verification, and State Preemption (06:56) In Other News | |||
| AISN #16: White House Secures Voluntary Commitments from Leading AI Labs, and Lessons from Oppenheimer. | 25 Jul 2023 | 00:12:02 | |
White House Unveils Voluntary Commitments to AI Safety from Leading AI Labs Last Friday, the White House announced a series of voluntary commitments from seven of the world's premier AI labs. Amazon, Anthropic, Google, Inflection, Meta, Microsoft, and OpenAI pledged to uphold these commitments, which are non-binding and pertain only to forthcoming "frontier models" superior to currently available AI systems. The White House also notes that the Biden-Harris Administration is developing an executive order alongside these voluntary commitments. The commitments are timely and technically well-informed, demonstrating the ability of federal policymakers to respond capably and quickly to AI risks. The Center for AI Safety supports these commitments as a precedent for cooperation on AI [...] --- Outline: (00:11) White House Unveils Voluntary Commitments to AI Safety from Leading AI Labs (05:05) Lessons from Oppenheimer (10:38) Links | |||
| AISN #15: China and the US take action to regulate AI, results from a tournament forecasting AI risk, updates on xAI’s plan, and Meta releases its open-source and commercially available Llama 2. | 19 Jul 2023 | 00:12:09 | |
Both China and the US take action to regulate AI Last week, regulators in both China and the US took aim at generative AI services. These actions show that China and the US are both concerned with AI safety. Hopefully, this is a sign they can eventually coordinate. China’s new generative AI rules On Thursday, China’s government released new rules governing generative AI. China’s new rules, which are set to take effect on August 15th, regulate publicly-available generative AI services. The providers of such services will be criminally liable for the content their services generate. The rules specify illegal [...] --- Outline: (00:17) Both China and the US take action to regulate AI (00:36) China’s new generative AI rules (03:15) The FTC investigates OpenAI (05:01) Results from a tournament forecasting AI risk (08:18) Updates on xAI’s plan (09:05) Meta releases Llama 2, open-source and commercially available | |||
| AISN #14: OpenAI’s ‘Superalignment’ team, Musk’s xAI launches, and developments in military AI use. | 12 Jul 2023 | 00:09:07 | |
OpenAI announces a ‘superalignment’ team On July 5th, OpenAI announced the ‘Superalignment’ team: a new research team given the goal of aligning superintelligence, and armed with 20% of OpenAI’s compute. In this story, we’ll explain and discuss the team’s strategy. What is superintelligence? In their announcement, OpenAI distinguishes between ‘artificial general intelligence’ and ‘superintelligence.’ Briefly, ‘artificial general intelligence’ (AGI) is about breadth of performance. Generally intelligent systems perform well on a wide range of cognitive tasks. For example, humans are in many senses generally intelligent: we can learn how to drive a car, take a derivative, or play piano, even though evolution didn’t train us for those tasks. A superintelligent system would not only be [...] --- Outline: (00:11) OpenAI announces a ‘superalignment’ team (03:50) Musk launches xAI (05:12) Developments in Military AI Use | |||
| AISN #13: An interdisciplinary perspective on AI proxy failures, new competitors to ChatGPT, and prompting language models to misbehave. | 05 Jul 2023 | 00:17:34 | |
Interdisciplinary Perspective on AI Proxy Failures In this story, we discuss a recent paper on why proxy goals fail. First, we introduce proxy gaming, and then summarize the paper’s findings. Proxy gaming is a well-documented failure mode in AI safety. For example, social media platforms use AI systems to recommend content to users. These systems are sometimes built to maximize the amount of time a user spends on the platform. The idea is that the time the user spends on the platform approximates the quality of the content being recommended. However, a user might spend even more time on a platform because they’re responding to an enraging post or interacting [...] --- Outline: (00:13) Interdisciplinary Perspective on AI Proxy Failures (06:06) A Flurry of AI Fundraising and Model Releases (12:53) Adversarial Inputs Make Chatbots Misbehave (15:52) Links | |||