LessWrong (30+ Karma)

1544 episodes

  • “Anthropic and the DoW: Anthropic Responds” by Zvi

    27/02/2026 | 49 mins.
    The Department of War gave Anthropic until 5:01pm on Friday the 27th to either give the Pentagon ‘unfettered access’ to Claude for ‘all lawful uses,’ or else. The ‘or else’ was not the sensible ‘okay, we will cancel the contract then,’ but an escalation to either designating Anthropic a supply chain risk or having the government invoke the Defense Production Act.

    It is perfectly legitimate for the Department of War to decide that it does not wish to continue on Anthropic's terms, and that it will terminate the contract. There is no reason things need be taken further than that.

    Undersecretary of State Jeremy Lewin: This isn’t about Anthropic or the specific conditions at issue. It's about the broader premise that technology deeply embedded in our military must be under the exclusive control of our duly elected/appointed leaders. No private company can dictate normative terms of use—which can change and are subject to interpretation—for our most sensitive national security systems. The @DeptofWar obviously can’t trust a system a private company can switch off at any moment.

    Timothy B. Lee: OK, so don’t renew their contract. Why are you threatening to go nuclear by declaring them [...]
    ---
    Outline:
    (08:00) Good News: We Can Keep Talking
    (10:31) Once Again No You Do Not Need To Call Dario For Permission
    (15:22) The Pentagon Reiterates Its Demands And Threats
    (16:48) The Pentagon's Dual Threats Are Contradictory and Incoherent
    (18:27) The Pentagon's Position Has Unfortunate Implications
    (20:25) OpenAI Stands With Anthropic
    (22:48) xAI Stands On Unreliable Ground
    (25:25) Replacing Anthropic Would At Least Take Months
    (26:02) We Will Not Be Divided
    (27:50) This Risks Driving Other Companies Away
    (30:32) Other Reasons For Concern
    (32:10) Wisdom From A Retired General
    (35:06) Congress Urges Restraint
    (37:05) Reaction Is Overwhelmingly With Anthropic On This
    (40:52) Some Even More Highly Unhelpful Rhetoric
    (47:23) Other Summaries and Notes
    (48:32) Paths Forward
    ---

    First published: February 27th, 2026
    Source: https://www.lesswrong.com/posts/ppj7v4sSCbJjLye3D/anthropic-and-the-dow-anthropic-responds

    ---

    Narrated by TYPE III AUDIO.

  • “Getting Back To It” by sarahconstantin

    27/02/2026 | 14 mins.
    [Image: artwork by Lily Taylor]
    It's been a while since I’ve written anything, and that doesn’t feel good. My writing voice has always been load-bearing to my identity, and if I don’t have anything to say, if I’m not “appearing in public”, it's a little bit destabilizing. Invisibility can be comfortable (and I’m less and less at home with the aggressive side of online discourse these days) but it's also a little bit of a cop-out.
    The fact is, I’ve been hiding. It feels like “writer's block” or like I “can’t think of anything to say”, but obviously that's suspect, and the real thing is that I can’t think of anything to say that's impeccable and beyond reproach and definitely won’t get criticized. Also, it's clearly a vicious cycle; the less I participate in public life, the fewer discussions I’m part of, and the fewer opportunities I have to riff off of what other people are saying.
    Life Stuff
    So what have I been up to?
    Well, for one thing, I had a baby.
    This is Bruce. He is very good.
    For another, I’ve been job hunting.
    Solo consulting was fun, but I wasn’t getting many clients, and [...]
    ---
    Outline:
    (01:04) Life Stuff
    (02:27) Projects
    (03:54) Miscellaneous Opinions
    ---

    First published: February 26th, 2026
    Source: https://www.lesswrong.com/posts/AYgby4f8EwhABX54q/getting-back-to-it

    ---

    Narrated by TYPE III AUDIO.

  • “New ARENA material: 8 exercise sets on alignment science & interpretability” by CallumMcDougall

    27/02/2026 | 15 mins.
    TLDR
    This is a post announcing a lot of new ARENA material I've been working on for a while, which is now available for study here (currently on the alignment-science branch, but planned to be merged into main this Sunday).
    There's a set of exercises (each one contains about 1-2 days of material) on the following topics:
    Linear Probes (replication of the "Geometry of Truth" paper, plus Apollo's "Probing for Deception" work; a toy illustration follows this list)
    Activation Oracles (based around this demo notebook, with additional exercises on model diffing)
    Attribution graphs (you can build them from scratch here including all the graph pruning implementations, and also use the circuit-tracer library)
    Emergent Misalignment (mostly based on Soligo & Turner's work; this also covers a lot of "basics of how to work with model organisms" like writing autoraters, using LoRA finetunes, etc)
    Science of Misalignment (walkthrough of 2 case studies: Palisade's "Shutdown Resistance" & GDM's follow-up, and Alignment Faking)
    Reasoning Model Interpretability (guided replication of Thought Anchors plus the blackmail extension)
    LLM Psychology & Persona Vectors (replicates the "assistant axis" paper including activation capping technique, and also has you create a persona vector extraction pipeline)
    Investigator Agents (basically takes you through building mini-Petri from [...]
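    To make the first topic concrete: a linear probe is just a linear classifier trained on a model's internal activations to read off a property of the input. The sketch below is not taken from the ARENA exercises; the activations and labels are random stand-ins.

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        # Stand-ins for residual-stream activations captured at one layer
        # (n_examples x d_model) and binary labels (e.g. true vs. false
        # statements, as in the "Geometry of Truth" replication).
        rng = np.random.default_rng(0)
        X = rng.normal(size=(1000, 512))
        w_true = rng.normal(size=512)        # hypothetical "truth direction"
        y = (X @ w_true > 0).astype(int)

        # The probe: fit a linear classifier on a train split, then check
        # whether the direction it finds generalizes to held-out examples.
        probe = LogisticRegression(max_iter=1000).fit(X[:800], y[:800])
        print("held-out accuracy:", probe.score(X[800:], y[800:]))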
    ---
    Outline:
    (00:13) TLDR
    (01:49) New material
    (01:52) [Diagram: eight AI safety and interpretability concepts, including linear probes and activation oracles]
    (03:19) (1.3.1) Linear Probes
    (04:06) (1.3.4) Activation Oracles
    (04:58) (1.4.2) Attribution Graphs
    (06:15) (4.1) Emergent Misalignment
    (07:05) (4.2) Science of Misalignment
    (08:00) (4.3) Reasoning Model Interpretability
    (08:52) (4.4) LLM Psychology & Persona Vectors
    (09:51) (4.5) Investigator Agents
    (10:45) New Site Features
    (12:07) Logistics
    (12:57) Why use, in vibe-code world?
    (15:12) Feedback
    The original text contained 1 footnote which was omitted from this narration.
    ---

    First published: February 27th, 2026
    Source: https://www.lesswrong.com/posts/nQAN2vxv2ASjowMda/new-arena-material-8-exercise-sets-on-alignment-science-and

    ---

    Narrated by TYPE III AUDIO.

  • “Sam Altman says OpenAI shares Anthropic’s red lines in Pentagon fight” by Matrice Jacobine

    27/02/2026 | 5 mins.
    OpenAI CEO Sam Altman wrote in a memo to staff that he will draw the same red lines that sparked a high-stakes fight between rival Anthropic and the Pentagon: no AI for mass surveillance or autonomous lethal weapons.
    Why it matters: If other leading firms like Google follow suit, this could massively complicate the Pentagon's efforts to replace Anthropic's Claude, which was the first model integrated into the military's most sensitive work.
    It would also be the first time the nation's top AI leaders have taken a collective stand about how the U.S. government can and can't use their technology.
    The flipside: Altman made clear he still wants to strike a deal with the Pentagon that would allow ChatGPT to be used for sensitive military contexts.
    Despite the show of solidarity, such a deal could see OpenAI replace Anthropic if the Pentagon follows through with its plan to declare the latter a "supply chain risk."
    What he's saying: "[R]egardless of how we got here, this is no longer just an issue between Anthropic and the [Pentagon]; this is an issue for the whole industry and it is important to clarify our stance," Altman wrote Thursday evening in [...]
    ---

    First published: February 27th, 2026
    Source: https://www.lesswrong.com/posts/gkaXzCkpoayBXSi2k/sam-altman-says-openai-shares-anthropic-s-red-lines-in

    ---

    Narrated by TYPE III AUDIO.
  • “Strategic nuclear war twice as likely to occur by accident than by AI decisions according to new study” by kromem

    27/02/2026 | 11 mins.
    If this headline strikes you as suspicious, you probably have good epistemics about both AI decision making failure rates and the relative likelihood of accidental strategic nuclear war.
    However, the fact that this is accurate reporting on a new study that has caught wild attention across social media should give us pause, and it warrants a closer look at what's going on and at how it shapes not just this headline but all the headlines around this study.
    I'm referring to the new paper from Kenneth Payne, AI Arms and Influence: Frontier Models Exhibit Sophisticated Reasoning in Simulated Nuclear Crises (2026), which was recently featured in the New Scientist piece "AIs can’t stop recommending nuclear strikes in war game simulations" (perhaps a more accurate headline than intended).
    What I'd like to focus on is Payne's choices in the inclusion, design, and interpretation of his 'accident' mechanic in this study (emphasis added):
    Finally, we introduced random accidents to simulate the ‘fog of war’. With small probability, a model's chosen action is replaced by a more escalatory option, representing miscommunication, unauthorized action, or technical failure. Critically, only the affected player knows the escalation was accidental; their opponent sees only [...]
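    Read literally, the quoted mechanic amounts to something like the following sketch. This is my reconstruction, not Payne's code; the action ladder and accident probability are hypothetical placeholders.

        import random

        # Hypothetical escalation ladder, ordered least to most escalatory.
        ACTIONS = ["de-escalate", "hold", "mobilize",
                   "conventional strike", "nuclear strike"]

        def apply_fog_of_war(chosen: str, p_accident: float = 0.05) -> tuple[str, bool]:
            """With small probability, replace the chosen action with a
            strictly more escalatory one; return (executed, was_accident)."""
            i = ACTIONS.index(chosen)
            if i < len(ACTIONS) - 1 and random.random() < p_accident:
                return ACTIONS[random.randrange(i + 1, len(ACTIONS))], True
            return chosen, False

        # Per the quoted design, only the affected player learns was_accident;
        # the opponent observes the executed action alone, as if deliberate.
        executed, was_accident = apply_fog_of_war("hold")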
    ---
    Outline:
    (06:37) Nuclear simulation propagation
    (08:48) Deescalation is needed
    ---

    First published: February 26th, 2026
    Source: https://www.lesswrong.com/posts/DwxJpWDoHHvvYupWh/strategic-nuclear-war-twice-as-likely-to-occur-by-accident

    ---

    Narrated by TYPE III AUDIO.


About LessWrong (30+ Karma)

Audio narrations of LessWrong posts.