Introducing Project Telos: Modeling, Measuring, and Intervening on Goal-directed Behavior in AI Systems
lesswrong.com · 2d
Reinforcement Learning
Halfhaven Digest #3
lesswrong.com · 2d
RSS
Reflections on 4 years of meta-honesty
lesswrong.com · 17h
Message Queues
Summary and Comments on Anthropic's Pilot Sabotage Risk Report
lesswrong.com · 3d
Observability
Freewriting in my head, and overcoming the “twinge of starting”
lesswrong.com · 1d
Zettelkasten
Uncertain Updates: October 2025
lesswrong.com · 4d
Incremental Computation
Me consuming five different forms of media at once to minimize the chance of a thought occurring
lesswrong.com · 14h
Digital Gardens
RSS feeds discovery strategies
RSS
FTL travel and scientific realism
lesswrong.com · 16h
Observability
Reason About Intelligence, Not AI
lesswrong.com · 3h
AI Interpretability
Asking Paul Fussell for Writing Advice
lesswrong.com · 1d
Category Theory
Ink without haven
lesswrong.com · 2d
Writing
2025 Unofficial LW Community Census, Request for Comments
lesswrong.com · 17h
Digital Gardens
Seattle Secular Solstice 2025 – Dec 20th
lesswrong.com · 1d
Digital Gardens
Ilya Sutskever Deposition Transcript
lesswrong.com · 1h
Writing
Strategy-Stealing Argument Against AI Dealmaking
lesswrong.com · 1d
Reinforcement Learning
OpenAI Moves To Complete Potentially The Largest Theft In Human History
lesswrong.com · 2d
Open Source
Evidence on language model consciousness
lesswrong.com · 1d
AI Interpretability