Introducing Project Telos: Modeling, Measuring, and Intervening on Goal-directed Behavior in AI Systems
lesswrong.com·3d
🎯Reinforcement Learning
Just complaining about LLM sycophancy (filler episode)
lesswrong.com·2h
✍Writing
Reason About Intelligence, Not AI
lesswrong.com·1d
🔍AI Interpretability
FTL travel and scientific realism
lesswrong.com·1d
👁️Observability
You’re always stressed, your mind is always busy, you never have enough time
lesswrong.com·2d
✍Writing
Seattle Secular Solstice 2025 – Dec 20th
lesswrong.com·2d
🌿Digital Gardens
LLM Hallucinations: An Internal Tug of War
lesswrong.com·4d
🔍AI Interpretability
My YC Pitch
lesswrong.com·1d
Open Source
On The Conservation of Rights
lesswrong.com·4d
🦀Rust
Reflections on 4 years of meta-honesty
lesswrong.com·1d
📮Message Queues
Why Civilizations Are Unstable (And What This Means for AI Alignment)
lesswrong.com·5d
🔍AI Interpretability
Halfhaven Digest #3
lesswrong.com·3d
📡RSS
Strategy-Stealing Argument Against AI Dealmaking
lesswrong.com·2d
🎯Reinforcement Learning