Introducing Project Telos: Modeling, Measuring, and Intervening on Goal-directed Behavior in AI Systems
lesswrong.com·3d
🎯Reinforcement Learning
Decision theory when you can't make decisions
lesswrong.com·2d
🎯Reinforcement Learning
When Will AI Transform the Economy?
lesswrong.com·6d
🔍AI Interpretability
You think you are in control?
lesswrong.com·7h
🏠Self-Hosting
Secretly Loyal AIs: Threat Vectors and Mitigation Strategies
lesswrong.com·3d
🔢Homomorphic Encryption
Human Values ≠ Goodness
lesswrong.com·1d
🌿Digital Gardens
AISLE discovered three new OpenSSL vulnerabilities
lesswrong.com·4d
🦀Rust
Why Civilizations Are Unstable (And What This Means for AI Alignment)
lesswrong.com·5d
🔍AI Interpretability
Uncertain Updates: October 2025
lesswrong.com·5d
⚡Incremental Computation
Brainstorming 25 Questions I Am Interested In
lesswrong.com·1d
🗃️Zettelkasten
Reason About Intelligence, Not AI
lesswrong.com·1d
🔍AI Interpretability
Seattle Secular Solstice 2025 – Dec 20th
lesswrong.com·2d
🌿Digital Gardens
OpenAI Moves To Complete Potentially The Largest Theft In Human History
lesswrong.com·3d
🌐Open Source
Why you shouldn't write a blog post every day for a month
lesswrong.com·17h
✍Writing
The Memetics of AI Successionism
lesswrong.com·6d
🔍AI Interpretability
Me consuming five different forms of media at once to minimize the chance of a thought occurring
lesswrong.com·1d
🌿Digital Gardens
Halfhaven Digest #3
lesswrong.com·3d
📡RSS
Please Do Not Sell B30A Chips to China
lesswrong.com·5d
🔌Embedded Systems