Introducing Project Telos: Modeling, Measuring, and Intervening on Goal-directed Behavior in AI Systems
lesswrong.com·3d
Reinforcement Learning
Reflections on 4 years of meta-honesty
lesswrong.com·1d
Message Queues
Vaccination against ASI
lesswrong.com·1d
Message Queues
Halfhaven Digest #3
lesswrong.com·2d
RSS
Why I Transitioned: A Case Study
lesswrong.com·1d
Category Theory
LLM Hallucinations: An Internal Tug of War
lesswrong.com·4d
AI Interpretability
Asking Paul Fussell for Writing Advice
lesswrong.com·2d
Category Theory
Seattle Secular Solstice 2025 - Dec 20th
lesswrong.com·1d
Digital Gardens
AISLE discovered three new OpenSSL vulnerabilities
lesswrong.com·3d
Rust
Model Parameters as a Steganographic Private Channel
lesswrong.com·6d
Homomorphic Encryption
Why do AI models use so many em-dashes?
seangoedecke.com·4d
Is it worrying that 95% of AI enterprise projects fail?
seangoedecke.com·10h
A Bayesian Explanation of Causal Models
lesswrong.com·6d
Dependent Types
Centralization begets stagnation
lesswrong.com·3d
Distributed Systems
Temporarily Losing My Ego
lesswrong.com·5d
Self-Hosting
Anthropic's Pilot Sabotage Risk Report
lesswrong.com·3d
Observability