A toy model of corrigibility
lesswrong.com·4h
Incremental Computation
Economics and Transformative AI (by Tom Cunningham)
lesswrong.com·1d
AI Interpretability
Weak-To-Strong Generalization
lesswrong.com·20h
Category Theory
Reason About Intelligence, Not AI
lesswrong.com·3h
AI Interpretability
I Wondered Why I Procrastinate Even On Things I Am "Passionate" About
lesswrong.com·7h
Digital Gardens
Ohio House Bill 469
lesswrong.com·6h
Homomorphic Encryption
Model welfare and open source
lesswrong.com·20h
Incremental Computation
25 Que
lesswrong.com·10h
Zettelkasten
Human Values ≠ Goodness
lesswrong.com·3h
Digital Gardens
Secretly Loyal AIs: Threat Vectors and Mitigation Strategies
lesswrong.com·1d
Homomorphic Encryption
Brainstorming 25 Questions I Am Interested In
lesswrong.com·10h
Zettelkasten
Introducing Project Telos: Modeling, Measuring, and Intervening on Goal-directed Behavior in AI Systems
lesswrong.com·2d
Reinforcement Learning
2025 Unofficial LW Community Census, Request for Comments
lesswrong.com·17h
Digital Gardens
An intro to the Tensor Economics blog
lesswrong.com·4d
Homomorphic Encryption
You're always stressed, your mind is always busy, you never have enough time
lesswrong.com·1d
Writing
Evidence on language model consciousness
lesswrong.com·1d
AI Interpretability
My YC Pitch
lesswrong.com·12h
Open Source