Reason About Intelligence, Not AI
lesswrong.com·3h
🔍AI Interpretability
Weak-To-Strong Generalization
lesswrong.com·20h
∘Category Theory
2025 Unofficial LW Community Census, Request for Comments
lesswrong.com·17h
🌿Digital Gardens
Evidence on language model consciousness
lesswrong.com·1d
🔍AI Interpretability
Brainstorming 25 Questions I Am Interested In
lesswrong.com·10h
🗃️Zettelkasten
Human Values ≠ Goodness
lesswrong.com·3h
🌿Digital Gardens
Introducing Project Telos: Modeling, Measuring, and Intervening on Goal-directed Behavior in AI Systems
lesswrong.com·2d
🎯Reinforcement Learning
My YC Pitch
lesswrong.com·12h
🌐Open Source
Economics and Transformative AI (by Tom Cunningham)
lesswrong.com·1d
🔍AI Interpretability
A toy model of corrigibility
lesswrong.com·4h
⚡Incremental Computation
LLM Hallucinations: An Internal Tug of War
lesswrong.com·3d
🔍AI Interpretability
I Wondered Why I Procrastinate Even On Things I Am "Passionate" About
lesswrong.com·7h
🌿Digital Gardens
Secretly Loyal AIs: Threat Vectors and Mitigation Strategies
lesswrong.com·1d
🔢Homomorphic Encryption
Doom from a Solution to the Alignment Problem
lesswrong.com·6h
⚡Incremental Computation
Freewriting in my head, and overcoming the “twinge of starting”
lesswrong.com·1d
🗃️Zettelkasten
Vaccination against ASI
lesswrong.com·1d
📮Message Queues