Open-weight training practices and implications for CoT monitorability
lesswrong.com·1h
Reinforcement Learning
Fragments Nov 3
martinfowler.com·11h
A toy model of corrigibility
lesswrong.com·1d
Incremental Computation
Is it worrying that 95% of AI enterprise projects fail?
seangoedecke.com·1d
A prayer for engaging in conflict
lesswrong.com·4h
Writing
US Govt Whistleblower Guide
lesswrong.com·5h
Homomorphic Encryption
Body Time and Daylight Savings Apologetics
lesswrong.com·1d
Homomorphic Encryption
Weak-To-Strong Generalization
lesswrong.com·2d
Category Theory
To improve Rationality, create Situations
lesswrong.com·20h
Zettelkasten
A glimpse of the other side
lesswrong.com·1d
Zettelkasten
Parleying with the Principled
lesswrong.com·12h
Homomorphic Encryption
High-Resistance Systems to Change: Can a Political Strategy Apply to Personal Change?
lesswrong.com·17h
Incremental Computation
The Mortifying Ordeal of Knowing Thyself
lesswrong.com·7h
Writing
Solving a problem with mindware
lesswrong.com·21h
Zettelkasten
Just complaining about LLM sycophancy (filler episode)
lesswrong.com·15h
Writing