Doom from a Solution to the Alignment Problem
lesswrong.com·6h
Incremental Computation
LLM Hallucinations: An Internal Tug of War
lesswrong.com·3d
AI Interpretability
Brainstorming 25 Questions I Am Interested In
lesswrong.com·11h
Zettelkasten
[CS 2881r] Can We Prompt Our Way to Safety? Comparing System Prompt Styles and Post-Training Effects on Safety Benchmarks
lesswrong.com·5d
Dependent Types
When Will AI Transform the Economy?
lesswrong.com·5d
AI Interpretability
Human Values ≠ Goodness
lesswrong.com·3h
Digital Gardens
Strategy-Stealing Argument Against AI Dealmaking
lesswrong.com·1d
Reinforcement Learning
Agentic Monitoring for AI Control
lesswrong.com·6d
Observability
Reflections on 4 years of meta-honesty
lesswrong.com·17h
Message Queues
Uncertain Updates: October 2025
lesswrong.com·4d
Incremental Computation
AISLE discovered three new OpenSSL vulnerabilities
lesswrong.com·3d
Rust
Emergent Introspective Awareness in Large Language Models
lesswrong.com·3d
Zettelkasten
Q2 AI Benchmark Results: Pros Maintain Clear Lead
lesswrong.com·5d
AI Interpretability
Why Is Printing So Bad?
lesswrong.com·1d
Writing
The Memetics of AI Successionism
lesswrong.com·5d
AI Interpretability
Halfhaven Digest #3
lesswrong.com·2d
RSS