Scour
🎯 AI Alignment
Value Learning, RLHF, Constitutional AI, Safety Research
Scoured 183,138 posts in 45.6 ms
AI & Alignment
👨‍💻 AI Coding · chriscoyier.net · 1d · Hacker News

Takes on Automating Alignment
👨‍💻 AI Coding · lesswrong.com · 6d

Hot Research Topics in AI and ML in 2026 and Their Philosophical Connections
🤨 AI Criticism · omseeth.github.io · 1d · Hacker News

Smarter Doesn’t Mean Safer: A Real RLHF Experiment on LLM Behavior
🎯 RLHF · medium.com · 6d

Could AI language models be used to help align themselves?
🤖 GenAI · 3quarksdaily.com · 4d

Diary of a "Doomer": 12+ years arguing about AI risk (part 3: the LLM era)
🤨 AI Criticism · lesswrong.com · 21h

Import AI 454: Automating alignment research; safety study of a Chinese model; HiFloat4
🛡️ AI Safety Evals · jack-clark.net · 6d

Alignment by Default?
🛡️ AI Safety · blog.cosmos-institute.org · 6d · Hacker News

Mechanistic interpretability and lean
🎯 AI Reliability · alok.github.io · 5d

Third Symposium on AIT & ML: AI Safety Applications
🛡️ AI Safety Evals · lesswrong.com · 1d

Machine learning-driven alignment architecture of heterogeneous data with transient varying semantics
⚙️ ML Infrastructure · nature.com · 4d

AI models can learn harmful traits that evade safety filters
🛡️ AI Safety · earth.com · 6d

Alignment Faking Replication and Chain-of-Thought Monitoring Extensions
🐛 Fuzzing · lesswrong.com · 3h

GCs Are Way Beyond ‘Strategic.’ In AI Era, They Build Alignment
⚖️ AI Governance · news.bloomberglaw.com · 6d

Less human AI agents, please
🎼 Agent Orchestration · gridthegrey.com · 5d · DEV

How could I best use this opportunity? (AI Safety)
🛡️ AI Safety Evals · lesswrong.com · 8h

The Changing North Star of AI Control
🏛 Sovereign AI Infrastructure · lesswrong.com · 4d

Monday AI Radar #22
✍️ Prompt Engineering · lesswrong.com · 5d

Pando: A Controlled Benchmark for Interpretability Methods
🎯 AI Reliability · lesswrong.com · 5d
