Scour
🎯 Alignment Research
AI alignment, RLHF, value alignment, reward modeling
186584
posts in
24.1
ms
The Alignment Target Problem: Divergent Moral Judgments of Humans, AI Systems, and Their Designers
🛡️ AI Safety · arxiv.org · 2d
AI & Alignment
🛡️ AI Safety · chriscoyier.net · 5d · Hacker News
An Alignment Journal: Adaptation to AI
🛡️ AI Safety · lesswrong.com · 2d
RLHF Flow-GRPO implementation POC by ifilipis · Pull Request #808
🪝 eBPF · github.com · 2d · r/StableDiffusion
reward-lens: A Mechanistic Interpretability Library for Reward Models
🔍 AI Interpretability · arxiv.org · 21h
Deep Learning Weekly: Issue 453
⚡ Edge AI · deeplearningweekly.com · 10h
Reinforcement fine-tuning with LLM-as-a-judge
🪄 Prompt Engineering · aws.amazon.com · 5h
🥇 Top AI Papers of the Week
🇨🇳 Chinese AI · nlp.elvissaravia.com · 4d
Rewarding the Scientific Process: Process-Level Reward Modeling for Agentic Data Analysis
🔍 AI Interpretability · arxiv.org · 2d
Collective intelligence framework shows how human-AI teams may make better decisions
🤝 Human-AI Collaboration · techxplore.com · 5h
AI Infrastructure Architect · Builder · Author
🇨🇳 Chinese AI · markferraz.com · 6h · Hacker News
Reward Models Are Secretly Value Functions: Temporally Coherent Reward Modeling
🧠 Agent Memory · arxiv.org · 2d
The Inference Economy: Token Use
💭 Reasoning Models · frontierai.substack.com · 7h · Substack
Alignment Makes Models More Decisive Without Making Them More Truthful
🛡️ AI Safety · zenodo.org · 3d · r/singularity
The Human Creativity Benchmark – Evaluating Generative AI in Creative Work
🎭 Claude · contralabs.com · 6h · Hacker News
Three Models of RLHF Annotation: Extension, Evidence, and Authority
⚙️ MLOps · arxiv.org · 1d
Alibaba's Metis agent cuts redundant AI tool calls from 98% to 2% — and gets more accurate doing it
🕵️ AI Agents · venturebeat.com · 4h
New Content From Current Directions in Psychological Science
🌋 Existential Risk Research · psychologicalscience.org · 10h
What Sentences Cause Alignment Faking?
🛡️ AI Safety · lesswrong.com · 2d
Hidden States Know Where Reasoning Diverges: Credit Assignment via Span-Level Wasserstein Distance
🔢 BitNet · arxiv.org · 2d