🎯 Alignment Research
AI alignment, RLHF, value alignment, reward modeling
Scoured 75 posts in 23.1 ms
General Preference Reinforcement Learning
♟️ Game Theory · arxiv.org · 2d
AI Fundamentals: World Models for Planning Agents
🕵️ AI Agents · mpmisko.github.io · 3d · Hacker News
InferenceBench: A Benchmark for Open-Ended Inference Optimization by AI Agents
📱 Edge AI Optimization · inferencebench.ai · 4h · Hacker News
Fixing LLM Writing with Distribution Fine Tuning
📋 Text Quality · rosmine.ai · 2d · Hacker News
Training a small model to write better OCaml with RLVR and GRPO
🤖 LLM · blog.nilenso.com · 9h · Hacker News
Self-Improving Reward Models
🤝 Human-AI Collaboration · canvas.inc · 1d · Hacker News
2D map of 26,741M/CV papers from CVPR, NeurIPS, ICML, ICLR (2024–2025)
⭐ Content Scoring · matejgazda.com · 6d · Hacker News
The Safety Paradox: How RLHF Creates the AI Psychosis Problem It’s Meant to Prevent
⚠️ Existential Risk · promptinjection.net · 2d · Hacker News
Silent Semantic Drift -- Inter-Agent Series #3
🧠 Agent Memory · normalphd.substack.com · 14h · Substack
GRLO: Towards Generalizable Reinforcement Learning in Open-Ended Environments from Zero
⚙️ MLOps · arxiv.org · 3d
A lock proves the security of the room and not that the room is empty
🛡️ AI Security · github.com · 2d · Hacker News
How much should we worry about secretly loyal AIs?
🛡️ AI Security · the-substrate.net · 12h · Hacker News
Your Evals Will Break and You Won't See It Coming
🛡️ AI Safety · wanglun1996.github.io · 2d · Hacker News
What Do You Actually Want?
🕵️ AI Agents · dekodiert.de · 4d · Hacker News
LLMs have no structural place for non-knowledge
🪄 Prompt Engineering · terminallogic.substack.com · 2d · Substack
AI Slopification and Writing
🎭 Claude · ordinaryintelligence.substack.com · 6d · Substack
AutoRubric-T2I: Robust Rule-Based Reward Model for Text-to-Image Alignment
✨ Gemini · arxiv.org · 2d
BuffaloTechRider/Autodidact: Self-learning AI agent that gets smarter and cheaper over time. Routes between local and cloud LLMs, learns from every interaction, remembers everything.
✨ Gemini · github.com · 1d · Hacker News
Aether Mind – on-chain neural cognitive engine on a quantum-VQE L1
🤖 AI · huggingface.co · 5d · Hacker News
Introducing the eXo MCP Server: Secure AI Integrations for the Digital Workplace
🔧 Agent Tooling · exoplatform.com · 2d · Hacker News