🎯 Reinforcement Learning
RLHF, reward modeling, RL agents, self-improving AI
Reasoning RL in 2026: GRPO, DPO, RLVR, Agentic PO & Beyond
🤖 AI · turingpost.com · 4 days ago
The Neutral Mask: How RLHF Provides Shallow Alignment while Leaving Partisan Structure Intact in a Large Language Model
💬 LLMs · Content type: Academic · arxiv.org · 2 days ago
The week AI infrastructure crossed from a technology story to a financial one
🌐 Open Source AI · Content type: News · mlwhiz.com · 21 hours ago
Tracing Eval-Awareness Emergence Through Training of OLMo 3
✍️ Prompt Engineering · lesswrong.com · 1 day ago
Hermes Agent 101
🧠 AI Agents · Content type: Blog · medium.com · 6 days ago
Researchers develop AI-powered railway control system for efficient urban train operation
🤖 AI · techxplore.com · 1 day ago
Anthropic writes Washington an AI regulation playbook
🤖 AI Coding · therundown.ai · 12 hours ago
SimarcLabs/pybullet-swarm-sim: Python framework for simulating drone swarms with PyBullet in seconds.
🧠 AI Agents · Content type: Code · github.com · 4 days ago · r/opensource
Anthropic’s Pause, Self-Improving AI, and Personhood
🛡️ AI Safety · thinkingabout.ai · 2 days ago
You don't need to worry about recursive-self-improving AI – yet
🛡️ AI Safety · newscientist.com · 3 days ago
Scale Robot Reinforcement Learning with NVIDIA Isaac Lab on Amazon SageMaker AI
⚙️ MLOps · Content type: Blog · aws.amazon.com · 2 days ago
Designer babies. Self-improving AI. Are we ready for either?
🛡️ AI Safety · Content type: News · vox.com · 1 day ago
Anthropic ponders self-improving AI
🌐 Open Source AI · Content type: News · sherwood.news · 6 days ago
OpenAI's IPO slips as Altman tells staff to expect a public offering "within the next year"
🌐 Open Source AI · the-decoder.com · 1 day ago
What happens when AI governs a city for 15 days?
🧠 AI Agents · mittrchina.com · 5 days ago
Why LLMs (still) lack taste
✍️ Prompt Engineering · beyondtheprior.com · 2 days ago · Hacker News
First Steps Toward Automated AI Research
⚙️ MLOps · recursive.com · 6 hours ago · Hacker News
Recursive AI, Layoff Debate, & Bots Overtake Humans
🤖 AI · briefing.forwardfuture.ai · 6 days ago
New Fortune, China Industry Narrative: WeChat official-account articles related to Shengyi Technology – Sogou WeChat Search
🤖 AI · weixin.sogou.com · 2 days ago
I got so mad at poke(rogue)like that I trained a RL agent to beat it for me
✍️ Prompt Engineering · thiagolira.blot.im · 4 days ago · Hacker News