🎮 Reinforcement Learning
RL, AI Agents, Game Playing, Policy Optimization
Scoured 381 posts in 7.0 ms
Performance Variation in Deep Reinforcement Learning
🎓 RLHF · Content type: Academic · arxiv.org · 3 days ago
How to Implement a Model-Free RL Algorithm: A Step-by-Step Guide
🎓 RLHF · Content type: Blog · ujangriswanto08.medium.com · 35 minutes ago
Q-Learning (Reinforcement learning): Bellman Equation, Markov Decision Processes, Q-Values, and…
🎓 RLHF · Content type: Blog · medium.com · 2 days ago
Researchers develop AI-powered railway control system for efficient urban train operation
🤖 Agent · techxplore.com · 16 hours ago
Agents Need Work Data: A Primer on RLWD, or Reinforcement Learning on Work Data
🤖 Agent · anjalishriva.com · 1 day ago · Hacker News
Reasoning RL in 2026: GRPO, DPO, RLVR, Agentic PO & Beyond
🎓 RLHF · turingpost.com · 3 days ago
Propel: Breaking the Solver Bottleneck in Task-Generator RL
✍️ Prompt Engineering · vmax.ai · 7 hours ago · Hacker News
Some Interesting Papers on RLVR
🎓 RLHF · lesswrong.com · 1 day ago
SimarcLabs/pybullet-swarm-sim: Python framework for simulating drone swarms with PyBullet in seconds.
🐍 Python · Content type: Code · github.com · 3 days ago · r/opensource
Reinforcement-learning signals support dynamic adaptive control during language switching
🎓 RLHF · Content type: Academic · nature.com · 1 day ago
Import AI 460: Reward hacking society, RSI data from Anthropic; and RL-based quadcopter racing
🤖 Agent · Content type: News, Blog · importai.substack.com · 2 days ago · Substack
Reinforcement Learning and Optimal Control Book (RIP Dimitri Bertsekas)
🎓 RLHF · Content type: Academic · web.mit.edu · 5 days ago · Hacker News
Scale Robot Reinforcement Learning with NVIDIA Isaac Lab on Amazon SageMaker AI
⚓ Kubernetes · Content type: Blog · aws.amazon.com · 1 day ago
NVIDIA Nemotron 3 Ultra Powers Faster, More Efficient Reasoning for Long-Running Agents
🎯 Fine-tuning · Content type: Blog · developer.nvidia.com · 6 days ago · Hacker News
Researchers trained an open source AI search agent, Harness-1, that outperforms GPT-5.4 on recalling relevant information
💭 Context Management · venturebeat.com · 2 days ago · Hacker News
See, Act, Correct: three levers for working with a code agent
🤖 Agent · Content type: Blog · blog.owulveryck.info · 6 days ago · Hacker News
DQN Tutorial - RL Summer School 2026
✍️ Prompt Engineering · araffin.github.io · 1 day ago
AI Agent Mastery & Coaching
🤖 Agent · ruv.io · 3 days ago
A Human-Augmenting Agentic Workflow for Causal Inference
🤖 Agent · Content type: Blog · netflixtechblog.medium.com · 2 days ago
Agentic RL: Token-In, Token-Out Done Right
🤖 LLM · qgallouedec-tito.hf.space · 1 day ago · Hacker News