Reinforcement Learning
RL, Agents, Policy Optimization, Reward Functions
Reasoning RL in 2026: GRPO, DPO, RLVR, Agentic PO & Beyond
🤖 AI · turingpost.com · 3 days ago
Towards End to End Motion Planning and Execution for Autonomous Underwater Vehicles Using Reinforcement Learning
🤖 AI · Content type: Academic · arxiv.org · 1 day ago
Researchers develop AI-powered railway control system for efficient urban train operation
🤖 Machine Learning · techxplore.com · 9 hours ago
Tracing Eval-Awareness Emergence Through Training of OLMo 3
💬 LLM · lesswrong.com · 12 hours ago
Q-Learning (Reinforcement learning): Bellman Equation, Markov Decision Processes, Q-Values, and…
🤖 Machine Learning · Content type: Blog · medium.com · 1 day ago
Reinforcement Learning and Optimal Control Book (RIP Dimitri Bertsekas)
🤖 Machine Learning · Content type: Academic · web.mit.edu · 5 days ago · Hacker News
local AI agents for Cursor with pre-tuned marketplace/commu
🤖 AI · locaible.com · 9 hours ago · Hacker News
Agents Need Work Data: A Primer on RLWD, or Reinforcement Learning on Work Data
💬 LLM · anjalishriva.com · 1 day ago · Hacker News
Good teachers don’t cheat
🤖 Machine Learning · Content type: Blog · jasonkena.github.io · 6 days ago · Hacker News
Scale Robot Reinforcement Learning with NVIDIA Isaac Lab on Amazon SageMaker AI
🔥 PyTorch · Content type: Blog · aws.amazon.com · 1 day ago
AI Agent Mastery & Coaching
💬 LLM · ruv.io · 2 days ago
SimarcLabs/pybullet-swarm-sim: Python framework for simulating drone swarms with PyBullet in seconds.
🤖 AI · Content type: Code · github.com · 3 days ago · r/opensource
Reinforcement learning in linear embedding space unlocks generalizable control across soft robot configurations
🤖 AI · Content type: Academic · nature.com · 2 days ago
See, Act, Correct: three levers for working with a code agent
💬 LLM · Content type: Blog · blog.owulveryck.info · 6 days ago · Hacker News
Researchers trained an open source AI search agent, Harness-1, that outperforms GPT-5.4 on recalling relevant information
💬 LLM · venturebeat.com · 2 days ago · Hacker News
NVIDIA Nemotron 3 Ultra Powers Faster, More Efficient Reasoning for Long-Running Agents
💬 LLM · Content type: Blog · developer.nvidia.com · 6 days ago · Hacker News
Fast and Highly Expressive Policy Learning for Offline Reinforcement Learning via Bootstrapped Flow Q-Learning
🔥 PyTorch · Content type: Academic · arxiv.org · 18 hours ago
Time-slip in AI sepsis models may inflate results, risking under- or overtreatment
🤖 Machine Learning · medicalxpress.com · 5 days ago
Test Your Skills Against an AI Air Hockey Robot
🤖 Machine Learning · Content type: News · hackster.io · 5 days ago
Model predictive task sampling for efficient and robust adaptation
🔥 PyTorch · Content type: Academic · nature.com · 1 day ago