🎮 Reinforcement Learning
RLHF, Reward Models, Policy, Agents
Scoured 453 posts in 24.1 ms
Towards End to End Motion Planning and Execution for Autonomous Underwater Vehicles Using Reinforcement Learning
🤖 Robotics · Content type: Academic · arxiv.org · 1 day ago
Reasoning RL in 2026: GRPO, DPO, RLVR, Agentic PO & Beyond
🤖 AI · turingpost.com · 3 days ago
Researchers develop AI-powered railway control system for efficient urban train operation
🤖 Robotics · techxplore.com · 13 hours ago
Agents Need Work Data: A Primer on RLWD, or Reinforcement Learning on Work Data
🕵️ LLM Agents · anjalishriva.com · 1 day ago · Hacker News
Reinforcement Learning and Optimal Control Book (RIP Dimitri Bertsekas)
📐 Math · Content type: Academic · web.mit.edu · 5 days ago · Hacker News
Q-Learning (Reinforcement learning): Bellman Equation, Markov Decision Processes, Q-Values, and…
🎲 Stochastic Processes · Content type: Blog · medium.com · 2 days ago
SimarcLabs/pybullet-swarm-sim: Python framework for simulating drone swarms with PyBullet in seconds.
📊 Data Visualization · Content type: Code · github.com · 3 days ago · r/opensource
Fast and Highly Expressive Policy Learning for Offline Reinforcement Learning via Bootstrapped Flow Q-Learning
🔥 PyTorch · Content type: Academic · arxiv.org · 22 hours ago
See, Act, Correct: three levers for working with a code agent
🤖 AI · Content type: Blog · blog.owulveryck.info · 6 days ago · Hacker News
Flow-DPPO: Divergence Proximal Policy Optimization for Flow Matching Models
📐 Optimization Theory · Content type: Academic · arxiv.org · 22 hours ago
Discovering Interpretable Multi-Parameter Control Policies for Evolutionary Algorithms Using Deep Reinforcement Learning
🤖 AI · Content type: Academic · arxiv.org · 22 hours ago
Performance Variation in Deep Reinforcement Learning
🧠 LLM · Content type: Academic · arxiv.org · 2 days ago
Development of COVID-19 Booster Vaccine Policy by Microsimulation and Q-learning
📐 Semidefinite Programming · Content type: Academic · arxiv.org · 22 hours ago
A Unifying Lens on Reward Uncertainty in RLHF
🤖 AI · Content type: Academic · arxiv.org · 1 day ago
Event-Driven Reinforcement Learning Enables Long-Horizon Control in Semiconductor Fabrication
📐 Optimization Theory · Content type: Academic · arxiv.org · 22 hours ago
Deep reinforcement learning for process design: Review and perspective
🕵️ LLM Agents · Content type: Academic · arxiv.org · 1 day ago
Test-Time Gradient Guidance of Flow Policies in Reinforcement Learning
🤖 Robotics · Content type: Academic · arxiv.org · 22 hours ago
Self-Paced Curriculum Reinforcement Learning for Autonomous Superbike Racing in Simulation
🎛️ Control Systems · Content type: Academic · arxiv.org · 1 day ago
Geometrically Averaged Hard Target Updates for Linear Q-Learning
📐 Optimization Theory · Content type: Academic · arxiv.org · 22 hours ago
Merging model-based control with multi-agent reinforcement learning for multi-agent cooperative teaming strategies
🕵️ LLM Agents · Content type: Academic · arxiv.org · 5 days ago