🎮 Reinforcement Learning
RL, reward functions, policy gradient, agents, simulation
Scoured 344 posts in 7.6 ms
Q-Learning (Reinforcement learning): Bellman Equation, Markov Decision Processes, Q-Values, and…
🚗 Autonomous Systems · Content type: Blog · medium.com · 1 day ago
Reinforcement Learning and Optimal Control Book (RIP Dimitri Bertsekas)
🚗 Autonomous Systems · Content type: Academic · web.mit.edu · 5 days ago · Hacker News
Geometry-Aware Reinforcement Learning for 2D Irregular Nesting
🧠 AI Agents · Content type: Academic · arxiv.org · 6 hours ago
Agents Need Work Data: A Primer on RLWD, or Reinforcement Learning on Work Data
🧠 AI Agents · anjalishriva.com · 18 hours ago · Hacker News
Reasoning RL in 2026: GRPO, DPO, RLVR, Agentic PO & Beyond
🤖 AI · turingpost.com · 2 days ago
Time-slip in AI sepsis models may inflate results, risking under- or overtreatment
🧠 AI Agents · medicalxpress.com · 4 days ago
Scale Robot Reinforcement Learning with NVIDIA Isaac Lab on Amazon SageMaker AI
🏗️ Infrastructure · Content type: Blog · aws.amazon.com · 14 hours ago
Reinforcement learning in linear embedding space unlocks generalizable control across soft robot configurations
🤖 Robotics · Content type: Academic · nature.com · 2 days ago
Some Interesting Papers on RLVR
🧠 LLMs · lesswrong.com · 15 hours ago
How to Train Your Goblin
🧠 LLMs · goblins.mchen.workers.dev · 2 days ago · Hacker News
Core Automation co-founder Jerry Tworek jokes that Nvidia's CUDA translates to miracles in Polish
🔶 Hacker News · digg.com · 5 days ago
What is MBPO? A Beginner’s Guide to Efficient Reinforcement Learning
⚙️ MLOps · Content type: Blog · ujangriswanto08.medium.com · 5 days ago
Variational Proximal Policy Optimization
📡 Edge Computing · Content type: Academic · arxiv.org · 1 day ago
Towards Shutdownable Agents: Generalizing Stochastic Choice in RL Agents and LLMs
🧠 AI Agents · lesswrong.com · 6 days ago
Deep reinforcement learning for process design: Review and perspective
🧠 AI Agents · Content type: Academic · arxiv.org · 1 day ago
Performance Variation in Deep Reinforcement Learning
🧠 LLMs · Content type: Academic · arxiv.org · 2 days ago
Towards End to End Motion Planning and Execution for Autonomous Underwater Vehicles Using Reinforcement Learning
🤖 Robotics · Content type: Academic · arxiv.org · 1 day ago
Learning to replenish: A hybrid deep reinforcement learning for dynamic inventory management in the pharmaceutical supply chains
🕸️ Distributed Systems · Content type: Academic · arxiv.org · 5 days ago
UNIQ: Conformal Calibration for Adaptive Conservatism in Offline Reinforcement Learning
🕸️ Distributed Systems · Content type: Academic · arxiv.org · 1 day ago
Dynamic Multi-Pair Trading Strategy in Cryptocurrency Markets with Deep Reinforcement Learning
🤖 AI · Content type: Academic · arxiv.org · 6 days ago