🎯 Reinforcement Learning
RL, reward learning, policy gradient, offline RL
Scoured 74 posts in 7.7 ms
UNIQ: Conformal Calibration for Adaptive Conservatism in Offline Reinforcement Learning
🌐 World Models
Content type: Academic
arxiv.org · 2 days ago
Reinforcement Learning and Optimal Control Book (RIP Dimitri Bertsekas)
🌐 World Models
Content type: Academic
web.mit.edu · 6 days ago · Hacker News
Q-Learning (Reinforcement learning): Bellman Equation, Markov Decision Processes, Q-Values, and…
🌐 World Models
Content type: Blog
medium.com · 2 days ago
Hrithik Roshan Signs With Anonymous Content
👁️ VLA Models
Content type: News
deadline.com · 21 hours ago
Reward-learning algorithm hardwired into dopamine circuit
🌐 World Models
Content type: News
thetransmitter.org · 6 days ago
Researchers develop AI-powered railway control system for efficient urban train operation
🌐 World Models
techxplore.com · 1 day ago
A Human-Augmenting Agentic Workflow for Causal Inference
🌐 World Models
Content type: Blog
netflixtechblog.medium.com · 2 days ago
Test Your Skills Against an AI Air Hockey Robot
🦿 Robot Learning
Content type: News
hackster.io · 6 days ago
Scale Robot Reinforcement Learning with NVIDIA Isaac Lab on Amazon SageMaker AI
🦿 Robot Learning
Content type: Blog
aws.amazon.com · 1 day ago
Deterministic Policy Gradient for Learning Equilibrium in Time-Inconsistent Control Problems
🌐 World Models
Content type: Academic
arxiv.org · 9 hours ago
Reasoning RL in 2026: GRPO, DPO, RLVR, Agentic PO & Beyond
🌐 World Models
turingpost.com · 4 days ago
Some Interesting Papers on RLVR
🌐 World Models
lesswrong.com · 1 day ago
AI-powered living business intelligence network
🌐 World Models
atlasforgex.com · 23 hours ago · Hacker News
Phi-Actor-Critic: Steering General-Sum Games to Pareto-Efficient Correlated Equilibria
♟️ Game Theory
Content type: Academic
arxiv.org · 9 hours ago
Repetition on the brain
🧠 Behavioral Economics
Content type: Academic
nature.com · 3 days ago
The New Advantage Emerging in a World That Refuses to Stand Still
🧠 Behavioral Economics
Content type: News
globalbankingandfinance.com · 2 days ago
We Should Take Text Optimization More Seriously
📄 AI Research
Content type: Blog
yoonholee.com · 3 days ago · Hacker News
Reinforcement Learning Disrupts Gradient-Based Adversarial Optimization
🌐 World Models
Content type: Academic
arxiv.org · 9 hours ago
Deep Reinforcement Learning for Adaptive Power Allocation in ISAC Systems with Mobile Target
🌐 World Models
Content type: Academic
arxiv.org · 9 hours ago
Space-sampled Value Decay: Forgetting Mechanisms for Non-stationary Deep Reinforcement Learning
🌐 World Models
Content type: Academic
arxiv.org · 9 hours ago