🎯 Reinforcement Learning
RL, reward, policy gradient, agent, Q-learning
Scoured 431 posts in 4.9 ms
Representation Learning Enables Scalable Multitask Deep Reinforcement Learning
🔄 Continual Learning
Content type: Academic
arxiv.org · 5 days ago
An Agency-Transferring Model-Free Policy Enhancement Technique
Active Inference
Content type: Academic
arxiv.org · 1 day ago
QnRL: Quantum-Native Reinforcement Learning
Active Inference
Content type: Academic
arxiv.org · 1 day ago
Learning to replenish: A hybrid deep reinforcement learning for dynamic inventory management in the pharmaceutical supply chains
🧬 Evolutionary Computation
Content type: Academic
arxiv.org · 5 days ago
UNIQ: Conformal Calibration for Adaptive Conservatism in Offline Reinforcement Learning
Active Inference
Content type: Academic
arxiv.org · 1 day ago
On Advantage Estimates for Max@K Policy Gradients
Active Inference
Content type: Academic
arxiv.org · 5 days ago
Learning Predictive Control with Deep Koopman Operators for Autonomous Vehicle Motion Planning
⚙️ Computational Mechanics
Content type: Academic
arxiv.org · 1 day ago
Offline Reinforcement Learning for Plasma Control in Nuclear Fusion: Codebase and Benchmark
🔄 Continual Learning
Content type: Academic
arxiv.org · 1 day ago
GARL: Game-Theoretic Reinforcement Learning for Multi-Agent Strategic Prioritisation
🐝 Collective Intelligence
Content type: Academic
arxiv.org · 6 days ago
Cooperative Long Rope Skipping via Multi-Agent Reinforcement Learning
🐝 Collective Intelligence
Content type: Academic
arxiv.org · 1 day ago
Belief-Space Quantum-Inspired Reinforcement Learning for Partially Observable Autonomous Cyber Defense in the Internet of Vehicles
Neuromorphic Computing
Content type: Academic
arxiv.org · 1 day ago
Fog of Love: Engineering Virtuous Agent Behavior with Affinity-based Reinforcement Learning in a Game Environment
🐝 Collective Intelligence
Content type: Academic
arxiv.org · 6 days ago
Reformulate LLM Reinforcement Learning for Efficient Training under Black-box Discrepancy
Active Inference
Content type: Academic
arxiv.org · 1 day ago
Policy-Conditioned Counterfactual Credit for Verifiable Reinforcement Learning of Long-Horizon Language Agents
Active Inference
Content type: Academic
arxiv.org · 5 days ago
Agentic Monte Carlo: Simulating Reinforcement Learning for Black-Box Agents
Active Inference
Content type: Academic
arxiv.org · 5 days ago
COP-Q: Safety-First Reinforcement Learning for Robot Control via Cholesky-Ordered Projection
Developmental Robotics
Content type: Academic
arxiv.org · 6 days ago
RUBAS: Rubric-Based Reinforcement Learning for Agent Safety
⚙️ Computational Mechanics
Content type: Academic
arxiv.org · 6 days ago
Reinforcement Learning from Rich Feedback with Distributional DAgger
Active Inference
Content type: Academic
arxiv.org · 6 days ago
BiasGRPO: Stabilizing Bias Mitigation in High-Variance Reward Landscapes via Group-Relative Policy Optimization
🧬 Evolutionary Computation
Content type: Academic
arxiv.org · 6 days ago
Retry Policy Gradients in Continuous Action Spaces
Active Inference
Content type: Academic
arxiv.org · 5 days ago