🎯 Reinforcement Learning
RL, reward learning, policy gradient, offline RL
Towards End to End Motion Planning and Execution for Autonomous Underwater Vehicles Using Reinforcement Learning
🌐 World Models · Content type: Academic · arxiv.org · 2 days ago

Structure-Conditioned Actor-Critic Branches for Quality-Diversity Reinforcement Learning
🌐 World Models · Content type: Academic · arxiv.org · 2 days ago

Policy-Conditioned Counterfactual Credit for Verifiable Reinforcement Learning of Long-Horizon Language Agents
🌐 World Models · Content type: Academic · arxiv.org · 6 days ago

Flow-DPPO: Divergence Proximal Policy Optimization for Flow Matching Models
🌐 World Models · Content type: Academic · arxiv.org · 1 day ago

A Barrier-Modulated Architecture for Safe Affine Formation Control in Second-Order Multi-Agent Systems
♠️ Game Theory · Content type: Academic · arxiv.org · 2 days ago

Selective-Advantage Entropy-Adaptive Horizon GRPO: Asymmetric Token-Level Discounting for Efficient Reinforcement Learning of Language Models
🌐 World Models · Content type: Academic · arxiv.org · 6 days ago

SAW: Stage-Aware Dynamic Weighting for Multi-Objective Reinforcement Learning in Large Language Models
🌐 World Models · Content type: Academic · arxiv.org · 2 days ago

Learning to replenish: A hybrid deep reinforcement learning for dynamic inventory management in the pharmaceutical supply chains
🌐 World Models · Content type: Academic · arxiv.org · 6 days ago

Belief-Space Quantum-Inspired Reinforcement Learning for Partially Observable Autonomous Cyber Defense in the Internet of Vehicles
🌐 World Models · Content type: Academic · arxiv.org · 2 days ago

OrderGrad: Optimizing Beyond the Mean with Order-Statistic Policy Gradient Estimation
🌐 World Models · Content type: Academic · arxiv.org · 6 days ago

ARTA: Adaptive Reinforcement-Learning-Based Throttling Agent for RowHammer Vulnerabilities
🌐 World Models · Content type: Academic · arxiv.org · 1 day ago

Representation Learning Enables Scalable Multitask Deep Reinforcement Learning
🌐 World Models · Content type: Academic · arxiv.org · 6 days ago

Mitigating Bias in Low-SNR Financial Reinforcement Learning via Quantum Representations
🌐 World Models · Content type: Academic · arxiv.org · 1 day ago

Discovering Interpretable Multi-Parameter Control Policies for Evolutionary Algorithms Using Deep Reinforcement Learning
🌐 World Models · Content type: Academic · arxiv.org · 1 day ago

Constrained Deep Reinforcement Learning for Cognitive Radar Resource Management
🌐 World Models · Content type: Academic · arxiv.org · 6 days ago

Alpha-RTL: Test-Time Training for RTL Hardware Optimization
🌐 World Models · Content type: Academic · arxiv.org · 6 days ago

SALT: When More Rollouts Don't Help in Group-Based Policy Optimization and How to Make Them Matter
🌐 World Models · Content type: Academic · arxiv.org · 6 days ago

The Impact of Market Informedness on Market Makers' Profitability
🌐 World Models · Content type: Academic · arxiv.org · 6 days ago

SocraticPO: Policy Optimization via Interactive Guidance
🌐 World Models · Content type: Academic · arxiv.org · 1 day ago

Q-VGM: Q-Guided Value-Gradient Matching for Flow-Matching VLA Policies
📄 AI Research · Content type: Academic · arxiv.org · 2 days ago