Reinforcement Learning
Q-Learning, Policy Gradients, OpenAI Gym, Reward Functions
Beyond Uniform Token-Level Trust Region in LLM Reinforcement Learning
Machine learning · Content type: Academic · arxiv.org · 22 hours ago
QnRL: Quantum-Native Reinforcement Learning
Optimization · Content type: Academic · arxiv.org · 1 day ago
Reasoning or Memorization? Direction-Aware Diversity Exploration in LLM Reinforcement Learning
Optimization · Content type: Academic · arxiv.org · 22 hours ago
On-sky demonstration of reinforcement learning for adaptive optics control
Optimization · Content type: Academic · arxiv.org · 22 hours ago
Variational Proximal Policy Optimization
Optimization · Content type: Academic · arxiv.org · 1 day ago
Dmsh: A Multi-Agent Reinforcement Learning Framework for All-Quad Mesh Generation
Optimization · Content type: Academic · arxiv.org · 22 hours ago
Locomotion analysis of a quadruped interacting with the lunar granular surface
Optimization · Content type: Academic · arxiv.org · 22 hours ago
Performance Variation in Deep Reinforcement Learning
Optimization · Content type: Academic · arxiv.org · 2 days ago
ARTA: Adaptive Reinforcement-Learning-Based Throttling Agent for RowHammer Vulnerabilities
Optimization · Content type: Academic · arxiv.org · 22 hours ago
Belief-Space Quantum-Inspired Reinforcement Learning for Partially Observable Autonomous Cyber Defense in the Internet of Vehicles
Bayesian statistics · Content type: Academic · arxiv.org · 1 day ago
Test-Time Gradient Guidance of Flow Policies in Reinforcement Learning
Optimization · Content type: Academic · arxiv.org · 22 hours ago
UNIQ: Conformal Calibration for Adaptive Conservatism in Offline Reinforcement Learning
Optimization · Content type: Academic · arxiv.org · 1 day ago
Uncertainty-Aware Motion Planning for Autonomous Driving in Mixed Traffic Environment
Optimization · Content type: Academic · arxiv.org · 22 hours ago
Uncertainty-Aware LLM-Guided Policy Shaping for Sparse-Reward Reinforcement Learning
Optimization · Content type: Academic · arxiv.org · 2 days ago
MARCH: Model-Assisted Reinforcement Learning for the Perceptive Control of Humanoids over Sparse Footholds
Optimization · Content type: Academic · arxiv.org · 22 hours ago
Deep reinforcement learning for process design: Review and perspective
Optimization · Content type: Academic · arxiv.org · 1 day ago
SHAPO: Sharpness-Aware Policy Optimization for Safe Exploration
Optimization · Content type: Academic · arxiv.org · 22 hours ago
Towards End to End Motion Planning and Execution for Autonomous Underwater Vehicles Using Reinforcement Learning
Optimization · Content type: Academic · arxiv.org · 1 day ago
TRACE: A Unified Rollout Budget Allocation Framework for Efficient Agentic Reinforcement Learning
Optimization · Content type: Academic · arxiv.org · 22 hours ago
Path Planning Using Deep Deterministic Policy Gradient: A Reinforcement Learning Approach
Optimization · Content type: Academic · arxiv.org · 1 day ago