Reinforcement Learning


Beyond Uniform Token-Level Trust Region in LLM Reinforcement Learning

Machine learning | Content type: Academic
arxiv.org

QnRL: Quantum-Native Reinforcement Learning

Optimization | Content type: Academic
arxiv.org

Reasoning or Memorization? Direction-Aware Diversity Exploration in LLM Reinforcement Learning

Optimization | Content type: Academic
arxiv.org

On-sky demonstration of reinforcement learning for adaptive optics control

Optimization | Content type: Academic
arxiv.org

Variational Proximal Policy Optimization

Optimization | Content type: Academic
arxiv.org

Dmsh: A Multi-Agent Reinforcement Learning Framework for All-Quad Mesh Generation

Optimization | Content type: Academic
arxiv.org

Locomotion analysis of a quadruped interacting with the lunar granular surface

Optimization | Content type: Academic
arxiv.org

Performance Variation in Deep Reinforcement Learning

Optimization | Content type: Academic
arxiv.org

ARTA: Adaptive Reinforcement-Learning-Based Throttling Agent for RowHammer Vulnerabilities

Optimization | Content type: Academic
arxiv.org

Belief-Space Quantum-Inspired Reinforcement Learning for Partially Observable Autonomous Cyber Defense in the Internet of Vehicles

Bayesian statistics | Content type: Academic
arxiv.org

Test-Time Gradient Guidance of Flow Policies in Reinforcement Learning

Optimization | Content type: Academic
arxiv.org

UNIQ: Conformal Calibration for Adaptive Conservatism in Offline Reinforcement Learning

Optimization | Content type: Academic
arxiv.org

Uncertainty-Aware Motion Planning for Autonomous Driving in Mixed Traffic Environment

Optimization | Content type: Academic
arxiv.org

Uncertainty-Aware LLM-Guided Policy Shaping for Sparse-Reward Reinforcement Learning

Optimization | Content type: Academic
arxiv.org

MARCH: Model-Assisted Reinforcement Learning for the Perceptive Control of Humanoids over Sparse Footholds

Optimization | Content type: Academic
arxiv.org

Deep reinforcement learning for process design: Review and perspective

Optimization | Content type: Academic
arxiv.org

SHAPO: Sharpness-Aware Policy Optimization for Safe Exploration

Optimization | Content type: Academic
arxiv.org

Towards End to End Motion Planning and Execution for Autonomous Underwater Vehicles Using Reinforcement Learning

Optimization | Content type: Academic
arxiv.org

TRACE: A Unified Rollout Budget Allocation Framework for Efficient Agentic Reinforcement Learning

Optimization | Content type: Academic
arxiv.org

Path Planning Using Deep Deterministic Policy Gradient: A Reinforcement Learning Approach

Optimization | Content type: Academic
arxiv.org