Reinforcement Learning


Space-sampled Value Decay: Forgetting Mechanisms for Non-stationary Deep Reinforcement Learning

World Models | Content type: Academic
arxiv.org

Fast and Highly Expressive Policy Learning for Offline Reinforcement Learning via Bootstrapped Flow Q-Learning

World Models | Content type: Academic
arxiv.org

Geometrically Averaged Hard Target Updates for Linear Q-Learning

World Models | Content type: Academic
arxiv.org

TT-DAC-PS: Twin-Target Deterministic Actor-Critic with Policy Smoothing for Optimal Trade Execution

World Models | Content type: Academic
arxiv.org

Retry Policy Gradients in Continuous Action Spaces

Robot Learning | Content type: Academic
arxiv.org

Offline Reinforcement Learning for Plasma Control in Nuclear Fusion: Codebase and Benchmark

Robot Learning | Content type: Academic
arxiv.org

Path Planning Using Deep Deterministic Policy Gradient: A Reinforcement Learning Approach

World Models | Content type: Academic
arxiv.org

Self-evolving LLM agents with in-distribution Optimization

World Models | Content type: Academic
arxiv.org

Semi-Offline Reinforcement Learning for Optimized Text Generation

World Models | Content type: Academic
arxiv.org

Failure Modes of Deep Multi-Agent RL in Asynchronous Pricing: Reproducible Triggers, Trace Diagnostics, and a Partial Fix

World Models | Content type: Academic
arxiv.org

Development of COVID-19 Booster Vaccine Policy by Microsimulation and Q-learning

World Models | Content type: Academic
arxiv.org

Merging model-based control with multi-agent reinforcement learning for multi-agent cooperative teaming strategies

World Models | Content type: Academic
arxiv.org

SHAPO: Sharpness-Aware Policy Optimization for Safe Exploration

World Models | Content type: Academic
arxiv.org

Test-Time Gradient Guidance of Flow Policies in Reinforcement Learning

Robot Learning | Content type: Academic
arxiv.org

On Advantage Estimates for Max@K Policy Gradients

World Models | Content type: Academic
arxiv.org

Learning Predictive Control with Deep Koopman Operators for Autonomous Vehicle Motion Planning

World Models | Content type: Academic
arxiv.org

Drag reduction or reward hacking? Recurrent multi-agent reinforcement learning that earns its reward

World Models | Content type: Academic
arxiv.org

Dmsh: A Multi-Agent Reinforcement Learning Framework for All-Quad Mesh Generation

World Models | Content type: Academic
arxiv.org

Self-Paced Curriculum Reinforcement Learning for Autonomous Superbike Racing in Simulation

World Models | Content type: Academic
arxiv.org

Towards End to End Motion Planning and Execution for Autonomous Underwater Vehicles Using Reinforcement Learning

World Models | Content type: Academic
arxiv.org
