Reinforcement Learning

RLCSD: Reinforcement Learning with Contrastive On-Policy Self-Distillation

 Incremental Computation  Content type: Academic
arxiv.org

Path Planning Using Deep Deterministic Policy Gradient: A Reinforcement Learning Approach

 Incremental Computation  Content type: Academic
arxiv.org

APPO: Agentic Procedural Policy Optimization

 Incremental Computation  Content type: Academic
arxiv.org

Uncertainty-Aware LLM-Guided Policy Shaping for Sparse-Reward Reinforcement Learning

 🔍AI Interpretability  Content type: Academic
arxiv.org

CCKS: Consensus-based Communication and Knowledge Sharing

 🌍Distributed Systems  Content type: Academic
arxiv.org

Fantastic Scientific Agents and How to Build Them: AgentBuild for Rietveld Refinement

 Incremental Computation  Content type: Academic
arxiv.org

Self-Paced Curriculum Reinforcement Learning for Autonomous Superbike Racing in Simulation

 🦀Rust  Content type: Academic
arxiv.org

Self-Evolving Scientific Agent Discovers Generalizable Physically-Reasoned Fluid Control

 Incremental Computation  Content type: Academic
arxiv.org

Space-sampled Value Decay: Forgetting Mechanisms for Non-stationary Deep Reinforcement Learning

 Incremental Computation  Content type: Academic
arxiv.org

SocraticPO: Policy Optimization via Interactive Guidance

 Incremental Computation  Content type: Academic
arxiv.org

IAPO: Input Attribution-Aware Policy Optimization for Tool Use in Small Multimodal Agents

 Incremental Computation  Content type: Academic
arxiv.org

An Agency-Transferring Model-Free Policy Enhancement Technique

 🤖Machine Learning  Content type: Academic
arxiv.org

Improving Robotic Generalist Policies via Flow Reversal Steering

 Incremental Computation  Content type: Academic
arxiv.org

Structure-Conditioned Actor-Critic Branches for Quality-Diversity Reinforcement Learning

 Incremental Computation  Content type: Academic
arxiv.org

Geometry-Aware Reinforcement Learning for 2D Irregular Nesting

 Incremental Computation  Content type: Academic
arxiv.org

Self-evolving LLM agents with in-distribution Optimization

 Incremental Computation  Content type: Academic
arxiv.org

Flow-DPPO: Divergence Proximal Policy Optimization for Flow Matching Models

 Incremental Computation  Content type: Academic
arxiv.org

3SPO: State-Score-Supervised Policy Optimization for LLM Agents

 Incremental Computation  Content type: Academic
arxiv.org

UNIQ: Conformal Calibration for Adaptive Conservatism in Offline Reinforcement Learning

 Incremental Computation  Content type: Academic
arxiv.org