Reinforcement Learning
Q-learning, Policy Gradient, Reward Functions, TD Learning
Scoured 395 posts in 8.0 ms
Reinforcement Learning Disrupts Gradient-Based Adversarial Optimization
🤖 Machine Learning · Content type: Academic · arxiv.org · 2 days ago
Stubborn: A Streamlined and Unified Reinforcement Learning Framework for Robust Motion Tracking and Fall Recovery for Humanoids
⚡ Incremental Computation · Content type: Academic · arxiv.org · 1 day ago
Geometrically Averaged Hard Target Updates for Linear Q-Learning
⚡ Incremental Computation · Content type: Academic · arxiv.org · 3 days ago
Improving Generalization and Data Efficiency with Diffusion in Offline Multi-agent RL
🌍 Distributed Systems · Content type: Academic · arxiv.org · 2 days ago
Discovering Interpretable Multi-Parameter Control Policies for Evolutionary Algorithms Using Deep Reinforcement Learning
🤖 Machine Learning · Content type: Academic · arxiv.org · 3 days ago
UniIntervene: Agentic Intervention for Efficient Real-World Reinforcement Learning
⚡ Incremental Computation · Content type: Academic · arxiv.org · 2 days ago
SHAPO: Sharpness-Aware Policy Optimization for Safe Exploration
🔍 AI Interpretability · Content type: Academic · arxiv.org · 3 days ago
KinematicRL: A Sim-to-Real Reinforcement Learning Framework For Social Navigation With Kinodynamic Feasibility
⚡ Incremental Computation · Content type: Academic · arxiv.org · 2 days ago
Demystifying Hidden-State Recurrence: Switchable Latent Reasoning with On-Policy Reinforcement Learning
λ Functional Programming · Content type: Academic · arxiv.org · 1 day ago · Hacker News
Performance Variation in Deep Reinforcement Learning
🤖 Machine Learning · Content type: Academic · arxiv.org · 5 days ago
Keep Policy Gradient in Charge: Sibling-Guided Credit Distillation for Long-Horizon Tool-Use Agents
⚡ Incremental Computation · Content type: Academic · arxiv.org · 1 day ago
Test-Time Gradient Guidance of Flow Policies in Reinforcement Learning
🔍 AI Interpretability · Content type: Academic · arxiv.org · 3 days ago · Cited by 1 article
Phi-Actor-Critic: Steering General-Sum Games to Pareto-Efficient Correlated Equilibria
⚡ Incremental Computation · Content type: Academic · arxiv.org · 2 days ago
Learning to Adapt: Representation-Based Reinforcement Learning for Multi-Task Skill Transfer
⚡ Incremental Computation · Content type: Academic · arxiv.org · 1 day ago
Reasoning or Memorization? Direction-Aware Diversity Exploration in LLM Reinforcement Learning
🔍 AI Interpretability · Content type: Academic · arxiv.org · 3 days ago
Redesigning Regularization for Effective Policy Smoothing
🔍 AI Interpretability · Content type: Academic · arxiv.org · 1 day ago
Reinforcement Learning for Flow-Matching Policies with Density Transport
🤖 Machine Learning · Content type: Academic · arxiv.org · 4 days ago
Critic Architecture Matters: Dual vs. Unified Critics for Humanoid Loco-Manipulation
⚡ Incremental Computation · Content type: Academic · arxiv.org · 2 days ago
PAWS: Preference Learning with Advantage-Weighted Segments
🔍 AI Interpretability · Content type: Academic · arxiv.org · 2 days ago
Event-Driven Reinforcement Learning Enables Long-Horizon Control in Semiconductor Fabrication
🌀 Complexity Science · Content type: Academic · arxiv.org · 3 days ago