Scour
Reinforcement Learning
RL, Agents, Policy Optimization, Reward Functions
Scoured 469 posts in 7.5 ms
Test-Time Gradient Guidance of Flow Policies in Reinforcement Learning
AI · Content type: Academic · arxiv.org · 21 hours ago
Posting for authoring
Solarpunk · turingpost.com · 3 days ago
Reinforcement learning in linear embedding space unlocks generalizable control across soft robot configurations
AI · Content type: Academic · nature.com · 3 days ago
Discovering Interpretable Multi-Parameter Control Policies for Evolutionary Algorithms Using Deep Reinforcement Learning
Machine Learning · Content type: Academic · arxiv.org · 21 hours ago
SHAPO: Sharpness-Aware Policy Optimization for Safe Exploration
PyTorch · Content type: Academic · arxiv.org · 21 hours ago
Reasoning or Memorization? Direction-Aware Diversity Exploration in LLM Reinforcement Learning
LLM · Content type: Academic · arxiv.org · 21 hours ago
TT-DAC-PS: Twin-Target Deterministic Actor-Critic with Policy Smoothing for Optimal Trade Execution
AI · Content type: Academic · arxiv.org · 1 day ago
Geometrically Averaged Hard Target Updates for Linear Q-Learning
Machine Learning · Content type: Academic · arxiv.org · 21 hours ago
Flow-DPPO: Divergence Proximal Policy Optimization for Flow Matching Models
LLM · Content type: Academic · arxiv.org · 21 hours ago
Performance Variation in Deep Reinforcement Learning
LLM · Content type: Academic · arxiv.org · 2 days ago
Development of COVID-19 Booster Vaccine Policy by Microsimulation and Q-learning
Machine Learning · Content type: Academic · arxiv.org · 21 hours ago
Structure-Conditioned Actor-Critic Branches for Quality-Diversity Reinforcement Learning
AI · Content type: Academic · arxiv.org · 1 day ago
Dmsh: A Multi-Agent Reinforcement Learning Framework for All-Quad Mesh Generation
AI · Content type: Academic · arxiv.org · 21 hours ago
Self-Paced Curriculum Reinforcement Learning for Autonomous Superbike Racing in Simulation
PyTorch · Content type: Academic · arxiv.org · 1 day ago
Event-Driven Reinforcement Learning Enables Long-Horizon Control in Semiconductor Fabrication
AI · Content type: Academic · arxiv.org · 21 hours ago
Reinforcement Learning for Flow-Matching Policies with Density Transport
Machine Learning · Content type: Academic · arxiv.org · 1 day ago
3SPO: State-Score-Supervised Policy Optimization for LLM Agents
LLM · Content type: Academic · arxiv.org · 21 hours ago
Uncertainty-Aware LLM-Guided Policy Shaping for Sparse-Reward Reinforcement Learning
LLM · Content type: Academic · arxiv.org · 2 days ago
Geometry-Aware Reinforcement Learning for 2D Irregular Nesting
AI · Content type: Academic · arxiv.org · 21 hours ago
Deep reinforcement learning for process design: Review and perspective
Machine Learning · Content type: Academic · arxiv.org · 1 day ago