🎯 Reinforcement Learning
Q-learning, Policy Gradient, Reward Functions, TD Learning
From Ticks to Flows: Dynamics of Neural Reinforcement Learning in Continuous Environments
💬 Prompt Engineering · Academic · arxiv.org · 6 days ago
HARBOR: A Harness Framework for Agentic Robot Reinforcement Learning
💬 Prompt Engineering · Academic · arxiv.org · 1 day ago
3SPO: State-Score-Supervised Policy Optimization for LLM Agents
🗣️ LLMs · Academic · arxiv.org · 18 hours ago
Self-Distilled Policy Gradient
💬 Prompt Engineering · Academic · arxiv.org · 6 days ago
ARTA: Adaptive Reinforcement-Learning-Based Throttling Agent for RowHammer Vulnerabilities
💬 Prompt Engineering · Academic · arxiv.org · 18 hours ago
Self-Paced Curriculum Reinforcement Learning for Autonomous Superbike Racing in Simulation
💬 Prompt Engineering · Academic · arxiv.org · 1 day ago
SHAPO: Sharpness-Aware Policy Optimization for Safe Exploration
💬 Prompt Engineering · Academic · arxiv.org · 18 hours ago
On Advantage Estimates for Max@K Policy Gradients
🗣️ LLMs · Academic · arxiv.org · 5 days ago
Claw-R1: A Step-Level Data Middleware System for Agentic Reinforcement Learning
🗣️ LLMs · Academic · arxiv.org · 1 day ago
Reasoning or Memorization? Direction-Aware Diversity Exploration in LLM Reinforcement Learning
🗣️ LLMs · Academic · arxiv.org · 18 hours ago
Deep reinforcement learning for process design: Review and perspective
💬 Prompt Engineering · Academic · arxiv.org · 1 day ago
RL Excursions during Pre-Training: Re-examining Policy Optimization for LLM training
🗣️ LLMs · Academic · arxiv.org · 6 days ago
Mitigating Bias in Low-SNR Financial Reinforcement Learning via Quantum Representations
🤖 AI · Academic · arxiv.org · 18 hours ago
GIFT: LLM-Guided State-Reward Interface for Financial Reinforcement Learning
🗣️ LLMs · Academic · arxiv.org · 1 day ago
MODIP: Efficient Model-Based Optimization for Diffusion Policies
🗣️ LLMs · Academic · arxiv.org · 18 hours ago
Self-Optimizing Control of Continuous Processes Based on Reinforcement Learning
🧠 Machine Learning · Academic · arxiv.org · 6 days ago
Neuro-Symbolic Injection of LTLf Constraints in Autoregressive Reinforcement Learning Policies
🗣️ LLMs · Academic · arxiv.org · 1 day ago
Failure Modes of Deep Multi-Agent RL in Asynchronous Pricing: Reproducible Triggers, Trace Diagnostics, and a Partial Fix
🤖 AI · Academic · arxiv.org · 18 hours ago
ConSteer-RL: Steering Reasoning Capabilities in Large Language Models via Confidence-Aware Reinforcement Learning
🗣️ LLMs · Academic · arxiv.org · 1 day ago
Exact Unlearning in Reinforcement Learning
🤖 AI · Academic · arxiv.org · 6 days ago