🎮 Reinforcement Learning
RL, reward function, policy, agents, Q-learning
Scoured 133 posts in 10.9 ms
Investigating Action Encodings in Recurrent Neural Networks in Reinforcement Learning
🔗 Deep Learning · arxiv.org · 2d
DiffusionOPD: A Unified Perspective of On-Policy Distillation in Diffusion Models
🎨 Diffusion Models · arxiv.org · 6d
Pedestrian-Aware LLM-Driven Behavioral Planning for Autonomous Vehicles
🧠 Neuromorphic Computing · arxiv.org · 2d
A Heuristic Approach for Performance Tuning in RL-based Quadrotor Control via Reward Design and Termination Conditions
🦾 Robotics · arxiv.org · 1d
Task-Semantic Graph-Driven Distributed Agent Networking for Underwater Target Tracking
🔐 Cybersecurity · arxiv.org · 3d
Sampling-Based Safe Reinforcement Learning
🛡️ AI Safety · arxiv.org · 1d
Distributed Zeroth-Order Policy Gradient for Networked Multi-agent Reinforcement Learning from Human Feedback
🔥 PyTorch · arxiv.org · 3d
DISA: Offline Importance Sampling for Distribution-Matching LLM-RL
🧠 Machine Learning · arxiv.org · 2d
When Outcome Looks Right But Discipline Fails: Trace-Based Evaluation Under Hidden Competitor State
⚖️ AI Ethics · arxiv.org · 2d
AIS: Adaptive Importance Sampling for Quantized RL
⚙️ MLOps · arxiv.org · 6d
Learning to Hand Off: Provably Convergent Workflow Learning under Interface Constraints
✨ Generative AI · arxiv.org · 1d
Offline Contextual Bandits in the Presence of New Actions
🧠 Machine Learning · arxiv.org · 2d
Response-Conditioned Parallel-to-Sequential Orchestration for Multi-Agent Systems
🧠 Neuromorphic Computing · arxiv.org · 3d
Fair-Aurora: Comparing Fairness Strategies for Reinforcement Learning-Based Congestion Control in Multi-Flow Environments
⚖️ AI Ethics · arxiv.org · 1d
Residual Reinforcement Learning for Robot Teleoperation under Stochastic Delays
🦾 Robotics · arxiv.org · 3d
Safe Deep Reinforcement Learning for Spacecraft Reorientation with Pointing Keep-Out Constraint
🛡️ AI Safety · arxiv.org · 1d
AMARIS: A Memory-Augmented Rubric Improvement System for Rubric-Based Reinforcement Learning
📝 LLMs · arxiv.org · 2d
An Encoded Corrective Double Deep Q-Networks for Multi-Agent Control Systems
🔥 PyTorch · arxiv.org · 6d
Equilibrium Selection in Multi-Agent Policy Gradients via Opponent-Aware Basin Entry
🔥 PyTorch · arxiv.org · 2d
Progressive Generalization Augmentation with Deeply Coupled RND-PPO and Domain-Prioritized Noise Injection for Robust Crop Management Reinforcement Learning
🧠 Neuromorphic Computing · arxiv.org · 2d