🎮 Reinforcement Learning
RL, reward function, policy, agents, Q-learning
Scoured 149962 posts in 23.3 ms
Markov Decision Processes: The Language of Reinforcement Learning
🤖 AI · medium.com · 4d
Value Mirror Descent for Reinforcement Learning
🔥 PyTorch · arxiv.org · 2d
Value-Guidance MeanFlow for Offline Multi-Agent Reinforcement Learning
🔒 Federated Learning · arxiv.org · 8h
Multi-agent Reach-avoid MDP via Potential Games and Low-rank Policy Structure
🦾 Robotics · arxiv.org · 8h
Enhancing sample efficiency in reinforcement-learning-based flow control: replacing the critic with an adaptive reduced-order model
🎨 Diffusion Models · arxiv.org · 2d
Anticipatory Reinforcement Learning: From Generative Path-Laws to Distributional Value Functions
🧠 Neuromorphic Computing · arxiv.org · 3d
PriPG-RL: Privileged Planner-Guided Reinforcement Learning for Partially Observable Systems with Anytime-Feasible MPC
⚛️ Quantum Computing · arxiv.org · 8h
Adaptive Incentive Design with Regret Minimization
⚖️ AI Ethics · arxiv.org · 2d
Aligning Agents via Planning: A Benchmark for Trajectory-Level Reward Modeling
✨ Generative AI · arxiv.org · 8h
DROP: Distributional and Regular Optimism and Pessimism for Reinforcement Learning
🧠 Neuromorphic Computing · arxiv.org · 1d
Behavior-Constrained Reinforcement Learning with Receding-Horizon Credit Assignment for High-Performance Control
🛡️ AI Safety · arxiv.org · 4d
Reinforcement Learning with Reward Machines for Sleep Control in Mobile Networks
📱 Edge AI · arxiv.org · 8h
Predictive Representations for Skill Transfer in Reinforcement Learning
🧠 Machine Learning · arxiv.org · 1d
Reinforcement Learning from Human Feedback: A Statistical Perspective
📝 LLMs · arxiv.org · 4d
Offline RL for Adaptive Policy Retrieval in Prior Authorization
🧠 Machine Learning · arxiv.org · 2d
Target Policy Optimization
🧠 Machine Learning · arxiv.org · 2d
Provable Multi-Task Reinforcement Learning: A Representation Learning Framework with Low Rank Rewards
🔒 Federated Learning · arxiv.org · 3d
MARL-GPT: Foundation Model for Multi-Agent Reinforcement Learning
📝 LLMs · arxiv.org · 2d
Thompson Sampling for Infinite-Horizon Discounted Decision Processes
🎨 Diffusion Models · arxiv.org · 1d
MC-CPO: Mastery-Conditioned Constrained Policy Optimization
🔒 Federated Learning · arxiv.org · 3d