🎮 Reinforcement Learning
RL, AI Agents, Game Playing, Policy Optimization
Scoured 378 posts in 7.8 ms
Q-Learning (Reinforcement learning): Bellman Equation, Markov Decision Processes, Q-Values, and…
🤖 Machine Learning · Content type: Blog · medium.com · 1 day ago
Reasoning RL in 2026: GRPO, DPO, RLVR, Agentic PO & Beyond
🤖 AI · turingpost.com · 3 days ago
Fast and Highly Expressive Policy Learning for Offline Reinforcement Learning via Bootstrapped Flow Q-Learning
✨ Generative AI · Content type: Academic · arxiv.org · 13 hours ago
Researchers develop AI-powered railway control system for efficient urban train operation
🤖 AI · techxplore.com · 4 hours ago
Scale Robot Reinforcement Learning with NVIDIA Isaac Lab on Amazon SageMaker AI
🤖 Machine Learning · Content type: Blog · aws.amazon.com · 21 hours ago
Agents Need Work Data: A Primer on RLWD, or Reinforcement Learning on Work Data
🤖 Machine Learning · anjalishriva.com · 1 day ago · Hacker News
Reinforcement Learning and Optimal Control Book (RIP Dimitri Bertsekas)
🤖 AI · Content type: Academic · web.mit.edu · 5 days ago · Hacker News
Some Interesting Papers on RLVR
✨ Generative AI · lesswrong.com · 22 hours ago
Reinforcement learning in linear embedding space unlocks generalizable control across soft robot configurations
🤖 AI · Content type: Academic · nature.com · 2 days ago
Time-slip in AI sepsis models may inflate results, risking under- or overtreatment
🤖 AI · medicalxpress.com · 5 days ago
SimarcLabs/pybullet-swarm-sim: Python framework for simulating drone swarms with PyBullet in seconds.
🤖 AI · Content type: Code · github.com · 3 days ago · r/opensource
Spotlight On: Dreamplug Technologies Private Limited (CRED), a New Principal Participating Organization
💬 NLP · Content type: Blog · blog.pcisecuritystandards.org · 2 days ago
Core Automation co-founder Jerry Tworek jokes that Nvidia's CUDA translates to miracles in Polish
✨ Generative AI · digg.com · 6 days ago
Social intelligence Arises Between Minds
✨ Generative AI · psychologytoday.com · 2 days ago
Geometrically Averaged Hard Target Updates for Linear Q-Learning
🤖 Machine Learning · Content type: Academic · arxiv.org · 13 hours ago
AI Paper Review: Training Language Models to Follow Instructions with Human Feedback (InstructGPT)
🤖 AI · freecodecamp.org · 6 days ago
Edge AI enabled MIMO MC-CDMA for 6G optimizing spectrum and energy efficiency with SIC and deep reinforcement learning
🤖 Machine Learning · Content type: Academic · nature.com · 17 hours ago
How to Train Your Goblin
🤖 AI · goblins.mchen.workers.dev · 3 days ago · Hacker News
What is MBPO? A Beginner’s Guide to Efficient Reinforcement Learning
🤖 Machine Learning · Content type: Blog · ujangriswanto08.medium.com · 5 days ago
Sequent: scale and automation for higher confidence in alignment
🤖 AI · lesswrong.com · 2 hours ago