🎮 Reinforcement Learning
Q-Learning, Policy Gradient, RL Agents, Game AI
Scoured 355 posts in 30.8 ms
🤖 Machine Learning · Environmental Research Letters · 5 days ago
ERRATUM: Multi-agent reinforcement learning using echo-state network and its application to pedestrian dynamics (2025 J. Stat. Mech. 043401)
🤖 Machine Learning · grahamjroy.medium.com · 2 days ago
Q-Learning — Learning to Act Without a Map
🔥 PyTorch · rhp.bearblog.dev · 11 hours ago
Mini-spire: a fast Slay the Spire RL environment in C++
🤖 AI · sakana.ai · 2 hours ago
Sakana Fugu
Discussed on Hacker News
🤖 AI · ujangriswanto08.medium.com · 2 days ago
How SARSA Trains Smarter Agents Through On-Policy Updates
🤖 AI · arxiv.org · 3 days ago
Augmenting Game AI with Deep Reinforcement Learning
🤖 AI · medium.com · 2 hours ago
Reward hacking in Reinforcement learning
🔬 Science · Nature · 1 day ago
Attention modulates value normalization in human reinforcement learning by shaping reward encoding
🤖 Machine Learning · Databricks · 5 days ago
Agent Bricks: Data + AI Summit 2026
Covered by SiliconANGLE
🤖 AI · medium.com · 2 days ago
ICLR 2026 Test of Time: DDPG and the jump to continuous control
🤖 AI · abhishek-shankar.com · 2 days ago
The Best Agent Upgrade of the Year Wasn't a Model
🤖 AI · GitHub · 3 days ago
owainlewis/awesome-artificial-intelligence
Covers 33 stories, including Opencode – open-source alternative to Claude Code
🤖 AI · The Batch · 2 days ago
Jun 19, 2026
🔥 PyTorch · computerweekly.com · 6 days ago
Ineffable Intelligence strikes Google Cloud deal for Vera Rubin GPU power
🤖 AI · theregister · 1 day ago
Why Amazon hates 'human-in-the-loop' AI governance
Covered by naked capitalism, TNW | Data-Security
Discussed on Hacker News
🤖 AI · musicallyut.xyz · 6 days ago
The Mote in AI's Eye: software engineering with agents
Discussed on Hacker News
🤖 Machine Learning · The Decoder · 2 days ago
Google Deepmind loses another top AI researcher as Nobel laureate John Jumper leaves for Anthropic
Covered by 何夕2077的个人站, habr.com
🤖 AI · devblogs.microsoft.com · 3 days ago
Outcome-driven learning systems: Enterprise RL with OpenEnv and Foundry
Covers 3 stories, including SkillOpt: Executive Strategy for Self-Evolving Agent Skills
🔥 PyTorch · sciencedirect.com · 2 days ago
Digital twin-driven deep reinforcement learning for coordinated scheduling and state prediction of distributed energy storage clusters
🤖 AI · chierhu.medium.com · 4 days ago
Scaling Self-Play with Self-Guidance: An AlphaZero-Style Path for Language Models