Reinforcement Learning
🎮 Reinforcement Learning
RL, AI Agents, Game Playing, Policy Optimization
Scoured 423 posts in 6.9 ms
AI-powered living business intelligence network
💼 business
atlasforgex.com · 11 hours ago · Hacker News
Startup Ricursive to Create an End-to-End AI Model for Chip Design
🤖 Agentic AI
Content type: News
eetimes.com · 8 hours ago
Robots are closing in on human-like judgments, addressing a key challenge in physical AI
🤝 AI Agents
techxplore.com · 6 hours ago
Test Your Skills Against an AI Air Hockey Robot
🤝 AI Agents
Content type: News
hackster.io · 6 days ago
Researchers trained an open source AI search agent, Harness-1, that outperforms GPT-5.4 on recalling relevant information
🎛️ Fine-tuning
venturebeat.com · 2 days ago · Hacker News
NVIDIA Nemotron 3 Ultra Powers Faster, More Efficient Reasoning for Long-Running Agents
🎛️ Fine-tuning
Content type: Blog
developer.nvidia.com · 6 days ago · Hacker News
Tracing Eval-Awareness Emergence Through Training of OLMo 3
🎛️ Fine-tuning
lesswrong.com · 15 hours ago
Discovering Interpretable Multi-Parameter Control Policies for Evolutionary Algorithms Using Deep Reinforcement Learning
🧠 Machine Learning
Content type: Academic
arxiv.org · 21 hours ago
Reinforcement learning in linear embedding space unlocks generalizable control across soft robot configurations
🎛️ Fine-tuning
Content type: Academic
nature.com · 3 days ago
Experts weigh in on Anthropic’s Fable 5, Mythos 5 releases
🤝 AI Agents
sdtimes.com · 23 hours ago
Spotlight On: Dreamplug Technologies Private Limited (CRED), a New Principal Participating Organization
🚀 Startups
Content type: Blog
blog.pcisecuritystandards.org · 2 days ago
Sasha Rush explains targeted on-policy self-distillation, a reinforcement learning technique that corrects specific LLM rollout errors
🎛️ Fine-tuning
digg.com · 6 days ago
Weekly Research Recap
📈 Investing
Content type: News
quantseeker.com · 1 day ago
I got so mad at poke(rogue)like that I trained a RL agent to beat it for me
🧠 Machine Learning
thiagolira.blot.im · 3 days ago · Hacker News
Hrithik Roshan Signs With Anonymous Content
🦴 Biomechanics
Content type: News
deadline.com · 9 hours ago
Microsoft Research's Lens proves detailed captions matter more than raw scale for training efficient image generators
✍️ Prompt Engineering
Content type: News
the-decoder.com · 2 days ago
Flow-DPPO: Divergence Proximal Policy Optimization for Flow Matching Models
🧠 Deep Learning
Content type: Academic
arxiv.org · 21 hours ago
KJLdefeated/RL.cu: RLVR training for LLM in CUDA/C++
🎛️ Fine-tuning
Content type: Code
github.com · 3 days ago · Hacker News
AI Ready? Google Ads Maturity Model.
🤝 AI Agents
kaushik.net · 2 days ago
The Exploit Always Wins
🤝 AI Agents
Content type: Blog
abhishek-shankar.com · 5 days ago