Reinforcement Learning
RL, reward, policy, agent, Q-learning
Scoured 389 posts in 7.7 ms
Spotlight On: Dreamplug Technologies Private Limited (CRED), a New Principal Participating Organization
🔧 Backend Dev · Content type: Blog · blog.pcisecuritystandards.org · 2 days ago
AI-powered living business intelligence network
🤖 AI Engineering · atlasforgex.com · 2 hours ago · Hacker News
Optimisation over non-stationary distributions creates weirder minds
🧠 LLM Research · lesswrong.com · 4 days ago
Dmsh: A Multi-Agent Reinforcement Learning Framework for All-Quad Mesh Generation
🤖 Robotics · Content type: Academic · arxiv.org · 12 hours ago
How the UK Is Turning Sovereign AI Ambition Into Action With NVIDIA Technologies
🔮 Multimodal AI · Content type: Blog · blogs.nvidia.com · 2 days ago
Core Automation co-founder Jerry Tworek jokes that Nvidia's CUDA translates to miracles in Polish
🎮 GPU Programming · digg.com · 5 days ago
You're doing it wrong
🧠 LLM Research · Content type: News · understandably.com · 1 day ago
How to Train Your Goblin
🧠 LLM Research · goblins.mchen.workers.dev · 3 days ago · Hacker News
Beyond Dexterity: Why Contact May Define the Next Era of Robotics
🤖 Robotics · Content type: Video, News · spectrum.ieee.org · 1 day ago · Hacker News
Event-Driven Reinforcement Learning Enables Long-Horizon Control in Semiconductor Fabrication
🤖 AI Engineering · Content type: Academic · arxiv.org · 12 hours ago
The Exploit Always Wins
🎮 GPU Programming · Content type: Blog · abhishek-shankar.com · 5 days ago
DeepSeek fundraising 💰, Meta model delays ⌛ , Gemma 4 12B 🤖
🤖 AI Engineering · tldr.tech · 6 days ago
Daimon Robotics and Galbot jointly launches RobOmni for benchmarking tactile perception and dexterous manipulation
🤖 Robotics · therobotreport.com · 2 days ago
SHAPO: Sharpness-Aware Policy Optimization for Safe Exploration
🛡️ AI Safety · Content type: Academic · arxiv.org · 12 hours ago
Bridging Multi-Vector and Learned-Sparse Retrieval, A Diagnostic Framework for Robust Semantic IDs, and More!
🔮 Multimodal AI · Content type: News, Blog · recsys.substack.com · 4 days ago · Substack
Improve your agent’s tool-calling accuracy with SFT and DPO on Amazon SageMaker AI
🧠 LLM Research · Content type: Blog · aws.amazon.com · 1 week ago
Vibe Diaries: Training Nanochat
🧠 LLM Research · vibediary.dev · 1 day ago · Hacker News
The Effective Sample Size
🧠 LLM Research · alex.smola.org · 5 days ago · Hacker News
Google DeepMind's Susan Zhang argues abundant AI content shifts the premium from raw intelligence to human relationships and social dynamics
🧠 LLM Research · Content type: News · digg.com · 2 days ago
SocraticPO: Policy Optimization via Interactive Guidance
🤖 AI Engineering · Content type: Academic · arxiv.org · 12 hours ago