Reinforcement Learning

Spotlight On: Dreamplug Technologies Private Limited (CRED), a New Principal Participating Organization

 🔧Backend Dev  Content type: Blog

AI-powered living business intelligence network

 🤖AI Engineering

Optimisation over non-stationary distributions creates weirder minds

 🧠LLM Research
lesswrong.com

Dmsh: A Multi-Agent Reinforcement Learning Framework for All-Quad Mesh Generation

 🤖Robotics  Content type: Academic
arxiv.org

How the UK Is Turning Sovereign AI Ambition Into Action With NVIDIA Technologies

 🔮Multimodal AI  Content type: Blog
blogs.nvidia.com

Core Automation co-founder Jerry Tworek jokes that Nvidia's CUDA translates to miracles in Polish

 🎮GPU Programming
digg.com

You're doing it wrong

 🧠LLM Research  Content type: News
understandably.com

Beyond Dexterity: Why Contact May Define the Next Era of Robotics

 🤖Robotics  Content type: Video, News

Event-Driven Reinforcement Learning Enables Long-Horizon Control in Semiconductor Fabrication

 🤖AI Engineering  Content type: Academic
arxiv.org

The Exploit Always Wins

 🎮GPU Programming  Content type: Blog
abhishek-shankar.com

DeepSeek fundraising 💰, Meta model delays ⌛ , Gemma 4 12B 🤖

 🤖AI Engineering
tldr.tech

Daimon Robotics and Galbot jointly launch RobOmni for benchmarking tactile perception and dexterous manipulation

 🤖Robotics
therobotreport.com

SHAPO: Sharpness-Aware Policy Optimization for Safe Exploration

 🛡️AI Safety  Content type: Academic
arxiv.org

Bridging Multi-Vector and Learned-Sparse Retrieval, A Diagnostic Framework for Robust Semantic IDs, and More!

 🔮Multimodal AI  Content type: News, Blog

Improve your agent’s tool-calling accuracy with SFT and DPO on Amazon SageMaker AI

 🧠LLM Research  Content type: Blog
aws.amazon.com

Vibe Diaries: Training Nanochat

 🧠LLM Research

The Effective Sample Size

 🧠LLM Research

Google DeepMind's Susan Zhang argues abundant AI content shifts the premium from raw intelligence to human relationships and social dynamics

 🧠LLM Research  Content type: News
digg.com

SocraticPO: Policy Optimization via Interactive Guidance

 🤖AI Engineering  Content type: Academic
arxiv.org