Reinforcement Learning

Memoirs of a Learning Machine: Autobiographical Self-Training and the Self-Training Gap

 🧠AI Agents
zenodo.org · Hacker News
Less-relevant results

Propel: Breaking the Solver Bottleneck in Task-Generator RL

 💬LLMs
vmax.ai · Hacker News

Deterministic Policy Gradient for Learning Equilibrium in Time-Inconsistent Control Problems

 Concurrency  Content type: Academic
arxiv.org

Beyond the Buzzwords: The Definitive Guide to Navigating the AI vs. Machine Learning Divide

 🤖Machine Learning  Content type: News  Content type: Blog

Reinforcement-learning signals support dynamic adaptive control during language switching

 🏗️AI Infrastructure  Content type: Academic
nature.com

Social Intelligence Arises Between Minds

 🧠AI Agents
psychologytoday.com

Cohere open-sources a coding agent that runs on a single H100

 🤖AI
venturebeat.com

Robots are closing in on human-like judgments, addressing a key challenge in physical AI

 🧠AI Agents
techxplore.com

Microsoft just shared the frontier data engineering secrets

 🤖AI
mail.bycloud.ai

How to Train Your Goblin

 🤖AI

Phi-Actor-Critic: Steering General-Sum Games to Pareto-Efficient Correlated Equilibria

 🧠AI Agents  Content type: Academic
arxiv.org

Some Interesting Papers on RLVR

 💬LLMs
lesswrong.com

Breaking free of a single datacenter: Practical geo-distributed AI operations with the k0smos platforms

 🏗️AI Infrastructure  Content type: Blog
cncf.io

[NEW MODEL] SupraLabs just released Supra1.5-50M Base (Experimental)!

 🤖AI

Core Automation co-founder Jerry Tworek jokes that Nvidia's CUDA translates to miracles in Polish

 💻Tech Industry
digg.com

Improving Generalization and Data Efficiency with Diffusion in Offline Multi-agent RL

 🧠AI Agents  Content type: Academic
arxiv.org

Major Types of Machine Learning

 🤖Machine Learning  Content type: Blog
medium.com

Test Your Skills Against an AI Air Hockey Robot

 🧠AI Agents  Content type: News
hackster.io

Weekly Research Recap

 🏗️AI Infrastructure  Content type: News
quantseeker.com

Verifiable Environments Are LEGO Bricks: Recursive Composition for Reasoning Generalization

 🏗️AI Infrastructure  Content type: Academic
arxiv.org
