🎮 Reinforcement Learning - chris1 · Scour

Fog of Love: Engineering Virtuous Agent Behavior with Affinity-based Reinforcement Learning in a Game Environment

🤖AI Agents Academic

Researchers develop AI-powered railway control system for efficient urban train operation

techxplore.com·

Q-Learning (Reinforcement learning): Bellman Equation, Markov Decision Processes, Q-Values, and…

♟️Game Theory Blog

Agents Need Work Data: A Primer on RLWD, or Reinforcement Learning on Work Data

anjalishriva.com··Hacker News

Scale Robot Reinforcement Learning with NVIDIA Isaac Lab on Amazon SageMaker AI

🤖AI Blog

aws.amazon.com·

Reasoning RL in 2026: GRPO, DPO, RLVR, Agentic PO & Beyond

🤖Machine Learning

turingpost.com·

AI Agent Mastery & Coaching

Researchers trained an open source AI search agent, Harness-1, that outperforms GPT-5.4 on recalling relevant information

venturebeat.com··Hacker News

Some Interesting Papers on RLVR

lesswrong.com·

See, Act, Correct: three levers for working with a code agent

🤖AI Blog

blog.owulveryck.info··Hacker News

Social intelligence Arises Between Minds

psychologytoday.com·

Reinforcement Learning and Optimal Control Book (RIP Dimitri Bertsekas)

🧮Algorithms Academic

web.mit.edu··Hacker News

Reinforcement learning in linear embedding space unlocks generalizable control across soft robot configurations

🤖AI Agents Academic

NVIDIA Nemotron 3 Ultra Powers Faster, More Efficient Reasoning for Long-Running Agents

🤖AI Agents Blog

developer.nvidia.com··Hacker News

DDPG from Scratch: 400-Line PyTorch Implementation

🤖Machine Learning

Less-relevant results

Why LLMs (still) lack taste

🤖Machine Learning

beyondtheprior.com··Hacker News

Time-slip in AI sepsis models may inflate results, risking under- or overtreatment

medicalxpress.com·

Fast and Highly Expressive Policy Learning for Offline Reinforcement Learning via Bootstrapped Flow Q-Learning

📐ML Theory Academic

Import AI 460: Reward hacking society, RSI data from Anthropic; and RL-based quadcopter racing

jack-clark.net·

Cohere open-sources a coding agent that runs on a single H100

venturebeat.com·
