🎮 Reinforcement Learning - laurynas · Scour

GARL: Game-Theoretic Reinforcement Learning for Multi-Agent Strategic Prioritisation

🤝Multi-Agent Systems Academic

Agents Need Work Data: A Primer on RLWD, or Reinforcement Learning on Work Data

anjalishriva.com··Hacker News

Reinforcement Learning and Optimal Control Book (RIP Dimitri Bertsekas)

⚙Context engineering Academic

web.mit.edu··Hacker News

Import AI 460: Reward hacking society, RSI data from Anthropic; and RL-based quadcopter racing

🤝Multi-Agent Systems

jack-clark.net

Good teachers don’t cheat

📐Theorem Proving Blog

jasonkena.github.io··Hacker News

How to Train Your Goblin

🎮Deterministic Simulation

goblins.mchen.workers.dev··Hacker News

Fast and Highly Expressive Policy Learning for Offline Reinforcement Learning via Bootstrapped Flow Q-Learning

⚙Context engineering Academic

See, Act, Correct: three levers for working with a code agent

🤖agents Blog

blog.owulveryck.info··Hacker News

Memoirs of a Learning Machine: Autobiographical Self-Training and the Self-Training Gap

zenodo.org··Hacker News

Geometrically Averaged Hard Target Updates for Linear Q-Learning

🎯Reranking Academic

Self-Paced Curriculum Reinforcement Learning for Autonomous Superbike Racing in Simulation

🤖agents Academic

Dmsh: A Multi-Agent Reinforcement Learning Framework for All-Quad Mesh Generation

🤝Multi-Agent Systems Academic

Event-Driven Reinforcement Learning Enables Long-Horizon Control in Semiconductor Fabrication

⚙Context engineering Academic

Progress-SQL: Improving Reinforcement Learning for Text-to-SQL via Progressive Rewards

⚙Context engineering Academic

Reasoning or Memorization? Direction-Aware Diversity Exploration in LLM Reinforcement Learning

⚙Context engineering Academic

Shape Formation for the Cooperative Transportation of Arbitrary Objects Using Multi-Agent Reinforcement Learning

🤝Multi-Agent Systems Academic

Failure Modes of Deep Multi-Agent RL in Asynchronous Pricing: Reproducible Triggers, Trace Diagnostics, and a Partial Fix

🤝Multi-Agent Systems Academic

Reinforcement Learning for Flow-Matching Policies with Density Transport

⚙Context engineering Academic

Test-Time Gradient Guidance of Flow Policies in Reinforcement Learning

🔍AI Interpretability Academic

Offline Reinforcement Learning for Plasma Control in Nuclear Fusion: Codebase and Benchmark

⚙Context engineering Academic
