🤖 reinforcement learning - ddboline · Scour

Reinforcement Learning and Optimal Control Book (RIP Dimitri Bertsekas)

🧩operations research Academic

web.mit.edu··Hacker News

Agents Need Work Data: A Primer on RLWD, or Reinforcement Learning on Work Data

🧩operations research

anjalishriva.com··Hacker News

Measuring Embedding Drift: Why Hybrid Search Saves Stale Models.

📊linear programming

pub.towardsai.net

Propel: Breaking the Solver Bottleneck in Task-Generator RL

📊linear programming

vmax.ai··Hacker News

Why LLMs (still) lack taste

🧩operations research

beyondtheprior.com··Hacker News

How to Train Your Goblin

📊linear programming

goblins.mchen.workers.dev··Hacker News

Researchers trained an open source AI search agent, Harness-1, that outperforms GPT-5.4 on recalling relevant information

🧩operations research

venturebeat.com··Hacker News

See, Act, Correct: three levers for working with a code agent

🧩operations research Blog

blog.owulveryck.info··Hacker News

Agentic RL: Token-In, Token-Out Done Right

📊linear programming

qgallouedec-tito.hf.space··Hacker News

NVIDIA Nemotron 3 Ultra Powers Faster, More Efficient Reasoning for Long-Running Agents

🧩operations research Blog

developer.nvidia.com··Hacker News

AI-powered living business intelligence network

🧩operations research

atlasforgex.com··Hacker News

I got so mad at poke(rogue)like that I trained a RL agent to beat it for me

📊linear programming

thiagolira.blot.im··Hacker News

Beyond Dexterity: Why Contact May Define the Next Era of Robotics

🧩operations research Video News

spectrum.ieee.org··Hacker News

Memoirs of a Learning Machine: Autobiographical Self-Training and the Self-Training Gap

🧩operations research

zenodo.org··Hacker News

Stack Overflow didn't just help AI learn to code

zozo123.github.io··Hacker News

Vibe Diaries: Training Nanochat

vibediary.dev··Hacker News

The Effective Sample Size

🧩operations research

alex.smola.org··Hacker News

Nvidia Nemotron 3 Ultra

research.nvidia.com··Hacker News

Apple's New AI Models Contain 'None' of Google's Gemini Assistant

📊linear programming News

macrumors.com··Hacker News

Arithmetic Pedagogy for Language Models

📊linear programming Academic

arxiv.org··Hacker News
