🎯 Reinforcement Learning - daemsc · Scour

Policy Gradient for Continuous-Time Robust Markov Decision Processes

🤖AI Engineering Academic

Q-Learning (Reinforcement learning): Bellman Equation, Markov Decision Processes, Q-Values, and…

🤖AI Engineering Blog

Agents Need Work Data: A Primer on RLWD, or Reinforcement Learning on Work Data

🤖AI Engineering

anjalishriva.com··Hacker News

Reasoning RL in 2026: GRPO, DPO, RLVR, Agentic PO & Beyond

🧠LLM Research

turingpost.com·

Scale Robot Reinforcement Learning with NVIDIA Isaac Lab on Amazon SageMaker AI

🤖AI Engineering Blog

aws.amazon.com·

Towards Shutdownable Agents: Generalizing Stochastic Choice in RL Agents and LLMs

🧠LLM Research

lesswrong.com·

SimarcLabs/pybullet-swarm-sim: Python framework for simulating drone swarms with PyBullet in seconds.

🤖AI Engineering Code

github.com··r/opensource

Cohere open-sources a coding agent that runs on a single H100

🧠LLM Research

venturebeat.com·

Reinforcement Learning and Optimal Control Book (RIP Dimitri Bertsekas)

🤖Robotics Academic

web.mit.edu··Hacker News

AI Agent Mastery & Coaching

🤖AI Engineering

Microsoft just shared the frontier data engineering secrets

🔮Multimodal AI

mail.bycloud.ai·

DDPG from Scratch: 400-Line PyTorch Implementation

🧠LLM Research

Reinforcement learning in linear embedding space unlocks generalizable control across soft robot configurations

🤖Robotics Academic

Good teachers don’t cheat

🛡️AI Safety Blog

jasonkena.github.io··Hacker News

Social Intelligence Arises Between Minds

psychologytoday.com·

Weekly Research Recap

🧠LLM Research News

quantseeker.com·

I got so mad at poke(rogue)like that I trained a RL agent to beat it for me

🤖AI Engineering

thiagolira.blot.im··Hacker News

Geometry-Aware Reinforcement Learning for 2D Irregular Nesting

🤖AI Engineering Academic

See, Act, Correct: three levers for working with a code agent

🧠LLM Research Blog

blog.owulveryck.info··Hacker News

A Human-Augmenting Agentic Workflow for Causal Inference

🤖AI Engineering Blog

netflixtechblog.medium.com·
