🎮 Reinforcement Learning - lmilekic · Scour

Test Your Skills Against an AI Air Hockey Robot

🦿Embodied AI News

Flow-DPPO: Divergence Proximal Policy Optimization for Flow Matching Models

✨Generative AI Academic

I got so mad at poke(rogue)like that I trained a RL agent to beat it for me

📊LLM Evaluation Blog

blog.thiagolira.com.br··Hacker News

2026 FIVB Volleyball Women's Nations League in Nanjing: Poland beats Czech Republic 3-0

📊LLM Evaluation

Deep reinforcement learning for process design: Review and perspective

✨Generative AI Academic

Sasha Rush explains targeted on-policy self-distillation, a reinforcement learning technique that corrects specific LLM rollout errors

📊LLM Evaluation

Model predictive task sampling for efficient and robust adaptation

⚙️Prompt Engineering Academic

SHAPO: Sharpness-Aware Policy Optimization for Safe Exploration

🔬ML Research Academic

KJLdefeated/RL.cu: RLVR training for LLM in CUDA/C++

🤖AI Code

github.com··Hacker News

ARTA: Adaptive Reinforcement-Learning-Based Throttling Agent for RowHammer Vulnerabilities

🤖AI Agents Academic

Nvidia Nemotron 3 Ultra

research.nvidia.com··Hacker News

Self-Paced Curriculum Reinforcement Learning for Autonomous Superbike Racing in Simulation

🤖AI Agents Academic

Bridging Multi-Vector and Learned-Sparse Retrieval, A Diagnostic Framework for Robust Semantic IDs, and More!

🧠LLMs News Blog

recsys.substack.com

Failure Modes of Deep Multi-Agent RL in Asynchronous Pricing: Reproducible Triggers, Trace Diagnostics, and a Partial Fix

📊LLM Evaluation Academic

DeepSeek fundraising 💰, Meta model delays ⌛, Gemma 4 12B 🤖

UNIQ: Conformal Calibration for Adaptive Conservatism in Offline Reinforcement Learning

🔬ML Research Academic

Protest against ballot paper shortages enters 2nd day, demanding new election

💉Prompt Injection News

koreatimes.co.kr··r/news

Representation-Aware Advantage Estimation: Your Reward Model Provides More Than A Scalar Output

🤖AI Academic

Towards End to End Motion Planning and Execution for Autonomous Underwater Vehicles Using Reinforcement Learning

🦾Motion Planning Academic

Why Robotics Is a Pre-Paradigm Field

🦿Embodied AI News

whattotelltherobot.com··Hacker News
