🎮 Reinforcement Learning - chris1 · Scour

Policy Improvement Reinforcement Learning ♟️Game Theory

How does Reinforcement Learning Affect Models 💬LLMs

lesswrong.com·3d

The Data Layer Tax for Robot Learning 🤖Machine Learning

rerun.io·14h·Hacker News

Every Model Learned by Gradient Descent Is Approximately a Kernel Machine 🤖Machine Learning

news.ycombinator.com·2h·Hacker News

How to build custom reasoning agents with a fraction of the compute 💬LLMs

venturebeat.com·2d

Reinforcement fine-tuning with LLM-as-a-judge 💬LLMs

aws.amazon.com·7h

Adaptive home energy management to self-motivated user preferences via iterative LLM-augmented reinforcement learning 🤖AI Agents

sciencedirect.com·5d

Learning diverse natural behaviors for enhancing the agility of quadrupedal robots 🤖AI Agents

There Will Be a Scientific Theory of Deep Learning 🤖AI

mail.bycloud.ai·1d

A new GitHub repo to detect reward hacking in RL models 🤖AI Agents

github.com·4d·Hacker News

Jaxpot: Train self-play RL agents FAST by parallelizing environments on GPU 🤖AI Agents

bardsai.substack.com·2d·Substack

On-Policy vs Off-Policy RL: PPO vs SAC on 5 Gymnasium Tasks 🤖AI Agents

tildalice.io·4d

The Policy Picks the Policy 🤖AI Agents

noise2signal.bearblog.dev·2d

DEEP Robotics 🧠Neural Networks

youtube.com·3d·r/singularity

Artificial Intelligence: Foundations of Computational Agents 🤖AI Agents

artint.info·3d·Hacker News

Lyapunov-Guided Self-Alignment: Test-Time Adaptation for Offline Safe Reinforcement Learning 💬LLMs

Fixing What LLMs Get Wrong (22 minute read) 💬LLMs

thebigdataguy.substack.com·4d·Substack

Boiler combustion optimization via offline reinforcement learning with an ensemble high-dimensional environment 🤖Machine Learning

sciencedirect.com·2d

RL, in pictures and videos 🤖AI Agents

Show HN: A live autonomous economic network for AI agents 🤖AI Agents

ainetwork-global.github.io·3d·Hacker News
