RL Doesn't Work on Slurm
Online reinforcement learning for LLMs breaks Slurm's batch scheduling model. We'll discuss why, and what can be done about it.