Hierarchical reinforcement learning

Wherein the problem of long horizons is addressed by decomposing tasks, and Internal RL is introduced whereby a meta‑controller is employed to manipulate model residuals sparsely, compressing token horizons.