RL at 1T Scale: prime-rl Performance Deep Dive
prime-rl 0.6.0 trains trillion-parameter MoE models on heavy agentic workloads at the highest efficiency. A deep dive into the inference and training optimizations behind it: FP8, wide expert parallelism, prefill/decode (P/D) disaggregation, router replay, and 3-D parallelism.