RL at 1T Scale: prime-rl Performance Deep Dive

prime-rl 0.6.0 trains trillion-parameter MoE models on heavy agentic workloads at high efficiency. This deep dive covers the inference and training optimizations behind it, from FP8 and wide expert parallelism to prefill/decode (P/D) disaggregation, router replay, and 3-D parallelism.
