RT by @AravSrinivas: Higher throughput. Lower cost. Same accuracy.

Higher throughput. Lower cost. Same accuracy. @perplexity_ai just put Qwen3 235B into production on NVIDIA GB200 NVL72. Check out how the team made it happen ⬇️

Perplexity (@perplexity_ai): We published new research on how we serve post-trained Qwen3 235B models on NVIDIA GB200 NVL72 Blackwell racks. GB200 is a major step up over Hopper for high-throughput inference on large MoE models, not just a training platform.
