NVIDIA/Megatron-LM

Covered by 4 sources including Cloud Native Now, Hugging Face

Covered in 5 articles

Cloud Native Now

Google OpenRL Tames AI Model Tuning, Kubernetes-Style

nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B-BF16

Discussed on Hacker News and r/LocalLLaMA

NVIDIA has released Nemotron-TwoTower-30B-A3B-Base-BF16, an unusual diffusion-based language model built from the Nemotron 3 Nano 30B-A3B backbone.

Discussed on r/LocalLLaMA

NVIDIA Technical Blog

Maximize AI Factory Energy Efficiency Through Full-Stack Inference and Training Optimizations

syfi.cs.washington.edu

Introducing Piper: A Programmable Distributed Training System

Discussed on Hacker News