Distributed Training

Introducing Piper: A Programmable Distributed Training System

 🏗️AI Infrastructure  Content type: Academic  Content type: Blog

Scaling Neural Network Verification with Tensor Parallelism and Fully Sharded Data Parallelism

 🤖Machine Learning  Content type: Academic
arxiv.org·
Less-relevant results

From tenant-aware to job-aware: scaling shared AI clusters with Cisco Nexus One

 🏗️AI Infrastructure  Content type: Blog
blogs.cisco.com·

From GPU to Token: The 8-Layer Observability Stack for AI Infrastructure

 GPU Computing  Content type: Blog
jimmysong.io·

Train Models Faster with JAX and MaxText Using NVFP4 on NVIDIA Blackwell

 🟢NVIDIA  Content type: News  Content type: Blog
developer.nvidia.com·

FIFA World Cup 2026 hype kicks off fraud, fake apps, and ransomware targeting fans and businesses

 ☁️Cloud Computing  Content type: News
techradar.com·

The Inference Alpha: Maximizing Frontier Models on AMD

 🏗️AI Infrastructure  Content type: Blog
digitalocean.com·

How the UK Is Turning Sovereign AI Ambition Into Action With NVIDIA Technologies

 🟢NVIDIA  Content type: Blog
blogs.nvidia.com·

Live spoken translation is finally practical — how real-time AI voice bridges language gaps in the moment

 🏗️AI Infrastructure  Content type: Blog

2x GH200 for LLM inference, Part 2: vLLM, DeepSeek V4 Flash, and MTP

 🏗️AI Infrastructure  Content type: Blog
dnhkng.github.io·

Unifying Local Communications and Local Updates for LLM Pretraining

 💬LLMs  Content type: Academic
arxiv.org·

Introduction to Collective Communications in AI Data Center Networking

 GPU Computing
networkphil.com·

Researchers Discover a Hidden Vitamin D Problem That Persists Year-Round

 ☁️Cloud Computing
scitechdaily.com·

Three ways that decentralized trial design strengthens the regulatory case for CNS devices

 💾AI Chips

Women who experience premature menopause are at greater risk of stroke and heart failure

 ☁️Cloud Computing  Content type: News
english.elpais.com·

FIFA World Cup 2026: Waterloo experts discuss impacts beyond the games

 🏛️Enterprise IT
uwaterloo.ca·

Nvidia DGX Spark GB10 – AI Models and Guide with vLLM and Autonomous Script

 🟢NVIDIA  Content type: Code
github.com · Hacker News

Hypothesis Testing and Estimation with Standardized Standard Errors

 🤖Machine Learning
replicationindex.com·

StageFrontier: Synchronization-Aware Stage Accounting for Distributed ML Training

 🧠Deep Learning  Content type: Academic
arxiv.org·

The Energy Supply Cliff is Alarmingly Near

 ☁️Cloud Computing
rusi.org·
