🔗 Distributed Training - bugrakadirhan · Scour

Introducing Piper: A Programmable Distributed Training System

🖥️Systems ML Academic Blog

syfi.cs.washington.edu · Hacker News

Scaling Neural Network Verification with Tensor Parallelism and Fully Sharded Data Parallelism

🛠️ML Frameworks Academic

harshuljain13/llm-inference-at-scale: A Practitioner handbook for production llm serving.

⚡ML Inference Code

github.com · Hacker News

Introduction to Collective Communications in AI Data Center Networking

🖥️Systems ML

networkphil.com

CommBench: Can LLMs Write Correct and Efficient GPU Communication Code?

🎮GPU Programming

uccl-project.github.io · Hacker News

New comment by bkjlblh in "Claude Fable 5"

🖥️Systems ML Discussion

news.ycombinator.com · Hacker News

2x GH200 for LLM inference, Part 2: vLLM, DeepSeek V4 Flash, and MTP

⚡ML Inference Blog

dnhkng.github.io

Claude Fable 5 silently degrades its own performance on frontier AI work

🖥️Systems ML News Blog

mkotlikov.substack.com · Substack

From GPU to Token: The 8-Layer Observability Stack for AI Infrastructure

⚡ML Inference Blog

Learned Subspace Compression for Communication-Efficient Pipeline Parallelism

🖥️Systems ML Academic

If Claude Fable stops helping you, you’ll never know

🖥️Systems ML

simonwillison.net · Hacker News

Thoughts on Claude Fable's silent safeguards

🖥️Systems ML

lesswrong.com

Alleged Fable sabotage of an ML project

🤖Machine Learning

xcancel.com · Hacker News

Train Models Faster with JAX and MaxText Using NVFP4 on NVIDIA Blackwell

🧠Deep Learning News Blog

developer.nvidia.com

Running LLM Inference on Kubernetes: What It Actually Takes

⚡ML Inference Blog

fairwinds.com

Piper: A Programmable Distributed Training System

🖥️Systems ML Academic

Apple WWDC On-Device AI Deep Dive - Google Docs

🤖Machine Learning

gist.is · Hacker News

Less-relevant results

Anthropic's Fable a cautionary tale

🖥️Systems ML

From tenant-aware to job-aware: scaling shared AI clusters with Cisco Nexus One

🖥️Systems ML Blog

blogs.cisco.com

Anatomy of a high-performance EP kernel

⚡ML Inference Blog

fergusfinn.com · Hacker News
