Multi-GPU Communication, Collective Operations, Distributed Training, AllReduce

HetCCL Shows Scaling Of Multi-Vendor GPU Clusters For Large Language Models
quantumzeitgeist.com·11h
🔄ONNX
Training LLMs with Fault Tolerant HSDP on 100,000 GPUs
arxiv.org·1d
CUDA Programming Patterns
nilpunch/massive-ecs: Bitset-based ECS with rollbacks. C# library and Unity package.
github.com·9h
📜TorchScript
Optimizing Communication for Mixture-of-Experts Training with Hybrid Expert Parallel
developer.nvidia.com·1d
🌊CUDA Streams
Using Nsight Compute with large codebases - Part 2 : Profiling large code bases
blog.ncompass.tech·19h·
Discuss: Hacker News
🔍Nsight
ML for Energy-Performance-Aware Scheduling On Heterogeneous Multicore Architectures (Cambridge)
semiengineering.com·1d
📈Occupancy Optimization
Conformal Thinking: Risk Control for Reasoning on a Compute Budget
arxiv.org·7h
ONNX Runtime
Adaptive Traffic Management Middleware via Predictive Consensus and Reinforcement Learning
dev.to·2h·
Discuss: DEV
ONNX Runtime
Show HN: DeepInsight HITL AI research with collaboration and podcast generation
news.ycombinator.com·6h·
Discuss: Hacker News
ONNX Runtime
Mekara: Workflows as Code Proof-of-Concept
meksys-dev.github.io·8h·
Discuss: Hacker News
🤖AI Coding Tools
**Abstract:** This paper introduces a novel approach to stabilizing simulated spacetime geometries in high-performance computing environments by leveraging h...
freederia.com·27m
✂️CUTLASS
Understanding LLM Inference Engines: Inside Nano-vLLM (Part 1)
neutree.ai·1d
⏱️CUDA Events
Writing an LLM from scratch, part 32a -- Interventions: training a baseline model
gilesthomas.com·10h·
Discuss: Hacker News
📊Gradient Accumulation
Robotics will break AI infrastructure: Here's what comes next
theregister.com·20h
⏱️CUDA Events
CRAM-Net: The Network that Thinks by Rewiring
github.com·16h·
Discuss: DEV
🎓Model Distillation
**Abstract:** The increasing bandwidth demands of modern data centers necessitate sophisticated congestion control mechanisms capable of dynamically adapting...
freederia.com·10h
🌊CUDA Streams
Scaling Video Encoding with Edge AI Power
dev.to·7h·
Discuss: DEV
Flash Attention
Qwen3-Coder-Next offers vibe coders a powerful open source, ultra-sparse model with 10x higher throughput for repo tasks
venturebeat.com·13h
🤖AI Coding Tools
Silicon coupled with open development platforms drives context-aware edge AI
edn.com·4h
ONNX Runtime
