Multi-GPU Communication, Collective Operations, Distributed Training, AllReduce

My First Multi-GPU Kernel: Writing All-to-All for AMD MI300X
gau-nernst.github.io·18h·
Discuss: Hacker News
🎯GPU Kernels
Understanding Federated Learning: Best Practices for Implementing Privacy-Preserving AI in C# Projects
dev.to·11h·
Discuss: DEV
🔄ONNX
Learning Sparse Approximate Inverse Preconditioners for Conjugate Gradient Solvers on GPUs
arxiv.org·15h
🎯Tensor Cores
Synopsys and NVIDIA Forge AI Powered Future for Chip Design and Multiphysics Simulation
semiwiki.com·6h
🌊CUDA Streams
A Soft-Fork Proposal for Blockchain-Based Distributed AI Computation
hackernoon.com·8h
🎯Tensor Cores
Scaling Coding-Agent RL to 32x H100s. 160% Improvement on Stanford's TBench
github.com·7h·
🤖AI Coding Tools
Evolving Ray and Kubernetes together for the future of distributed AI and ML
cloud.google.com·3h
🌐Distributed Computing
Gated DeltaNet (Linear Attention variant in Qwen3-Next and Kimi Linear)
sebastianraschka.com·16h·
Discuss: r/LLM
👁️Attention Optimization
Amazon Secures $38 Billion Deal to Host OpenAI's NVIDIA GB200/GB300 AI Servers
techpowerup.com·3h
🌐Distributed Computing
Deep Integration and the Convergence of Model Architecture and Hardware in AI
dev.to·23h·
Discuss: DEV
🎯Tensor Cores
Deflanderization for Game Dialogue: Balancing Character Authenticity with Task Execution in LLM-based NPCs
paperium.net·5h·
Discuss: DEV
🤖AI Coding Tools
Dive into Systems
diveintosystems.org·3h·
Discuss: Hacker News
⚙️Systems Programming
ZkML Breakthrough: 13B Models Verified in 15 Minutes
lightcapai.medium.com·1d·
Discuss: Hacker News
🎯Tensor Cores
Show HN: GPU-accelerated sandboxes for running AI coding agents in parallel [video]
youtube.com·3d·
Discuss: Hacker News
🤖AI Coding Tools
Enforcing Architecture in an Agent-Driven Codebase
phoebe.work·4h·
Discuss: Hacker News
🏗️Build Optimization
Armada Launches Bridge to Power the Next Generation of AI Infrastructure
prnewswire.com·9h
🔧PTX
Doo: A Simple, Fast Programming Language Built on Rust and LLVM
news.ycombinator.com·11h·
Discuss: Hacker News
🐕Ruff
I repurposed my old GPU for self-hosted AI and it changed my life
xda-developers.com·4h
🤖AI Coding Tools
GPU Pro – Master Your AI Workflow
github.com·1d·
🔍Nsight