My First Multi-GPU Kernel: Writing All-to-All for AMD MI300X
gau-nernst.github.io · 4h
Discuss: Hacker News
🎯 GPU Kernels
Can-t stop till you get enough
cant.bearblog.dev · 11h
Discuss: Hacker News
📜 TorchScript
Learning Sparse Approximate Inverse Preconditioners for Conjugate Gradient Solvers on GPUs
arxiv.org · 42m
🔗 NCCL
onedraw - a GPU-driven 2D renderer
dev.to · 16h
Discuss: DEV
✂️ CUTLASS
A hitchhiker's guide to CUDA programming
seanzhang.me · 3d
Discuss: Hacker News
🎯 GPU Kernels
ZkML Breakthrough: 13B Models Verified in 15 Minutes
lightcapai.medium.com · 13h
Discuss: Hacker News
🎯 Tensor Cores
Gated DeltaNet (Linear Attention variant in Qwen3-Next and Kimi Linear)
sebastianraschka.com · 2h
Discuss: r/LLM
👁️ Attention Optimization
I made a tensor runtime & inference framework in C (good for learning how inference works)
github.com · 4h
📜 TorchScript
Intel's killed-off BMG-X3/X4 GPUs: 3D stacked die, up to 40 GPU cores, 512MB Adamantine cache
tweaktown.com · 8h
🔧 PTX
A Practitioner's Guide to Kolmogorov-Arnold Networks
arxiviq.substack.com · 11h
Discuss: Substack
📉 Model Quantization
Scalable In-Memory Associative Processing for Graph Neural Network Inference
dev.to · 17h
Discuss: DEV
⚡ Flash Attention
Performance evaluation of image convolution with gradient filters in OpenCL
milania.de · 4d
Discuss: Hacker News
🔍 Nsight
Deep Neural Watermarking for Robust Copyright Protection in 3D Point Clouds
arxiv.org · 42m
🧮 cuDNN
Text rendering and effects using GPU-computed distances
blog.pkh.me · 1d
✂️ CUTLASS
The Evolution of GPUs: How Floating-Point Changed Computing
dell.com · 15h
Discuss: Hacker News
🎯 Tensor Cores
ClipTagger-12B VLM: Frame Captioning Tutorial
dev.to · 13h
Discuss: DEV
🔄 ONNX
Programming for Computations: Matlab/Octave
link.springer.com · 58m
Discuss: Hacker News
🔄 SIMD Programming
[CrabGraph] A Modern, Safe, and Ergonomic Rust Cryptography Library
reddit.com · 17h
Discuss: r/rust
✂️ CUTLASS
Integer overflow checking with C23
blog.gnoack.org · 9h
🔬 Static Analysis