Inter-Process Communication, Shared Memory, Multi-Process, GPU Access

My First Multi-GPU Kernel: Writing All-to-All for AMD MI300X
gau-nernst.github.io·1d·
Discuss: Hacker News
🎯GPU Kernels
eBPF Tutorial by Example: Monitoring GPU Driver Activity with Kernel Tracepoints
dev.to·2h·
Discuss: DEV
⏱️CUDA Events
Uncrossed Multiflows and Applications to Disjoint Paths
arxiv.org·5h
📊CUDA Graphs
A hitchhiker's guide to CUDA programming
seanzhang.me·4d·
Discuss: Hacker News
🎯GPU Kernels
Show HN: Polyglot Docker dev environment setup – C/C++/Rust/Python
github.com·5h·
Discuss: Hacker News
💡LSP
The PVS system in our 3d metroid-like
reddit.com·8h·
Discuss: r/gamedev
🔧PTX
A Friendly Tour of Process Memory on Linux
0xkato.xyz·10h·
Discuss: Hacker News
📊Profiling Tools
Co-Simulation Framework for Parallel DNN Execution on Chiplet-Based Systems (UW–Madison, Washington State)
semiengineering.com·13h
🌊CUDA Streams
gRPC Python, AsyncIO and multiprocess
blog.est.im·7h
💡LSP
Troubleshooting multi-GPU with 2 RTX PRO 6000 Workstation Edition
reddit.com·1d·
Discuss: r/LocalLLaMA
⏱️CUDA Events
A Soft‑Fork Proposal for Blockchain‑Based Distributed AI Computation
hackernoon.com·22h
🎯Tensor Cores
Attention Is All You Need for KV Cache in Diffusion LLMs
paperium.net·5h·
Discuss: DEV
🎯Tensor Cores
Synopsys and NVIDIA Forge AI Powered Future for Chip Design and Multiphysics Simulation
semiwiki.com·20h
🌊CUDA Streams
Low-Level Hacks
blog.raycursive.com·7h·
Discuss: Hacker News
📊Profiling Tools
PCIe lanes are the real currency of modern PCs
xda-developers.com·1d
⏱️CUDA Events
Cycle-accurate 6502 emulator as coroutine in Rust
github.com·2d·
📊Profiling Tools
onedraw — a GPU-driven 2D renderer
dev.to·1d·
Discuss: DEV
✂️CUTLASS
H-FA: A Hybrid Floating-Point and Logarithmic Approach to Hardware Accelerated FlashAttention
arxiv.org·5h
Flash Attention
A portable picokernel for async I/O
ryansepassi.com·3d·
Discuss: Hacker News
📊Profiling Tools
flowengineR: A Modular and Extensible Framework for Fair and Reproducible Workflow Design in R
arxiv.org·5h
🔄ONNX