Inter-Process Communication, Shared Memory, Multi-Process, GPU Access

My First Multi-GPU Kernel: Writing All-to-All for AMD MI300X
gau-nernst.github.io·1d·
Discuss: Hacker News
🎯GPU Kernels
eBPF Tutorial by Example: Monitoring GPU Driver Activity with Kernel Tracepoints
dev.to·11h·
Discuss: DEV
⏱️CUDA Events
Arista Modular Switches Aim At Scale Across Networks, Hit Scale Out, Too
nextplatform.com·1h
🌊CUDA Streams
Inline vs. Pipeline Ray Tracing
evolvebenchmark.com·4h·
Discuss: Hacker News
⏱️CUDA Events
Uncrossed Multiflows and Applications to Disjoint Paths
arxiv.org·13h
📊CUDA Graphs
Cutting LLM Batch Inference Time in Half: Dynamic Prefix Bucketing at Scale
daft.ai·1h·
Discuss: Hacker News
🔗NCCL
Voxel Grid Visibility
cod.ifies.com·2h·
✂️CUTLASS
A hitchhiker's guide to CUDA programming
seanzhang.me·4d·
Discuss: Hacker News
🎯GPU Kernels
Running MiniMax-M2 locally - Existing Hardware Advice
reddit.com·1h·
Discuss: r/LocalLLaMA
🔧PTX
How NVIDIA GeForce RTX GPUs Power Modern Creative Workflows
blogs.nvidia.com·4h
⏱️CUDA Events
Show HN: Polyglot Docker dev environment setup – C/C++/Rust/Python
github.com·14h·
Discuss: Hacker News
💡LSP
A Friendly Tour of Process Memory on Linux
0xkato.xyz·19h·
Discuss: Hacker News
📊Profiling Tools
Video Invisible Watermarking at Scale
engineering.fb.com·51m·
Discuss: Hacker News
⏱️CUDA Events
Co-Simulation Framework for Parallel DNN Execution on Chiplet-Based Systems (UW–Madison, Washington State)
semiengineering.com·21h
🌊CUDA Streams
gRPC Python, AsyncIO and multiprocess
blog.est.im·16h
💡LSP
Giga Computing Announces Worldwide Availability of Its NVIDIA RTX PRO Server
prnewswire.com·41m
🔍Nsight
An eBPF Loophole: Using XDP for Egress Traffic
loopholelabs.io·2h·
Discuss: Hacker News
🌊CUDA Streams
A Deep Dive into Multi-Transport Protocol Abstraction in Python
dev.to·3h·
Discuss: DEV
💡LSP
GeForce GTX 1650 SUPER prototype with 1152 CUDA cores and PCIe 4.0 interface surfaces
videocardz.com·2h
⏱️CUDA Events
A Soft‑Fork Proposal for Blockchain‑Based Distributed AI Computation
hackernoon.com·1d
🎯Tensor Cores