A hitchhiker's guide to CUDA programming
🎯GPU Kernels
Show HN: GPU-accelerated sandboxes for running AI coding agents in parallel [video]
🔗NCCL
A unified threshold-constrained optimization framework for consistent and interpretable cross-machine condition monitoring
sciencedirect.com·13h
⏱️Benchmarking
TIL: For long-lived LLM sessions, swapping KV Cache to RAM is ~10x faster than recalculating it. Why isn't this a standard feature?
🔲Loop Tiling
Structurally Valid Log Generation using FSM-GFlowNets
arxiv.org·2d
🔄ONNX
NVIDIA and Samsung working even closer together, new semiconductor AI factory has 50,000+ GPUs
tweaktown.com·8h
🔍Nsight
Ambient CI, progress this year
blog.liw.fi·3h
🏗️Build Systems
A Coding Implementation of a Comprehensive Enterprise AI Benchmarking Framework to Evaluate...
marktechpost.com·1d
🤖AI Coding Tools
Utilizing Chiplet-Locality For Efficient Memory Mapping In MCM GPUs (ETRI, Sungkyunkwan Univ.)
semiengineering.com·2d
📈Occupancy Optimization