A hitchhiker's guide to CUDA programming
GPU Kernels
GPU Pro - Master Your AI Workflow
Nsight
Moving past speculation: How deterministic CPUs deliver predictable AI performance
venturebeat.com · 18h
CPU Architecture
Show HN: GPU-accelerated sandboxes for running AI coding agents in parallel [video]
NCCL
onedraw - a GPU-driven 2D renderer
CUTLASS
Intel's killed-off BMG-X3/X4 GPUs: 3D stacked die, up to 40 GPU cores, 512MB Adamantine cache
tweaktown.com · 2h
PTX
TIL: For long-lived LLM sessions, swapping KV Cache to RAM is ~10x faster than recalculating it. Why isn't this a standard feature?
Loop Tiling
A unified threshold-constrained optimization framework for consistent and interpretable cross-machine condition monitoring
sciencedirect.com · 1d
Benchmarking
Scalable In-Memory Associative Processing for Graph Neural Network Inference
Flash Attention
Can't stop till you get enough
TorchScript
Structurally Valid Log Generation using FSM-GFlowNets
arxiv.org · 2d
ONNX
Project Banana
404wolf.com · 26m
Distributed Computing
Windows Task Manager is fine, but this is the tool I actually use
xda-developers.com · 7h
Systems Programming