A Two-Stage GPU Kernel Tuner Combining Semantic Refactoring and Search-Based Optimization
arxiv.org·14h
FlashAttention 4: Faster, Memory-Efficient Attention for LLMs
digitalocean.com·7h
Build Your Own Key-Value Storage Engine—Week 6
read.thecoder.cafe·6h
Discovering 100+ Compiler Defects in 72 Hours via LLM-Driven Semantic Logic Recomposition
arxiv.org·14h
BPF Verifier State Pruning: Timeline
pchaigno.github.io·1d
oneAPI DPC++ Compiler and Runtime architecture design — oneAPI DPC++ Compiler documentation
intel.github.io·1d
Addressing Critical Tradeoffs In NPU Design
semiengineering.com·11h
Binary Algorithms
exystence.net·18h