Operator Fusion, Memory Bandwidth, Graph Optimization, Intermediate Elimination

Don’t Just Normalize, Batch Normalize! A Guide to Stable Neural Networks
pub.towardsai.net·1d
📊Gradient Accumulation
CHIP8 – writing emulator, assembler, example game and VHDL hardware impl
blog.dominikrudnik.pl·20h·Discuss: Hacker News
🔄SIMD Programming
Learning Sparse Approximate Inverse Preconditioners for Conjugate Gradient Solvers on GPUs
arxiv.org·1d
🔗NCCL
The Genetic Architecture of the Human Corpus Callosum and its Subregions
nature.com·5h
🧠BF16
Gated DeltaNet (Linear Attention variant in Qwen3-Next and Kimi Linear)
sebastianraschka.com·1d·Discuss: r/LLM
👁️Attention Optimization
Hyper Hawkes Processes: Interpretable Models of Marked Temporal Point Processes
arxiv.org·12h
🏎️TensorRT
Benchmarking individual tree segmentation using multispectral airborne laser scanning data: the FGI-EMIT dataset
arxiv.org·12h
🏎️TensorRT
Running MiniMax-M2 locally - Existing Hardware Advice
reddit.com·22m·Discuss: r/LocalLLaMA
🔧PTX
Efficiency vs. Alignment: Investigating Safety and Fairness Risks in Parameter-Efficient Fine-Tuning of LLMs
arxiv.org·12h
🔄ONNX
Why stop at 1 million tokens when you can have 10? My journey to extreme context on a gaming GPU. [P]
reddit.com·5h
🏎️TensorRT
Terrain-Enhanced Resolution-aware Refinement Attention for Off-Road Segmentation
arxiv.org·12h
🧮cuDNN
Automated Variant Prioritization via Multi-Modal Feature Fusion and Bayesian Network Inference
dev.to·19h·Discuss: DEV
🔄ONNX
Kimi Linear: An Expressive, Efficient Attention Architecture
arxiviq.substack.com·2d·Discuss: Substack
🧩Attention Kernels
Comparing images with AVX
dev.to·2d·Discuss: DEV
🔄SIMD Programming
Temporal Fusion Transformer for Multi-Horizon Probabilistic Forecasting of Weekly Retail Sales
arxiv.org·12h
🔄ONNX
Object-Aware 4D Human Motion Generation
arxiv.org·12h
🏎️TensorRT
GeneFlow: Translation of Single-cell Gene Expression to Histopathological Images via Rectified Flow
arxiv.org·12h
🔄ONNX