Operator Fusion, Memory Bandwidth, Graph Optimization, Intermediate Elimination

Co-Simulation Framework for Parallel DNN Execution on Chiplet-Based Systems (UW–Madison, Washington State)
semiengineering.com·23h
🌊CUDA Streams
Readable Code Is Unreadable
blog.wilsonb.com·12h·
Discuss: Hacker News
🔍Type Checkers
FG-CLIP 2: A Bilingual Fine-grained Vision-Language Alignment Model
paperium.net·1d·
Discuss: DEV
🧩Attention Kernels
Region-Aware Reconstruction Strategy for Pre-training fMRI Foundation Model
arxiv.org·15h
📊Gradient Accumulation
Fixed-point graph convolutional networks against adversarial attacks
arxiv.org·15h
🔀Operator Fusion
The Curvature Rate λ: A Scalar Measure of Input-Space Sharpness in Neural Networks
arxiv.org·15h
📉Model Quantization
Understanding Federated Learning: Best Practices for Implementing Privacy-Preserving AI in C# Projects
dev.to·1d·
Discuss: DEV
🔄ONNX
Unlock the Power of GANs: Train with Tiny Datasets!
dev.to·1h·
Discuss: DEV
📊Gradient Accumulation
Branched Signature Model
arxiv.org·15h
🔢cuBLAS
Learning Sparse Approximate Inverse Preconditioners for Conjugate Gradient Solvers on GPUs
arxiv.org·1d
🔗NCCL
Gated DeltaNet (Linear Attention variant in Qwen3-Next and Kimi Linear)
sebastianraschka.com·1d·
Discuss: r/LLM
👁️Attention Optimization
Hyper Hawkes Processes: Interpretable Models of Marked Temporal Point Processes
arxiv.org·15h
🏎️TensorRT
Benchmarking individual tree segmentation using multispectral airborne laser scanning data: the FGI-EMIT dataset
arxiv.org·15h
🏎️TensorRT
Running MiniMax-M2 locally - Existing Hardware Advice
reddit.com·3h·
Discuss: r/LocalLLaMA
🔧PTX
Efficiency vs. Alignment: Investigating Safety and Fairness Risks in Parameter-Efficient Fine-Tuning of LLMs
arxiv.org·15h
🔄ONNX
Why stop at 1 million tokens when you can have 10? My journey to extreme context on a gaming GPU. [P]
reddit.com·8h
🏎️TensorRT
Terrain-Enhanced Resolution-aware Refinement Attention for Off-Road Segmentation
arxiv.org·15h
🧮cuDNN