Operator Fusion, Memory Bandwidth, Graph Optimization, Intermediate Elimination

Co-Simulation Framework for Parallel DNN Execution on Chiplet-Based Systems (UW–Madison, Washington State)
semiengineering.com·16h
🌊CUDA Streams
Readable Code Is Unreadable
blog.wilsonb.com·4h·
Discuss: Hacker News
🔍Type Checkers
FG-CLIP 2: A Bilingual Fine-grained Vision-Language Alignment Model
paperium.net·1d·
Discuss: DEV
🧩Attention Kernels
Kahn’s Algorithm and Cycle Detection in Directed Graphs
dev.to·8h·
Discuss: DEV
🔀Operator Fusion
A Comparative Analysis of LLM Adaptation: SFT, LoRA, and ICL in Data-Scarce Scenarios
arxiv.org·8h
🎓Model Distillation
FGO MythBusters: Explaining how Kalman Filter variants achieve the same performance as FGO in navigation applications
arxiv.org·8h
🧠BF16
Few-Shot Multimodal Medical Imaging: A Theoretical Framework
arxiv.org·8h
🏎️TensorRT
Region-Aware Reconstruction Strategy for Pre-training fMRI Foundation Model
arxiv.org·8h
📊Gradient Accumulation
Fixed-point graph convolutional networks against adversarial attacks
arxiv.org·8h
🔀Operator Fusion
The Curvature Rate λ: A Scalar Measure of Input-Space Sharpness in Neural Networks
arxiv.org·8h
📉Model Quantization
Understanding Federated Learning: Best Practices for Implementing Privacy-Preserving AI in C# Projects
dev.to·1d·
Discuss: DEV
🔄ONNX
Branched Signature Model
arxiv.org·8h
🔢cuBLAS
Learning Sparse Approximate Inverse Preconditioners for Conjugate Gradient Solvers on GPUs
arxiv.org·1d
🔗NCCL
The Genetic Architecture of the Human Corpus Callosum and its Subregions
nature.com·1h
🧠BF16
Gated DeltaNet (Linear Attention variant in Qwen3-Next and Kimi Linear)
sebastianraschka.com·1d·
Discuss: r/LLM
👁️Attention Optimization
Hyper Hawkes Processes: Interpretable Models of Marked Temporal Point Processes
arxiv.org·8h
🏎️TensorRT
Why stop at 1 million tokens when you can have 10? My journey to extreme context on a gaming GPU. [P]
reddit.com·1h
🏎️TensorRT