Disciplined Biconvex Programming
arxiv.org·3h
📉Model Quantization
Attention Is All You Need for KV Cache in Diffusion LLMs
paperium.net·4h·
Discuss: DEV
🎯Tensor Cores
Fast, Scalable LDA in C++ with Stochastic Variational Inference
github.com·17h·
Discuss: r/cpp
๐ŸŽ๏ธTensorRT
Enhanced spatial clustering of single-molecule localizations with graph neural networks
nature.com·1d
🔀Operator Fusion
Connectivity Structure and Dynamics of Nonlinear Recurrent Neural Networks
journals.aps.org·8h
📉Model Quantization
Enhanced Richardson Extrapolation via Adaptive Kernel Regression and Uncertainty Quantification
dev.to·18h·
Discuss: DEV
🔄ONNX
My First Multi-GPU Kernel: Writing All-to-All for AMD MI300X
gau-nernst.github.io·1d·
Discuss: Hacker News
🎯GPU Kernels
T3: Test-Time Model Merging in VLMs for Zero-Shot Medical Imaging Analysis
arxiv.org·1d
🏎️TensorRT
H-FA: A Hybrid Floating-Point and Logarithmic Approach to Hardware Accelerated FlashAttention
arxiv.org·3h
⚡Flash Attention
Matrix Phylogeny: Compact Spectral Fingerprints for Trap-Robust Preconditioner Selection
arxiv.org·3h
🔀Operator Fusion
A Practitioner's Guide to Kolmogorov-Arnold Networks
arxiviq.substack.com·1d·
Discuss: Substack
📉Model Quantization
Relation-Aware Bayesian Optimization of DBMS Configurations Guided by Affinity Scores
arxiv.org·1d
⚡ONNX Runtime
Dynamic Model Selection for Trajectory Prediction via Pairwise Ranking and Meta-Features
arxiv.org·3h
🔀Operator Fusion
Explore More, Learn Better: Parallel MLLM Embeddings under Mutual Information Minimization
arxiv.org·3h
🛠️Ml-eng
How Transformer Models Detect Anomalies in System Logs
hackernoon.com·14h
📊Gradient Accumulation
News for October 2025
ptreview.sublinear.info·9h
🔄ONNX
Hybrid-Attention models are the future for SLMs
inference.net·6h·
Discuss: Hacker News
⚡Flash Attention
Transformer-Based Decoding in Concatenated Coding Schemes Under Synchronization Errors
arxiv.org·3h
⚡Flash Attention
Predicting Encoding Energy from Low-Pass Anchors for Green Video Streaming
arxiv.org·3h
🏎️TensorRT