Linear Differential Vision Transformer: Learning Visual Contrasts via Pairwise Differentials
arxiv.org·3h
👁️Attention Optimization
Fast, Scalable LDA in C++ with Stochastic Variational Inference
github.com·17h·
Discuss: r/cpp
📊Gradient Accumulation
How to Create Your Own AI GPT: A Developer’s Guide
dev.to·2h·
Discuss: DEV
🤖AI Coding Tools
PhysMaster: Mastering Physical Representation for Video Generation via Reinforcement Learning
paperium.net·1d·
Discuss: DEV
📉Model Quantization
Gated DeltaNet (Linear Attention variant in Qwen3-Next and Kimi Linear)
sebastianraschka.com·1d·
Discuss: r/LLM
👁️Attention Optimization
How We Built a Custom Vision LLM to Improve Document Processing at Grab
engineering.grab.com·8h·
Discuss: Hacker News
🛠Ml-eng
Alpamayo-R1: Bridging Reasoning and Action Prediction for Generalizable Autonomous Driving in the Long Tail
arxiv.org·3h
ONNX Runtime
Enhanced Richardson Extrapolation via Adaptive Kernel Regression and Uncertainty Quantification
dev.to·18h·
Discuss: DEV
🔄ONNX
DEER: Disentangled Mixture of Experts with Instance-Adaptive Routing for Generalizable Machine-Generated Text Detection
arxiv.org·3h
🔄ONNX
Writing an LLM from scratch, part 27 – what's left, and what's next?
gilesthomas.com·7h·
Discuss: Hacker News
🎓Model Distillation
Post-training methods for language models
developers.redhat.com·1h
🎓Model Distillation
CoT-Saliency: Unified Chain-of-Thought Reasoning for Heterogeneous Saliency Tasks
arxiv.org·3h
👁️Attention Optimization
T3: Test-Time Model Merging in VLMs for Zero-Shot Medical Imaging Analysis
arxiv.org·1d
🔗Kernel Fusion
Hybrid-Attention models are the future for SLMs
inference.net·6h·
Discuss: Hacker News
Flash Attention
Can't stop till you get enough
cant.bearblog.dev·1d·
Discuss: Hacker News
📜TorchScript
Beyond ImageNet: Understanding Cross-Dataset Robustness of Lightweight Vision Models
arxiv.org·3h
🧮cuDNN
Region-Aware Reconstruction Strategy for Pre-training fMRI Foundation Model
arxiv.org·3h
📊Gradient Accumulation
The Curvature Rate λ: A Scalar Measure of Input-Space Sharpness in Neural Networks
arxiv.org·3h
📉Model Quantization
Why Multimodal AI Broke the Data Pipeline — And How Daft Is Beating Ray and Spark to Fix It
hackernoon.com·1d
🧮cuDNN