Kimi Linear: An Expressive, Efficient Attention Architecture
arxiviq.substack.com·1d·
Discuss: Substack
🧩Attention Kernels
Attention Illuminates LLM Reasoning: The Preplan-and-Anchor Rhythm Enables Fine-Grained Policy Optimization
paperium.net·12h·
Discuss: DEV
🧩Attention Kernels
ViSurf: Visual Supervised-and-Reinforcement Fine-Tuning for Large Vision-and-Language Models
dev.to·21h·
Discuss: DEV
🧩Attention Kernels
Can't stop till you get enough
cant.bearblog.dev·4h·
Discuss: Hacker News
📜TorchScript
[D] Best (free) courses on neural networks
reddit.com·1d·
🧩Attention Kernels
Everything About Transformers
krupadave.com·3d
🧩Attention Kernels
The middle brother in classifier development: What is RandAugment?
openaccess.thecvf.com·11h·
Discuss: DEV
📊Gradient Accumulation
MobileNetV3 Paper Walkthrough: The Tiny Giant Getting Even Smarter
towardsdatascience.com·10h
🎯Tensor Cores
Your Transformer is Secretly an EOT Solver
elonlit.com·2d·
Discuss: Hacker News
Flash Attention
ViSurf: Visual Supervised-and-Reinforcement Fine-Tuning for Large Vision-and-Language Models
paperium.net·21h·
Discuss: DEV
🏎️TensorRT
A Minimal Route to Transformer Attention
neelsomaniblog.com·3d·
Discuss: Hacker News
🧩Attention Kernels
University of Surrey researchers mimic brain wiring to improve AI – BBC
news.google.com·10h
Flash Attention
Neural bases of sustained attention during naturalistic parent-infant interactions
nature.com·2d
🧩Attention Kernels
Unlocking AI Potential: Squeezing Giant Models into Tiny Spaces
dev.to·26m·
Discuss: DEV
📉Model Quantization
Vision = Language: I Decoded VLM Tokens to See What AI 'Sees' 🔬
reddit.com·4h·
Discuss: r/LocalLLaMA
🛠Ml-eng
Learning to program "recycles" preexisting fronto-parietal population codes of logical algorithms
jneurosci.org·8h·
Discuss: Hacker News
📊Gradient Accumulation
MiniMax pre-training lead explains why no linear attention
reddit.com·3d·
Discuss: r/LocalLLaMA
Flash Attention
Unleashing Diffusion Transformers for Visual Correspondence by Modulating Massive Activations
arxiv.org·2d
🧮cuDNN
AI-Driven Marketing: How Intelligent Architecture Boosts Visibility and Impact
future.forem.com·16h·
Discuss: DEV
🤖AI Coding Tools
Product Designer's workflow for prototyping with Cursor
hvpandya.com·7h·
Discuss: Hacker News
🤖AI Coding Tools