Kimi Linear: An Expressive, Efficient Attention Architecture
arxiviq.substack.com·16h·
Discuss: Substack
🧩Attention Kernels
Attention Illuminates LLM Reasoning: The Preplan-and-Anchor Rhythm Enables Fine-Grained Policy Optimization
paperium.net·4h·
Discuss: DEV
🧩Attention Kernels
ViSurf: Visual Supervised-and-Reinforcement Fine-Tuning for Large Vision-and-Language Models
dev.to·13h·
Discuss: DEV
🧩Attention Kernels
[D] Best (free) courses on neural networks
reddit.com·23h·
🧩Attention Kernels
Everything About Transformers
krupadave.com·3d
🧩Attention Kernels
The middle brother in classifier development: What is RandAugment?
openaccess.thecvf.com·3h·
Discuss: DEV
📊Gradient Accumulation
MobileNetV3 Paper Walkthrough: The Tiny Giant Getting Even Smarter
towardsdatascience.com·2h
🎯Tensor Cores
Your Transformer is Secretly an EOT Solver
elonlit.com·2d·
Discuss: Hacker News
Flash Attention
ViSurf: Visual Supervised-and-Reinforcement Fine-Tuning for Large Vision-and-Language Models
paperium.net·13h·
Discuss: DEV
🏎️TensorRT
A Minimal Route to Transformer Attention
neelsomaniblog.com·3d·
Discuss: Hacker News
🧩Attention Kernels
University of Surrey researchers mimic brain wiring to improve AI - BBC
news.google.com·2h
Flash Attention
Neural bases of sustained attention during naturalistic parent-infant interactions
nature.com·2d
🧩Attention Kernels
Scalable In-Memory Associative Processing for Graph Neural Network Inference
dev.to·3h·
Discuss: DEV
Flash Attention
Learning to program "recycles" preexisting F-P pop codes of logical algorithms
jneurosci.org·50m·
Discuss: Hacker News
📊Gradient Accumulation
Minimax pre-training lead explains why no linear attention
reddit.com·3d·
Discuss: r/LocalLLaMA
Flash Attention
Unleashing Diffusion Transformers for Visual Correspondence by Modulating Massive Activations
arxiv.org·2d
🧮cuDNN
Weak-To-Strong Generalization
lesswrong.com·12h
📉Model Quantization
AI-Driven Marketing: How Intelligent Architecture Boosts Visibility and Impact
future.forem.com·8h·
Discuss: DEV
🤖AI Coding Tools
From hours to seconds: AI tools to detect animal calls
seangoedecke.com·8h·
Discuss: Hacker News
📉Model Quantization
When Five Dumb AIs Beat One Smart AI: The Case for Multi-Agent Systems
ksramalakshmi.medium.com·4h·
Discuss: r/LocalLLaMA
🔄ONNX