Kimi Linear: An Expressive, Efficient Attention Architecture
arxiviq.substack.com·11h·
Discuss: Substack
👁️Attention Optimization
Everything About Transformers
krupadave.com·3d
👁️Attention Optimization
ViSurf: Visual Supervised-and-Reinforcement Fine-Tuning for Large Vision-and-Language Models
dev.to·8h·
Discuss: DEV
👁️Attention Optimization
A Minimal Route to Transformer Attention
neelsomaniblog.com·3d·
Discuss: Hacker News
👁️Attention Optimization
[D] Best (free) courses on neural networks
reddit.com·18h·
👁️Attention Optimization
ViSurf: Visual Supervised-and-Reinforcement Fine-Tuning for Large Vision-and-Language Models
paperium.net·8h·
Discuss: DEV
🏎️TensorRT
Dual-format attentional template during preparation in human visual cortex
elifesciences.org·4d
Flash Attention
Semantic search with embeddings in PHP: a hands-on guide using Neuron AI and Ollama
ollama.com·12h·
Discuss: DEV
🛠Ml-eng
An underqualified reading list about the transformer architecture
fvictorio.github.io·2d·
Discuss: Hacker News
Flash Attention
Specialized structure of neural population codes in parietal cortex outputs
nature.com·1d
Flash Attention
A unified threshold-constrained optimization framework for consistent and interpretable cross-machine condition monitoring
sciencedirect.com·13h
⏱️Benchmarking
Breaking the Curse of Dimensionality: A Game-Changer for L
dev.to·1d·
Discuss: DEV
👁️Attention Optimization
Weak-To-Strong Generalization
lesswrong.com·7h
📉Model Quantization
Sparse Adaptive Attention “MoE”: How I Solved OpenAI’s $650B Problem With a £700 GPU
medium.com·4d·
Flash Attention
RF-DETR Under the Hood: The Insights of a Real-Time Transformer Detection
towardsdatascience.com·1d
👁️Attention Optimization
To grow, we must forget… but now AI remembers everything
doc.cc·19h
👁️Attention Optimization
Minimax pre-training lead explains why no linear attention
reddit.com·3d·
Discuss: r/LocalLLaMA
Flash Attention
Your Transformer is Secretly an EOT Solver
elonlit.com·2d·
Discuss: Hacker News
👁️Attention Optimization