👁️ Attention Optimization
Flash Attention, Memory Efficient, Sparse Attention, Transformers
Scoured 112155 posts in 487.2 ms
Understanding and Optimizing Attention-Based Sparse Matching for Diverse Local Features
arxiv.org · 3d
🧩 Attention Kernels
Prism: Spectral-Aware Block-Sparse Attention
arxiv.org · 3d
🧩 Attention Kernels
Spot The Difference
seekingalpha.com · 3h
📊 Profiling
Training-Free Real-Time Control for Autoregressive Video Generation
daydream.live · 20h · Discuss: Hacker News
🏎️ TensorRT
Architectural and Mathematical Foundations of Machine Learning: A Rigorous Synthesis of Theory, Geometry, and Implementation
chizkidd.github.io · 1d · Discuss: Hacker News
📉 Model Quantization
Larger AI Models Are Not Always Better At Remembering Facts, Research Reveals
quantumzeitgeist.com · 18h
🎓 Model Distillation
OpenAI introduces GPT‑5.3‑Codex‑Spark, an ultra-fast coding model powered by Cerebras
neowin.net · 6h
⚡ Flash Attention
A C implementation of the inference pipeline for the Mistral AI’s Voxtral Realtime 4B model
blog.adafruit.com · 18h
🏎️ TensorRT
Arming the rebels with GPUs: Gradium, Kyutai, and Audio AI
amplifypartners.com · 5h · Discuss: Hacker News
🏎️ TensorRT
MiniMaxAI MiniMax-M2.5 has 230b parameters and 10b active parameters
openhands.dev · 13h · Discuss: r/LocalLLaMA
⏱️ Benchmarking
How Andrej Karpathy Built a Working Transformer in 243 Lines of Code
analyticsvidhya.com · 21h
📜 TorchScript
New Generative Paradigm: Drifting Model
mail.bycloud.ai · 2d
📊 Gradient Accumulation
Transformer-Based Memory Forecasting: Leveraging Anonymized Aggregates for Personal Insights
novice.media · 1d · Discuss: Hacker News
⚡ Flash Attention
Space Alignment Matters: The Missing Piece for Inducing Neural Collapse in Long-Tailed Learning
sonomarpa.sonoma.lib.ca.us · 11h
📊 Gradient Accumulation
Show HN: ProductFront-Streamlined product discovery platform for maximum exposure
productfront.tech · 1d · Discuss: Hacker News
🤖 AI Coding Tools
Recursive Language Models: Stop Stuffing the Context Window
nlp.elvissaravia.com · 14h
⚡ ONNX Runtime
An Ontology of Representations: Limits of Universality
lesswrong.com · 13h
🔄 ONNX
Ming-flash-omni-2.0: 100B MoE (6B active) omni-modal model - unified speech/SFX/music generation
huggingface.co · 16h · Discuss: r/LocalLLaMA
⚡ Flash Attention
Cuentos: A Large-Scale Eye-Tracking Reading Corpus on Spanish Narrative Texts
nature.com · 1d
🧩 Attention Kernels
Prompting Best Practices for Instruction-Following Rerankers
zeroentropy.dev · 21h · Discuss: Hacker News
🤖 AI Coding Tools