👁️ Attention Optimization
Flash Attention, Memory Efficient, Sparse Attention, Transformers

Scoured 175,826 posts in 26.7 ms

Dynamic Sparse Attention: Access Patterns and Architecture
arxiv.org · 1d · 🧩 Attention Kernels

Learning to Recall with Transformers Beyond Orthogonal Embeddings
arxiv.org · 18h · 🧩 Attention Kernels

From Seq2Seq to Infinite Context: The 10-Year Evolution of Attention
pub.towardsai.net · 1d · ⚡ Flash Attention

The Lab: The Attention Problem
tavernresearch.com · 11h · ⚡ Flash Attention

What Comes After Transformers: Hybrid AI Architecture in 2026
philippdubach.com · 2d · 📊 Gradient Accumulation

MolmoPoint: Better pointing architecture for vision-language models
allenai.org · 7h · 🧩 Attention Kernels

Build DeepSeek-V3: Multi-Head Latent Attention (MLA) Architecture
pyimagesearch.com · 2d · 📉 Model Quantization

VIOLET: End-to-End Video-Language Transformers with Masked Visual-token Modeling
dev.to · 4d · Discuss: DEV · 🧩 Attention Kernels

Mamba-3
together.ai · 1d · Discuss: Hacker News, r/LocalLLaMA · 📉 Model Quantization

Built To Be An Ideal Listener, This AI Still Made The Same Mistakes As Human Ears
studyfinds.com · 7h · 🧩 Attention Kernels

High-Efficiency AI Image Suites
trendhunter.com · 1d · ⚡ Flash Attention

Convergence Theorem: Concept in Perceptron Algorithms and Margin of Dataset
medium.com · 3h · 📊 Gradient Accumulation

Show HN: The Attention Debt of AI Tooling (AI tools can increase attention cost)
wespiser.com · 1d · Discuss: Hacker News · 🤖 AI Coding Tools

Eye movement benchmark data for smooth-pursuit classification
nature.com · 2d · 📊 Gradient Accumulation

Executing programs inside transformers with exponentially faster inference
percepta.ai · 5d · Discuss: r/LocalLLaMA · ⚡ Flash Attention

Computer Vision and Deep Learning: Part 5.5
rupaligarewal22.medium.com · 21h · 🧮 cuDNN

Memory For AI At The Edge
semiengineering.com · 15h · ⚡ Flash Attention

Show HN: MaximusLLM – Train 262k-vocab LLMs on a single 16GB GPU
github.com · 2d · Discuss: Hacker News · ⚡ Flash Attention

Seeing the Unseen: How DeepStack Revolutionizes Vision Language Models
pub.towardsai.net · 4h · 🧩 Attention Kernels

MSADroid: A pre-trained mamba-sparse self-attention model for Android malware detection on long system call sequences
sciencedirect.com · 1d · 📉 Model Quantization