Attention Mechanisms
Specific
Self-Attention, Multi-Head Attention, KV Cache, Transformers
Scoured 227 posts in 9.6 ms
When AI Agents “Pay Attention”
🤖
Transformers
psychologytoday.com
·
1d
Intelligent inference scheduling with llm-d on Red Hat AI
🔧
LLVM
developers.redhat.com
·
20h
KaiFelixBennett/gemma4-turboquant-rdna4: Run Gemma-4-31B at full 256K context on a $1,400 AMD RDNA4 GPU (gfx1201): TurboQuant KV cache + HIP-graph-safe Flash-Attention for llama.cpp, fully measured on real hardware.
🤖
AI
Content type:
Code
github.com
·
1d
·
Hacker News
Attention at the Theoretical Minimum: A Mathematics of Arrays Framework for Memory-Optimal Transformer Kernels
🤖
AI
Content type:
Academic
arxiv.org
·
2d
Gemma 4 QAT models: Optimizing model compression for mobile and laptop efficiency
🤖
AI
Content type:
News
Content type:
Blog
blog.google
·
6d
·
Hacker News
Generative AI in the Real World: Agentic Systems Fundamentals with Maarten Grootendorst
🤖
Transformers
Content type:
Audio
oreilly.com
·
20h
Massive AI Storage Demand Creates a New Memory Wall
🔍
RAG
Content type:
News
eetimes.com
·
1d
Automated doubt 🤔, open code review 📝, how LLMs really work 🔨
🤖
Transformers
tldr.tech
·
3d
Context windows in AI: why every token is a budget decision
🤖
AI
Content type:
Blog
redis.io
·
1d
A system programmer’s guide to LLM inference
🌟
Ray Tracing
Content type:
Blog
blog.xiangpeng.systems
·
3d
·
Hacker News
What the ocean taught me about AI.
🤖
Transformers
Content type:
Blog
medium.com
·
2d
Lung-SRAD: Spectral-Aware Regularized Audio DASS with Dual-Axis Patch-Mix Contrastive Learning for Respiratory Sound Classification
📚
Compilers
Content type:
Academic
arxiv.org
·
16h
WEKA software speeds long context AI inferencing on Oracle’s public cloud
⚙
low-level programming
Content type:
News
blocksandfiles.com
·
1d
google/gemma-4-12B-it-qat-q4_0-gguf
🤖
AI
huggingface.co
·
6d
Report: GKE Inference Gateway delivers up to 92% faster AI responses
🤖
AI
Content type:
Blog
cloud.google.com
·
2d
·
Hacker News
Youssof Altoukhi (@Youssofal_)
📊
Profiling
xcancel.com
·
4d
·
r/LocalLLaMA
everest-an/M1: AwareLiquid — MT-LNN with cloud-augmented memory, deliberation router, capsule v2, and Φ̂ reasoning trace. Improved successor to O1 (clean MT-LNN prototype).
🤖
AI
Content type:
Code
github.com
·
3h
·
Hacker News
Markov Chains: The Grandparents of LLMs
🤖
Transformers
dmanco.dev
·
1d
·
Hacker News
Handshake: Partner-Specific Protein-Protein Binding Site Prediction at Scale Using ProstT5 and Cross-Chain Attention
🤖
Transformers
Content type:
Academic
biorxiv.org
·
4d
Contribution Weights: A Geometrical Analysis of Self-Attention Transformers
🤖
Transformers
Content type:
Academic
arxiv.org
·
2d