Attention Mechanisms

When AI Agents “Pay Attention”

 🤖Transformers
psychologytoday.com·

Intelligent inference scheduling with llm-d on Red Hat AI

 🔧LLVM
developers.redhat.com·

KaiFelixBennett/gemma4-turboquant-rdna4: Run Gemma-4-31B at full 256K context on a $1,400 AMD RDNA4 GPU (gfx1201): TurboQuant KV cache + HIP-graph-safe Flash-Attention for llama.cpp, fully measured on real hardware.

 🤖AI  Content type: Code
github.com··Hacker News

Attention at the Theoretical Minimum: A Mathematics of Arrays Framework for Memory-Optimal Transformer Kernels

 🤖AI  Content type: Academic
arxiv.org·

Gemma 4 QAT models: Optimizing model compression for mobile and laptop efficiency

 🤖AI  Content type: News  Content type: Blog
blog.google··Hacker News

Generative AI in the Real World: Agentic Systems Fundamentals with Maarten Grootendorst

 🤖Transformers  Content type: Audio
oreilly.com·

Massive AI Storage Demand Creates a New Memory Wall

 🔍RAG  Content type: News
eetimes.com·

Automated doubt 🤔, open code review 📝, how LLMs really work 🔨

 🤖Transformers
tldr.tech·

Context windows in AI: why every token is a budget decision

 🤖AI  Content type: Blog
redis.io·

A system programmer’s guide to LLM inference

 🌟Ray Tracing  Content type: Blog

What the ocean taught me about AI.

 🤖Transformers  Content type: Blog
medium.com·

Lung-SRAD: Spectral-Aware Regularized Audio DASS with Dual-Axis Patch-Mix Contrastive Learning for Respiratory Sound Classification

 📚Compilers  Content type: Academic
arxiv.org·

WEKA software speeds long context AI inferencing on Oracle’s public cloud

 low-level programming  Content type: News
blocksandfiles.com·

google/gemma-4-12B-it-qat-q4_0-gguf

 🤖AI
huggingface.co·

Report: GKE Inference Gateway delivers up to 92% faster AI responses

 🤖AI  Content type: Blog

Youssof Altoukhi (@Youssofal_)

 📊Profiling
xcancel.com··r/LocalLLaMA

everest-an/M1: AwareLiquid — MT-LNN with cloud-augmented memory, deliberation router, capsule v2, and Φ̂ reasoning trace. Improved successor to O1 (clean MT-LNN prototype).

 🤖AI  Content type: Code
github.com··Hacker News

Markov Chains: The Grandparents of LLMs

 🤖Transformers
dmanco.dev··Hacker News

Handshake: Partner-Specific Protein-Protein Binding Site Prediction at Scale Using ProstT5 and Cross-Chain Attention

 🤖Transformers  Content type: Academic
biorxiv.org·

Contribution Weights: A Geometrical Analysis of Self-Attention Transformers

 🤖Transformers  Content type: Academic
arxiv.org·