Attention Mechanisms

A Mean-Field Analysis of Multi-Head Self-Attention under Cross-Entropy Training

 🤖Transformer Architecture  Content type: Academic
arxiv.org

markusheimerl/gpt: A generative pretrained transformer implementation

 🤖Transformer Architecture  Content type: Code
github.com · Hacker News

Less-relevant results

Machine learning from scratch, what to build before using scikit-learn

 🧠Neural Network Architectures  Content type: Tutorial
iwtlp.com · DEV

Your LLM Isn’t Reading Your Manners — It’s Counting Your Tokens

 🤖Transformer Architecture  Content type: Blog
medium.com

ELI5 is a terrible learning prompt, here's the structural reason it fails and a 4-level replacement that actually sticks

 🤖Transformer Architecture  Content type: Blog  Content type: Tutorial

VelocityFM: Short-Horizon Protein Trajectory Prediction via Flow Matching in Velocity Space

 🤖Transformer Architecture  Content type: Academic
biorxiv.org

Apple WWDC On-Device AI Deep Dive - Google Docs

 🧠Neural Network Architectures
gist.is · Hacker News

The Sequence Knowledge #874: Transformers or Not?

 🤖Transformer Architecture

BeeLlama.cpp DFlash on Strix Halo: 2.7x Gemma 31B, But MTP Is Still Faster

 🔮ML
sleepingrobots.com

How LLMs Actually Work: A Friendly Map for Humans • oreoro

 🤖Transformer Architecture

Contribution Weights: A Geometrical Analysis of Self-Attention Transformers

 🤖Transformer Architecture  Content type: Academic
arxiv.org

Analyzing the geometric dependence of thermoelastic Q-factor in micro hemispherical resonators via a data-augmented CNN-transformer model

 🤖Transformer Architecture  Content type: Academic
nature.com

The Inference Alpha: Maximizing Frontier Models on AMD

 🤖Transformer Architecture  Content type: Blog
digitalocean.com

princezuda/-RequiemGPT-: Fully open source and open weights built and trained by fable five with one prompt. An experience in how AI actually works

 🐍Python  Content type: Code
github.com · Hacker News

Context windows in AI: why every token is a budget decision

 🤖Transformer Architecture  Content type: Blog
redis.io

Look Less, Reason More: Block-wise Attention Skipping for Efficient Multimodal LLMs

 🤖Transformer Architecture  Content type: Academic
arxiv.org

Efficient and Training-Free Single-Image Diffusion Models

 🎲Synthetic Data Generation

DDI_single: Single-Sequence-Based Protein Domain Assembly

 🚀Model Deployment  Content type: Academic
biorxiv.org

Stateful Swarms: How Persistent Memory Beats Traditional Agent Architectures

 🤖Transformer Architecture  Content type: News

Breaking tunnel vision, imaging AI lifts fluorescence image restoration accuracy and speed

 🧠Neural Network Architectures
phys.org
