🤖 Transformer Architecture
Attention, BERT, GPT, Sequence Models
A deep dive into the Transformer architecture
🧠 LLM Reasoning · blog.algomaster.io · 5d
Attention in transformers, step-by-step | Deep ...
🔍 Vector Search · 3blue1brown.com · 16h
Towards Generalization of Block Attention via Automatic Segmentation and Block Distillation
🧠 Deep Learning · arxiv.org · 1d
AI Paper Review: Language Models are Few-Shot Learners (GPT-3)
💬 Prompt Engineering · freecodecamp.org · 9h
needle/docs/simple_attention_networks.md at main
🤖 Local LLMs · github.com · 2d
Explainable AI: Visualizing Attention in Transformers
💬 Natural Language Processing · mlops.community · 5d
The usual implementation of attention transformers (SDPA) is kind of bad, actually
🔢 Kolmogorov Complexity · gist.github.com · 1d · Hacker News
AI 101: Your Ultimate Guide to Attention: Mechanism, QKV, and KV Cache
💬 Prompt Engineering · turingpost.com · 5d
Tracing Attention Computation Through Feature Interactions
💬 Prompt Engineering · transformer-circuits.pub · 4d
SymbioNet: Neuro-symbolic learning with morphological attention for interpretable acute lymphoblastic leukemia classification
🔍 Vector Search · sciencedirect.com · 4d
Think In Diffusion: Continuous Latent Diffusion Language Model
🎭 Anthropic Claude · mail.bycloud.ai · 6d
Artificial Neural Networks (ANNs) and Deep Learning Foundations
🧠 Deep Learning · medium.com · 3d
Grokking as Structural Inference: Transformers Need Bayesian Lottery Tickets
🔢 Kolmogorov Complexity · arxiv.org · 1d
Attention Dispersion in Dynamic Graph Transformers: Diagnosis and a Transferable Fix
🔍 Vector Search · arxiv.org · 1d
Transformer Scalability Crisis: The First Comprehensive Empirical Analysis of Performance Walls in Modern Language Models
🔢 Kolmogorov Complexity · arxiv.org · 1d
STS: Efficient Sparse Attention with Speculative Token Sparsity
📊 TF-IDF · arxiv.org · 1d
GiLT: Augmenting Transformer Language Models with Dependency Graphs
🔗 RAG · arxiv.org · 1d
RoiMAM: Region-of-Interest Medical Attention Model for Efficient Vision-Language Understanding
👁️ Computer Vision · arxiv.org · 1d
GQLA: Group-Query Latent Attention for Hardware-Adaptive Large Language Model Decoding
🔢 Kolmogorov Complexity · arxiv.org · 1d
Neural Activation Patterns Across Language Model Architectures: A Comprehensive Analysis of Cognitive Task Performance
🧩 Cognitive Architecture · arxiv.org · 1d