Scour
🤖 Transformers
Specific: Attention Mechanism, BERT, GPT Architecture, Sequence Models
Scoured 170,973 posts in 52.5 ms
How Transformers Work · 💬 Natural Language Processing · medium.com · 2d
On The Application of Linear Attention in Multimodal Transformers · 🤖 TVM · arxiv.org · 17h
From RNNs to Transformers · 🤖 TVM · blog.aeilot.top · 6d
Paper page - Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation · ⚡ LMAX Disruptor · huggingface.co · 1h
perceptrons-to-transformers/08-rnn at main · rnilav/perceptrons-to-transformers · 🔥 PyTorch · github.com · 1d · DEV
Task Bert · 🦗 Pest · producthunt.com · 6d
Token Classification · 💬 Natural Language Processing · medium.com · 4d
INCRT: An Incremental Transformer That Determines Its Own Architecture · ⚡ Incremental Computation · arxiv.org · 17h
Hierarchical Kernel Transformer: Multi-Scale Attention with an Information-Theoretic Approximation Analysis · 🧮 Embeddings · arxiv.org · 1d
milanm/AutoGrad-Engine: A complete GPT language model (training and inference) in ~600 lines of pure C#, zero dependencies · 🤖 TVM · github.com · 5d · Hacker News
Layerwise Dynamics for In-Context Classification in Transformers · 🧮 Embeddings · arxiv.org · 17h
Uncertainty-Aware Transformers: Conformal Prediction for Language Models · 📝 Parser Combinators · arxiv.org · 1d
Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation · 📦 Folly · arxiv.org · 17h
EquiformerV3: Scaling Efficient, Expressive, and General SE(3)-Equivariant Graph Attention Transformers · 🤖 TVM · arxiv.org · 1d
Revisiting Anisotropy in Language Transformers: The Geometry of Learning Dynamics · 🧮 Embeddings · arxiv.org · 1d
Loop, Think, & Generalize: Implicit Reasoning in Recurrent-Depth Transformers · 🤖 TVM · arxiv.org · 4d
Short Data, Long Context: Distilling Positional Knowledge in Transformers · 🌳 Pratt Parsing · arxiv.org · 6d
On the Geometry of Positional Encodings in Transformers · 🧮 Embeddings · arxiv.org · 6d
Transformer See, Transformer Do: Copying as an Intermediate Step in Learning Analogical Reasoning · 🌳 Pratt Parsing · arxiv.org · 5d
LAG-XAI: A Lie-Inspired Affine Geometric Framework for Interpretable Paraphrasing in Transformer Latent Spaces · 🧮 Embeddings · arxiv.org · 6d