🤖 Transformer Architecture
Specific
Attention, BERT, GPT, Sequence Models
Scoured 200449 posts in 22.3 ms
STS: Efficient Sparse Attention with Speculative Token Sparsity
📊 TF-IDF · arxiv.org · 1d
The Expressive Power of Low Precision Softmax Transformers with (Summarized) Chain-of-Thought
🔢 Kolmogorov Complexity · arxiv.org · 2h
SparseSAM: Structured Sparsification of Activations in Segment Anything Models
👁️ Computer Vision · arxiv.org · 2h
GiLT: Augmenting Transformer Language Models with Dependency Graphs
🔗 RAG · arxiv.org · 1d
Geometric Factual Recall in Transformers
🔢 Kolmogorov Complexity · arxiv.org · 6d
RoiMAM: Region-of-Interest Medical Attention Model for Efficient Vision-Language Understanding
👁️ Computer Vision · arxiv.org · 1d
GQLA: Group-Query Latent Attention for Hardware-Adaptive Large Language Model Decoding
🔢 Kolmogorov Complexity · arxiv.org · 1d
Transformer Interpretability from Perspective of Attention and Gradient
👁️ Computer Vision · arxiv.org · 6d
Variational Linear Attention: Stable Associative Memory for Long-Context Transformers
🔢 Kolmogorov Complexity · arxiv.org · 6d
Representative Attention For Vision Transformers
👁️ Computer Vision · arxiv.org · 4d
Clustering in pure-attention hardmax transformers and its role in sentiment analysis
🔢 Kolmogorov Complexity · arxiv.org · 5d
DemaFormer: Damped Exponential Moving Average Transformer with Energy-Based Modeling for Temporal Language Grounding
💬 Natural Language Processing · arxiv.org · 6d
Elastic Attention Cores for Scalable Vision Transformers
👁️ Computer Vision · arxiv.org · 6d
AttnGen: Attention-Guided Saliency Learning for Interpretable Genomic Sequence Classification
🔢 Kolmogorov Complexity · arxiv.org · 4d
Latent Chain-of-Thought Improves Structured-Data Transformers
🧠 LLM Reasoning · arxiv.org · 6d
Breaking Global Self-Attention Bottlenecks in Transformer-based Spiking Neural Networks with Local Structure-Aware Self-Attention
🎭 Anthropic Claude · arxiv.org · 4d
Conditional Attribute Estimation with Autoregressive Sequence Models
🔢 Kolmogorov Complexity · arxiv.org · 4d
ECG-NAT: A Self-supervised Neighborhood Attention Transformer for Multi-lead Electrocardiogram Classification
🔍 Vector Search · arxiv.org · 5d
Pretraining Language Models with Subword Regularization: An Empirical Study of BPE Dropout in Low-Resource NLP
✂️ Tokenization · arxiv.org · 5d
A Composite Activation Function for Learning Stable Binary Representations
🔍 Vector Search · arxiv.org · 6d