Scour
🤖 Transformer Architecture
Attention, BERT, GPT, Sequence Models
Scoured 174,758 posts in 112.7 ms
ViT-AdaLA: Adapting Vision Transformers with Linear Attention
arxiv.org · 20h · 🔍 Vector Search
Diverging Transformer Predictions for Human Sentence Processing: A Comprehensive Analysis of Agreement Attraction Effects
arxiv.org · 20h · 🔗 RAG
From RNN to Attention: A Practical Journey Through Sequence Models
medium.com · 3d · 💬 Prompt Engineering
From Seq2Seq to Infinite Context: The 10-Year Evolution of Attention
pub.towardsai.net · 1d · 💬 Prompt Engineering
Build DeepSeek-V3: Multi-Head Latent Attention (MLA) Architecture
pyimagesearch.com · 2d · 🥶 Cold Start Problem
Why AND How??? DEEP LANGUAGE AI MODELS USE NEURAL NETWORKS WHICH WORKS LIKE HUMAN BRAIN!!!
medium.com · 13h · 🧠 Deep Learning
OpenAI launches GPT-5.4 mini and nano for faster, cost-efficient AI workloads
alternativeto.net · 11h · 🤖 Local LLMs
Understanding Seq2Seq Neural Networks – Part 4: The Encoder and the Context Vector
dev.to · 2d · Discuss: DEV · 🔍 Vector Search
Open source Mamba 3 arrives to surpass Transformer architecture with nearly 4% improved language modeling, reduced latency
venturebeat.com · 1d · 🔢 Kolmogorov Complexity
MSADroid: A pre-trained mamba-sparse self-attention model for android malware detection on long system call sequences
sciencedirect.com · 1d · 🔢 Kolmogorov Complexity
The Lab: The Attention Problem
tavernresearch.com · 13h · 💬 Prompt Engineering
Show HN: The Attention Debt of AI Tooling (AI tools can increase attention cost)
wespiser.com · 1d · Discuss: Hacker News · 💬 Prompt Engineering
Modelwerk: Neural Networks as Machinery
dehora.net · 3d · Discuss: Hacker News · 🎭 Anthropic Claude
AI Stumbles On 1 In 4 Structured Coding Tasks: Are Developers Paying Attention?
studyfinds.com · 1d · 💬 Prompt Engineering
How AI's post-training process suppresses the creativity and whimsicality seen in earlier models like GPT-2, leading to bad writing from many top AI models (Jas...
techmeme.com · 13h · 🎭 Anthropic Claude
ECG-GTMD: A CVD diagnosis model based on graph-transformer with multi-domain fusion of spatiotemporal and band features
sciencedirect.com · 1d · 🕸️ Graph Theory
The 'LLM Architecture Gallery' illustrates the architectures of various large-scale language models such as GPT, Llama, and Grok.
gigazine.net · 2d · 🧠 LLM Reasoning
Seizing the means of attention
reveriesofahuman.com · 1d · 🧭 Content Discovery
PaTH Attention: How Householder Transformations Beat RoPE
ai.gopubby.com · 3d · 💬 Prompt Engineering
Understanding Seq2Seq Neural Networks – Part 3: Stacking LSTMs in the Encoder
dev.to · 3d · Discuss: DEV · 🔍 Vector Search