Scour
🤖 Transformers
Specific: Attention Mechanism, BERT, GPT Architecture, Sequence Models
Scoured 135,101 posts in 38.5 ms
BERT vs. Transformers: A Complete Architectural and Mathematical Dissection
pub.towardsai.net · 1d · 🌳 Pratt Parsing
The Transformer Architecture, Visualized
vizuaranewsletter.com · 2d · Discuss: Hacker News · 💬 Natural Language Processing
Math needs thinking time, everyday knowledge needs memory, and a new Transformer architecture aims to deliver both
the-decoder.com · 11h · 🧠 Memory Models
Identifying Early Warning Signs of Attention Mechanism Instability
adiyogiarts.com · 1d · Discuss: DEV · 📈 Delta Encoding
Hallucinations in AI Neural Networks
youtube.com · 5h · 📱 Edge AI
Embeddings in NLP: How Machines Turn Words Into Meaning
medium.com · 2d · 🧮 Embeddings
Engineering Verifiable Modularity in Transformers via Per-Layer Supervision
arxiv.org · 2d · ⚓ Anchors
Finding features in Transformers: Contrastive directions elicit stronger low-level perturbation responses than baselines
lesswrong.com · 1d · 🤖 TVM
FFN signals: Transformers reveal they are guessing before generation
orsonai.com · 20h · Discuss: Hacker News · 🤖 TVM
Hand Tracing Transformer Architecture like Good Old days
pub.towardsai.net · 13h · 💾 Retro Computing
Understanding Seq2Seq Neural Networks – Part 6: Decoder Outputs and the Fully Connected Layer
dev.to · 1d · Discuss: DEV · 🤖 TVM
Optimal Splitting of Language Models from Mixtures to Specialized Domains
arxiv.org · 2d · 📝 Parsing
Modelwerk: Beyond Transformers
dehora.net · 3d · 💬 Prompt Engineering
Online supervised learning of temporal patterns in biological neural networks under feedback control
pnas.org · 3d · ⏱️ Time Series Analysis
Less-relevant results
A better method for identifying overconfident large language models
techxplore.com · 3d · 🧠 Machine Learning
Transformers as Constrained Optimization
jiha-kim.github.io · 3d · Discuss: Hacker News · 🤖 TVM
Intrinsic stabilization of synaptic plasticity improves learning and robustness in artificial neural networks
nature.com · 3d · 👁️ Computer Vision
OpenAI launches GPT-5.4 mini and nano for faster, cost-efficient AI workloads
alternativeto.net · 4d · 🦙 Ollama
Breaking the 100M Token Limit: EverMind's MSA Architecture Achieves Efficient End-to-End Long-Term Memory for LLMs
mykxlg.com · 3d · 🧠 PIM
Eamon2009/Transformer-language-model: An educational implementation of a GPT-style language model, built from scratch in PyTorch to show how transformer-based AI models work. No pre-trained weights, no fine-tuning; it can be trained on a $300 laptop.
github.com · 1d · Discuss: r/LocalLLaMA · 🔥 PyTorch