🤖 Transformer Architecture
Specific
Attention, BERT, GPT, Sequence Models
Scoured 17737 posts in 98.2 ms
Understanding Attention Mechanisms – Part 2: Comparing Encoder and Decoder Outputs
🤖 Transformers
dev.to · 6d · DEV
Scaling seismic foundation models on AWS: Distributed training with Amazon SageMaker HyperPod and expanding context windows
☁️ Hyperscaler Infra
aws.amazon.com · 11h
AI by Hand Library ~ Attention, MHA, MQA, GQA
🤖 AI Tools
byhand.ai · 2d
OrionsLock/SALOMI: Research code for extreme low-bit transformer quantization and inference.
🤖 LLM Inference
github.com · 21h · Hacker News
not much happened today
✍️ Prompt Engineering
news.smol.ai · 1d
Training a Transformer with 1970s-era Technology
🤖 Transformers
hackaday.com · 3d
We reverse-engineered KAIROS from the Claude Code leak. Here's the open version.
📋 AGENTS.md
cathedral-ai.com · 13h · DEV
AI Safety Guide for TRUE Beginners by TRUE begginers
🛡️ AI Safety
lesswrong.com · 5d
ReCUBE Benchmark Reveals GPT-5 Scores Only 37.6% on Repository-Level Code Generation
🤖 Large Language Models
gentic.news · 3d · DEV
Understanding Attention Mechanisms – Part 4: Turning Similarity Scores into Attention Weights
🎯 RLHF
dev.to · 2d · DEV
Self Attention Flow ~ New Release!
🧘 Mindfulness
byhand.ai · 6d
RBF Attention Reveals Dot‑Product's Hidden Norm Bias
🤖 Machine Learning
dev.to · 23h · DEV
The Internalization of Gradients: From Prebiotic Chemistry to Mesa-Optimizers
🔗 Network Effects
lesswrong.com · 3d
Context Is All You Have: How LLM Attention Actually Works
🧠 Context Engineering
dev.to · 1d · DEV
Why Every Token Costs More Than You Think
💸 Inference Costs
dev.to · 10h · DEV
Understanding Attention Mechanisms – Part 5: How Attention Produces the First Output
🧬 Cognitive Science
dev.to · 1d · DEV
How Bifrost Reduces GPT Costs and Response Times with Semantic Caching
🦙 Ollama
dev.to · 1d · DEV
Residual Attention U-Net for Automated Multi-Class Segmentation of COVID-19 Chest CT Images
🤖 Transformers
dev.to · 6d · DEV
Self-improving Coding Agents
✍️ Prompt Engineering
dev.to · 6d · DEV
I Built an AI Resume Analyzer with GPT-5 vs GPT-4o-mini … and the Results Surprised Me 🚀
✍️ Prompt Engineering
dev.to · 6d · DEV