🤖 Transformer Architecture
Self-Attention, BERT, GPT, Multi-Head Attention
Scoured 80,672 posts in 316.7 ms
Explicit Multi-head Attention for Inter-head Interaction in Large Language Models
arxiv.org · 1d · 🔄 Sequence-to-Sequence Models
Preview
Share
Show Feeds
Block Domain
Report Post
Harmful Content
Off Topic
Low Quality
Spam
Misleading
Duplicate
Wrong Language
Can Transformers Learn Causality? Part 3: What This Means For Deployment and Practice
philippmuller.bearblog.dev · 1d · 🔄 LSTM Networks
Memory Retrieval in Transformers: Insights from the Encoding Specificity Principle
arxiv.org · 6h · 👁️ Attention Mechanisms
DeepViT: Towards Deeper Vision Transformer
dev.to · 10h · Discuss: DEV · 🧠 Deep Learning
Speed Up Training using the PyTorch Reference API
arcsin.bearblog.dev · 1d · 👁️ Attention Mechanisms
**Abstract:** This paper introduces a novel approach to dynamic facial expression synthesis and real-time control leveraging Multi-Modal Attention-Guided Gen...
freederia.com · 20h · 🎲 Synthetic Data Generation
Giving AI the ability to monitor its own thought process could help it think like humans
livescience.com · 16h · 👁️ Attention Mechanisms
New graph attention network models higher-order relationships in complex graph data
techxplore.com · 17h · 🕸️ Graph Neural Networks
Transformer Series - Blog #4: How the word "Bank" knows what it means: Self-Attention explained intuitively
dev.to · 1d · Discuss: DEV · 👁️ Attention Mechanisms
Training a 67M-parameter transformer on an M4 Mac Mini
geddydukes.com · 18h · Discuss: Hacker News · 🚀 Model Deployment
Efficient GAN-Based Anomaly Detection
paperium.net · 11h · Discuss: DEV · 🎲 Synthetic Data Generation
Getting a custom PyTorch LLM onto the Hugging Face Hub (Transformers: AutoModel, pipeline, and Trainer)
gilesthomas.com · 12h · Discuss: Hacker News · 🚀 Model Deployment
CNeuroMod-THINGS, a densely-sampled fMRI dataset for visual neuroscience
nature.com · 40m · 👁️ Attention Mechanisms
AI that talks to itself learns faster and smarter
sciencedaily.com · 1d · 🧠 Deep Learning
seemore: Implement a Vision Language Model from Scratch
huggingface.co · 3d · Discuss: Hacker News · 👁️ Attention Mechanisms
Reflect Achieves Constitutional Alignment for Large Language Models Without Training Data
quantumzeitgeist.com · 10h · 🚀 Model Deployment
Understanding Multi-Head Latent Attention (MLA)
shreyansh26.github.io · 3d · Discuss: Hacker News, r/LocalLLaMA · 👁️ Attention Mechanisms
Scientists May Have Found How the Brain Becomes One Intelligent System
scitechdaily.com · 18h · 👁️ Attention Mechanisms
Evidence of triple-layer processing in LLMs: hidden thought behind the chain of thought, by Laureana Bonaparte
greaterwrong.com · 3h · 👁️ Attention Mechanisms
ML Systems Textbook
mlsysbook.ai · 1d · 🚀 Model Deployment
Page 2 »