Feeds to Scour
Scoured 80672 posts in 316.7 ms
Explicit Multi-head Attention for Inter-head Interaction in Large Language Models
arxiv.org·1d
🔄Sequence-to-Sequence Models
Can Transformers Learn Causality? Part 3: What This Means For Deployment and Practice
philippmuller.bearblog.dev·1d
🔄LSTM Networks
Memory Retrieval in Transformers: Insights from The Encoding Specificity Principle
arxiv.org·6h
👁️Attention Mechanisms
DeepViT: Towards Deeper Vision Transformer
dev.to·10h·
Discuss: DEV
🧠Deep Learning
Speed Up Training using the PyTorch Reference API
arcsin.bearblog.dev·1d
👁️Attention Mechanisms
Giving AI the ability to monitor its own thought process could help it think like humans
livescience.com·16h
👁️Attention Mechanisms
New graph attention network models higher-order relationships in complex graph data
techxplore.com·17h
🕸️Graph Neural Networks
Transformer Series - Blog #4 How the word "Bank" knows what it means: Self-Attention explained intuitively
dev.to·1d·
Discuss: DEV
👁️Attention Mechanisms
Training a 67M-parameter transformer on an M4 Mac Mini
geddydukes.com·18h·
Discuss: Hacker News
🚀Model Deployment
Efficient GAN-Based Anomaly Detection
paperium.net·11h·
Discuss: DEV
🎲Synthetic Data Generation
Getting a custom PyTorch LLM onto the Hugging Face Hub (Transformers: AutoModel, pipeline, and Trainer)
gilesthomas.com·12h·
Discuss: Hacker News
🚀Model Deployment
CNeuroMod-THINGS, a densely-sampled fMRI dataset for visual neuroscience
nature.com·40m
👁️Attention Mechanisms
AI that talks to itself learns faster and smarter
sciencedaily.com·1d
🧠Deep Learning
seemore: Implement a Vision Language Model from Scratch
huggingface.co·3d·
Discuss: Hacker News
👁️Attention Mechanisms
Reflect Achieves Constitutional Alignment for Large Language Models Without Training Data
quantumzeitgeist.com·10h
🚀Model Deployment
Understanding Multi-Head Latent Attention (MLA)
shreyansh26.github.io·3d
👁️Attention Mechanisms
Scientists May Have Found How the Brain Becomes One Intelligent System
scitechdaily.com·18h
👁️Attention Mechanisms
Evidence of triple layer processing in LLMs: hidden thought behind the chain of thought. by Laureana Bonaparte
greaterwrong.com·3h
👁️Attention Mechanisms
ML Systems Textbook
mlsysbook.ai·1d
🚀Model Deployment