🤖 Transformers
Specific: Attention Mechanism, BERT, GPT Architecture, Sequence Modeling
Scoured 133 posts in 6.3 ms
Researchers say they trained a foundation model from scratch for about $1,500
🤖 reinforcement learning, deep learning, machine learning · venturebeat.com · 1 day ago · Hacker News
Automated doubt 🤔, open code review 📝, how LLMs really work 🔨
✍️ Prompt Engineering · tldr.tech · 4 days ago
Apple WWDC On-Device AI Deep Dive - Google Docs
🤖 reinforcement learning, deep learning, machine learning · gist.is · 1 day ago · Hacker News
Weekly Bookmarks
🔤 NLP · inkdroid.org · 5 days ago
The Next Industrial Revolution
🔤 NLP · Content type: Blog · hexmhell.writeas.com · 3 days ago
Machine learning from scratch, what to build before using scikit-learn
🤖 reinforcement learning, deep learning, machine learning · Content type: Tutorial · iwtlp.com · 1 day ago · DEV
Introducing North Mini Code: Cohere’s First Model For Developers
✍️ Prompt Engineering · Content type: Blog · huggingface.co · 2 days ago · Hacker News
Boltzmann Attention: Learnable Ising Couplings for Cooperative Attention
📊 Embeddings · Content type: Academic · arxiv.org · 5 hours ago
A deep learning framework for emotion recognition in music using multimodal data fusion
🤖 reinforcement learning, deep learning, machine learning · Content type: Academic · nature.com · 1 day ago
markusheimerl/gpt: A generative pretrained transformer implementation
🔤 NLP · Content type: Code · github.com · 6 days ago · Hacker News
VelocityFM: Short-Horizon Protein Trajectory Prediction via Flow Matching in Velocity Space
🎮 Q-Learning · Content type: Academic · biorxiv.org · 4 days ago
Mixture-of-Experts (MoE), Explained: Why “Active Parameters” Decide What Runs on Your Machine
🤖 reinforcement learning, deep learning, machine learning · vettedconsumer.com · 11 hours ago · Hacker News
The Sequence Knowledge #874: Transformers or Not?
🤖 reinforcement learning, deep learning, machine learning · substackcdn.com · 2 days ago · Substack
Human-Like Neural Nets by Catapulting
🤖 reinforcement learning, deep learning, machine learning · gwern.net · 5 days ago · Hacker News
Tokenminning: Because Tokenmaxxing Is a Bad Idea
✍️ Prompt Engineering · tokenminning.com · 2 days ago · Hacker News
What the ocean taught me about AI.
🔤 NLP · Content type: Blog · medium.com · 3 days ago
Operator Fusion for LLM Inference on the Tensix Architecture
🔤 NLP · Content type: Academic · arxiv.org · 2 days ago
NVIDIA at Computex 2026: RTX Spark Gaming Hands-On, DLSS 4.5, and More
🔤 NLP · techpowerup.com · 20 hours ago
The Memory Problem is Solved: How Google’s Memory Caching Makes RNNs Smart Again
🤖 reinforcement learning, deep learning, machine learning · Content type: Blog · medium.com · 3 days ago
Architecture-Aware Reinforcement Learning Makes Sliding-Window Attention Competitive in Math Reasoning
✍️ Prompt Engineering · Content type: Academic · arxiv.org · 1 day ago