🤖 Transformers
Specific
Attention Mechanism, Self-Attention, BERT, Architecture
Scoured 79 posts in 14.1 ms
Claude Mythos Glasswing: Why AI Vuln Discovery Terrifies Me
🧠 LLM
Content type: Blog, Discussion
tildalice.io · 6 days ago
Hasse Diagrams for Attention: A Partial Order Framework for Designing Transformer Masks
🧠 LLM
Content type: Academic
arxiv.org · 20 hours ago
Customer Churn Prediction on Structured Data Using FT-Transformer and Stacking Ensembles
📊 Statistics
Content type: Academic
arxiv.org · 1 day ago
A Mean-Field Analysis of Multi-Head Self-Attention under Cross-Entropy Training
📐 Optimization Theory
Content type: Academic
arxiv.org · 20 hours ago
See, Act, Correct: three levers for working with a code agent
🎮 Reinforcement Learning
Content type: Blog
blog.owulveryck.info · 6 days ago · Hacker News
princezuda/-RequiemGPT-: Fully open source and open weights built and trained by fable five with one prompt. An experience in how AI actually works
🤖 AI
Content type: Code
github.com · 1 day ago · Hacker News
Parallel Causal Associative Fields: Gated Sparse Memory for Long-Context Language Modeling
🎛️ Control Systems
Content type: Academic
arxiv.org · 20 hours ago
Introducing the Third Generation of Apple’s Foundation Models
🤖 AI
machinelearning.apple.com · 3 days ago · Hacker News, r/apple
DMT: Demographic Conditioning, Morphology-Enhanced Transformer for Cuffless Blood Pressure Estimation from PPG Signals
📶 Communications
Content type: Academic
arxiv.org · 20 hours ago
Beyond Item IDs: Scaling Short-Form-Video Recommendation via Semantic-Native Long Sequence Modeling
💬 LLMs
Content type: Academic
arxiv.org · 1 day ago
Human-Like Neural Nets by Catapulting
🧠 LLM
gwern.net · 4 days ago · Hacker News
History of WYSIWYG editors and CMS: a timeline (2022)
💾 Retro Computing
Content type: Blog
tiny.cloud · 6 hours ago · Hacker News
Attention at the Theoretical Minimum: A Mathematics of Arrays Framework for Memory-Optimal Transformer Kernels
🧠 LLM
Content type: Academic
arxiv.org · 1 day ago
Dynamic Linear Attention
🧠 LLM
Content type: Academic
arxiv.org · 20 hours ago
DeepSeek Made AI Cheap. Now It Needs Billions to Keep It Cheap.
🚀 Startups
Content type: News, Blog
chinacompany.substack.com · 6 days ago · Substack
From Architecture to Output: Structural Origins of Hallucination in Large Language Models and the Amplifying Role of Data
📊 Statistics
Content type: Academic
arxiv.org · 1 day ago
Best-Known Sorting Networks
🗄️ Vector Databases
bertdobbelaere.github.io · 6 days ago · Hacker News
Overcoming Decoder Inconsistencies in Whisper for Dravidian and Low-Resource Languages
🧠 LLM
Content type: Academic
arxiv.org · 1 day ago
An Expanded Synthetic Conversation Dataset for Multi-Turn Smishing Detection
🧠 LLM
Content type: Academic
arxiv.org · 2 days ago
DxPTA: An Architecture Design Space Exploration with Optical Dataflow-guided Strategy for HW/SW Co-Design of Photonic Transformer Accelerators
📐 Semidefinite Programming
Content type: Academic
arxiv.org · 2 days ago