Scour
AI Interpretability
mechanistic interpretability, explainable AI, XAI, saliency
Scoured 52 posts in 6.5 ms
Mechanistic Interpretability: The Key to Trusting Agentic AI
🛡️ AI Safety · Content type: Discussion · bradenkelley.com · 4 days ago
Query Lens: Interpreting Sparse Key-Value Features with Indirect Effects
🗄️ Vector Databases · Content type: Academic · arxiv.org · 1 day ago
Compositional and interpretable representation of histology using AI foundation models and sparse autoencoders
🔬 AI Research · Content type: Academic · biorxiv.org · 3 days ago
[Paper] Dictionary Learning Identifiability for Understanding SAEs
🔬 AI Research · lesswrong.com · 5 days ago
Playing with Vision Embeddings
🔬 AI Research · prestonbjensen.com · 4 days ago · Hacker News
Shared Semantics, Divergent Mechanisms: Unsupervised Feature Discovery by Aligning Semantics and Mechanisms
🛡️ AI Safety · Content type: Academic · arxiv.org · 1 day ago
Coelho Mollo and Millière: The Vector Grounding Problem
🔬 AI Research · philosophyofbrains.com · 4 days ago
Less-relevant results
BioByte 162: The Hype of Virtual Cells, ESMC's AlphaFold3-Like Performance, and the Prediction of Antibody Non-Specificity
🔬 AI Research · Content type: Blog · decodingbio.substack.com · 5 days ago · Substack
When Attribution Patching Lies: Diagnosis and a Second-Order Correction
🛡️ AI Safety · Content type: Academic · arxiv.org · 8 hours ago
Who Elected Anthropic?
🛡️ AI Safety · Content type: Blog · vizierprime.substack.com · 5 days ago · Substack
The technical community can't be the main character in AI safety anymore
🛡️ AI Safety · substackcdn.com · 3 days ago · Substack
FoldSAE: Learning to Steer Protein Folding Through Sparse Representations
🔬 AI Research · Content type: Academic · arxiv.org · 1 day ago
scMTG reconstructs single-cell temporal dynamics with Markov transition generators
🌐 Open Source · Content type: Academic · biorxiv.org · 3 days ago
Sparse Autoencoders Reveal Interpretable and Steerable Features in VLA Models
🧠 LLMs · Content type: Academic · arxiv.org · 1 day ago
Thoughts on 'Learning Mechanics'
🔬 AI Research · lesswrong.com · 6 days ago
Interactions Between Crosscoder Features: A Compact Proofs Perspective
🛡️ AI Safety · Content type: Academic · arxiv.org · 8 hours ago
Pre-Intervention Prediction of Sparse Autoencoder Steering Side Effects
🧠 LLMs · Content type: Academic · arxiv.org · 1 day ago
Interpreting and Steering a Text-to-Speech Language Model with Sparse Autoencoders
🧠 Language Models · Content type: Academic · arxiv.org · 8 hours ago
A Geometric View for Understanding Concept Learning and Neuron Interpretation in Sparse Autoencoders
🔬 AI Research · Content type: Academic · arxiv.org · 2 days ago
SAE It Across Models: Explaining Features With Foreign NLA Verbalizers
🧠 LLMs · lesswrong.com · 4 days ago