Interpretability

Semiconductor advances a 'must' for data centers, says Tokyo Electron boss

 🕸️Network Theory  Content type: News
asia.nikkei.com·

Sparse Autoencoders Reveal Interpretable and Steerable Features in VLA Models

 λType Theory  Content type: Academic
arxiv.org·

Detecting Bias in Generative AI

 🗳️Social Choice
psychologytoday.com·

Remake of Action Thriller Classic That Inspired 'John Wick' Gets Beautifully Bloodsoaked Update

 👥Sociology  Content type: News
movieweb.com·

Inside the Visual Mind: Neuroscience-Motivated Concept Circuits for Interpreting and Steering Vision Transformers

 🧠Cognitive Science  Content type: Academic
arxiv.org·

Waymo built a virtual driver to study how humans react to surprises on the road

 🌀Dynamical Systems  Content type: News
theverge.com·

Query Lens: Interpreting Sparse Key-Value Features with Indirect Effects

 λType Theory  Content type: Academic
arxiv.org·

Whisper Hallucination Detection and Mitigation via Hidden Representation Steering and Sparse AutoEncoders

 📡Information Theory  Content type: Academic
arxiv.org·

scMTG reconstructs single-cell temporal dynamics with Markov transition generators

 📡Information Theory  Content type: Academic
biorxiv.org·

Introducing Waymo’s New Reference Model for Human Collision Avoidance

 🌀Dynamical Systems  Content type: Blog
waymo.com··Hacker News

SAEExplainer: Interpreting SAE Features with Activation-Guided Preference Optimization

 λType Theory  Content type: Academic
arxiv.org·

Uber opens a London waitlist for Wayve robotaxis as the UK’s driverless race kicks off

 🌀Dynamical Systems  Content type: News
thenextweb.com·

What We Saw Saturday Was Decades in the Making

 🏺Ancient History  Content type: Academic
today.troy.edu·

One Lens, Many Worlds: A Capability-Typed Interface for World-Model Interpretability

 λType Theory  Content type: Academic
arxiv.org·

Amazon Warehouse Has Editor-Tested Tech up to 70% Off. Here's How to Take Advantage of Early Prime Day Savings

 ⚙️Mechanism Design  Content type: News
popularmechanics.com·

Shared Latent Structures Enable Unified Backdoor Detection and Mitigation in LLMs

 📡Information Theory  Content type: Academic
arxiv.org·

Symmetry-adapted qubit encoding with complete active space and Bravyi--Kitaev mapping for quantum chemistry on a quantum computer

 📡Information Theory  Content type: Academic
arxiv.org·

The Tell-Tale Norm: $\ell_2$ Magnitude as a Signal for Reasoning Dynamics in Large Language Models

 λType Theory  Content type: Academic
arxiv.org·

Jennifer Winget To Marry William Ishmael? Meet Singapore-Based Businessman, Career, Net Worth

 🕸️Network Theory  Content type: News
in.mashable.com·

Mechanistic Analysis of Alignment Algorithms in Language Models

 ⚙️Compilers  Content type: Academic
arxiv.org·
