In Defense of Curiosity
At the NeurIPS Mechanistic Interpretability Workshop, I was asked to respond to Neel Nanda’s recent blog post on "pragmatic interpretability." I chose to respond by recounting the story of Venetian glassmaking.

Venice has been a historical center of glassmaking since the Roman Empire, and you can still get fine artistic glass from Murano today. This engraving depicts Venice’s Doge visiting glassworks in Murano in the 17th century, and you can see some of the artistic glass on the table. I chose this topic because Murano in the 17th century was going through a transformation that very much reminds me of the moment we are going through in mechanistic interpretability today.
Pragmatic Glassmaking

If you visited Murano in 1600, you would see "Pragmatic Glassmaking" everywhere. The artisans had mastered the secret of making the finest flawless "cristallo" glass, and they had discovered that this fine glass could be ground into lenses, and that lenses, ground just right, could cure blurry vision.
Thank goodness for glasses.
I would be nearly blind without them.
In 1600 Murano was a center of optics, with artisans making and selling spectacles all over town. They had turned glassmaking from art to application, and a life-changing application at that: repairing vision. The invention of eyeglasses has been counted among the most important inventions in human history, effectively doubling the productive working life of anyone who needed them.
That is the urgency of pragmatic science, and in our field we are going through something similar.
Three Perspectives
I divide the research goals of my lab into three buckets, each with a different set of research questions and motivations.
First is the adversarial view, which is the most "pragmatic" of the three perspectives.
As AI becomes more sophisticated, there can be a gap between the output of a model and the internal knowledge of the model. In other words, "what the AI says" can be different from "what the AI thinks" internally. Traditional machine learning has a handle on the output, which is the role of benchmarking and evaluation. But the internal thoughts? That is the special domain of interpretability researchers. The goal of interpretability here is to be a "lie detector" to bridge this gap, and with the pragmatic turn there will be more of this work.
See, for example, Marks et al. 2025 on auditing language models for hidden objectives, or Rager et al. 2025 on discovering forbidden topics in language models. We should also ask whether the mere presence of censorship or hidden objectives amounts to lying; a more foundational look at deception would trace a rational intent to create a false belief, for example through the detailed reasoning underneath theory of mind studied in Prakash et al. 2025.
But I do not personally think this focus on the adversarial problem is the most important of the three perspectives. I think the most important is the second area.
Second is the empowerment view. The empowerment question asks: what will be the role of humans in a world filled with intelligent AI? Will humans know and think about anything anymore? Will we be able to understand the world?
In short, will the deployment of AI make us all dumb?
Interpretability plays a special role in machine learning because instead of focusing on making the AI smarter, we focus on improving human insight. I think this is the most important category of interpretability research, and we do not do enough of it.
There is room for a lot of innovative work here, from new training objectives to new user interaction models, all focused on expanding human insight. It is possible, for example, to teach humans new things that AIs learn, things no human has known before. Lisa Schut's paper "Bridging the Human-AI Knowledge Gap" is not from my lab, but I think it is a very important paper. You should read it.
Third is the scientific view, and that is what I am here to defend.
The scientific view is important because we live in a Copernican moment. For 5000 years, people have asked "what does it mean to think," and the answer has always placed the human mind at the center. Humans have been our only example of rational thinking, with animals like whales and octopuses out on the outskirts. People are smarter than monkeys! So "thinking" has been synonymous with "being human."
But with AI, this view is changing, and the human mind is losing its privileged place at the center. Our work in this area asks fundamental questions about mechanism and meaning. See Meng et al. 2022 on locating and editing factual associations in GPT, Todd et al. 2024 on the emergence of the representation of function, or Feucht et al. 2025 on the emergence of the representation of language-independent meaning, which begins to brush at questions about the gap between language and meaning posed by Wittgenstein.
The Copernican Moment
The Venetian glassmakers would understand. They lived through the original Copernican moment.

In October 1608, a spectacle-maker named Hans Lipperhey in the Dutch town of Middelburg applied for a patent on an instrument "for seeing things far away as if they were nearby." According to one story, he got the idea when children playing in his shop noticed that a distant weather vane seemed much closer when viewed through two lenses at once. By combining multiple lenses in a tube, he had made a spyglass. A spyglass is different from a spectacle: rather than repairing vision, it extends it, letting you see farther than the unaided eye. Makers began producing these devices to help merchants spot distant ships, and the spyglass was an immediate commercial success; word of the invention spread across Europe within months.
But just one year later, something bigger happened.

In Venice in June 1609, a professor of mathematics named Galileo Galilei heard about the "Dutch perspective glass." Within days he had built his own, without ever seeing one. He was an excellent experimentalist, and by grinding his own lenses he quickly improved the magnification from three-fold to twenty-fold. He demonstrated his improved instrument to the Venetian Senate from the top of the bell tower in St. Mark's Square, pointing it at Padua, some thirty-five kilometers away, and at Murano, where one could make out figures entering a church. The senators, impressed by the military potential, doubled his salary.
But then Galileo did something strange.
At the time, there was no practical reason to point a telescope at the sky. There was no pragmatic problem to solve, no ship on that horizon.
Galileo was just curious.
He could never have guessed that he would see the moons of Jupiter, or the shadows of craters on the moon, or the mysterious rings of Saturn. Gazing at Jupiter, Galileo became the first person to witness a profound new truth: the Earth is just another planet, and he could now see with his own eyes what a planet really is. What he saw revolutionized the way humans understand the universe.

Follow Your Curiosity
That is my message. We must be curious. Because we are living in a Copernican revolution.
For the first time in 5000 years, our understanding of Rational Thought is changing. Aristotle had proposed that humans are the unique rational creature; St. Thomas Aquinas thought that it was our rationality that made humans uniquely spiritual; and Descartes famously declared "I think therefore I am," identifying rationality with personhood.
But now our creation of thinking devices disproves these old claims. The human mind is no longer so firmly at the center of the universe, because AI gives us, for the first time, a second example of intelligence: one that we can observe from afar, but also one that we can slice into and disassemble, inspecting every calculation, every neuron, every moment of learning.
The revolution confronts us with many fundamental questions: What is thinking? What is belief? What is meaning? What is agency? What is consciousness?
Even if there is no practical reason to answer these questions, we should follow our curiosity. Our Copernican moment reopens many ancient questions that were previously beyond science, and now we can make them scientific. As we pursue our pragmatic glassmaking, we should take a moment to point our lenses at the stars. We should remember to ask: what does it mean, to think?
Keep doing good long-term science.
This post is adapted from a short talk at the NeurIPS 2025 Mechanistic Interpretability Workshop.
Posted by David at December 9, 2025 01:08 PM