Diverting trains of thought, wasting precious time
Mon, 10 Nov 2025
Reversing abstractions: an existential crisis
Computer science in general, and language implementation in particular, are founded on the idea of realising abstractions in a way I’ll call “forwards” and also “existential”. This post is about how these aren’t the only way to think about programming infrastructure and abstractions.
The theory goes that you pick some desired abstraction—it could be a programming language, maybe an abstract data type, or something else. Then you figure out the realisation. In the case of a programming language, this could be memory layouts, calling conventions and a compiler that generates object code modelling these. For abstract data types, it’s some specific concrete data structure and the mapping from abstract to concrete operations.
That’s the “forwards” part. It’s also “existential” when you keep the details of the realisation to yourself. Client code (e.g. source code in our new language) knows there is a realisation but the implementation choices themselves are hidden. In turn this allows the implementer to change their mind, while having indemnified themselves against any fallout when this breaks things. (Cynical, me?)
I find myself drawing an ASCII-art diagram like this.
existential   .        abstraction
              v         /   |   \
              v        / /  |||  \ \
              v     . . . . . . .     possible realisations -- pick one! hide it!
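To make the forwards, “pick one and hide it” pattern concrete, here is a minimal sketch in C (the names are hypothetical, not taken from any real codebase): the header promises clients only that some realisation of a stack exists, while the particular concrete structure stays private to the implementation file.

    /* stack.h -- the abstraction: clients learn only that *some* realisation exists */
    struct stack;                            /* incomplete type: the choice is hidden */
    struct stack *stack_new(void);
    void stack_push(struct stack *s, int x);
    int stack_pop(struct stack *s);

    /* stack.c -- the realisation: one concrete choice, invisible to clients */
    #include <stdlib.h>
    struct stack {                           /* could equally be a linked list;  */
        int *elems;                          /* clients cannot tell, by design   */
        size_t used, capacity;
    };
    struct stack *stack_new(void)
    {
        struct stack *s = malloc(sizeof *s);
        if (s) { s->elems = NULL; s->used = s->capacity = 0; }
        return s;
    }
    void stack_push(struct stack *s, int x)  /* error handling elided for brevity */
    {
        if (s->used == s->capacity) {
            s->capacity = s->capacity ? 2 * s->capacity : 8;
            s->elems = realloc(s->elems, s->capacity * sizeof *s->elems);
        }
        s->elems[s->used++] = x;
    }
    int stack_pop(struct stack *s)
    {
        return s->elems[--s->used];
    }

The implementer can later swap the array for a linked list, and clients keep working precisely because they never learned the layout.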
So far, so textbook. However, this hiding isn’t always done. Examples of “opting out” of hiding are ABIs and debuggers.
ABIs opt out of hiding in favour of a public convention: certain implementation choices are documented publicly, while also being embodied in code. This allows independent implementations to interoperate. (It’s a pity ABIs aren’t often defined in some more machine-readable, declarative form.)
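As a small illustration of what being “embodied in code” looks like, here is a hedged sketch, assuming a typical x86-64 System V-style layout: the offsets below are fixed by the ABI’s publicly documented alignment and padding rules, so independently compiled modules can agree on them, yet nothing in the source derives those rules; the code can only assert them after the fact.

    #include <assert.h>
    #include <stddef.h>

    /* A type whose layout is pinned down by the platform ABI's rules,
     * not by anything visible in this source file alone.              */
    struct sample {
        char tag;      /* offset 0                                         */
        int value;     /* offset 4 on typical x86-64 SysV: 3 padding bytes */
        void *next;    /* offset 8: pointers are 8-byte aligned            */
    };

    /* Independently built compilation units interoperate only because they
     * follow the same externally documented convention; these checks merely
     * restate that expectation (they hold on x86-64 SysV, not everywhere).  */
    static_assert(offsetof(struct sample, value) == 4, "ABI layout assumption");
    static_assert(offsetof(struct sample, next) == 8, "ABI layout assumption");
    static_assert(sizeof(struct sample) == 16, "ABI layout assumption");

    int main(void) { return 0; }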
DWARF-style debugging, meanwhile, as I’ve written about before, opts out of hiding by instead operating at the meta-level. Debuggers consume machine-readable descriptions of the implementation details, generated by compilers. Again, like ABIs, this is motivated by something about independent implementations working together (but... wait).
The ABI example has simply dropped the existential quantification and left us with something we could call “propositional” i.e. a single, elaborated set of implementation decisions.
But the debugger’s approach goes further. It is the opposite of existential. It is a “universal” program! It consumes a description of how to view the concrete, abstractly. In principle it can recover the abstract view of any concretion, within some domain of discourse. It’s universal, not existential.
universal     .        abstraction
              ^         /   |   \
              ^        / /  |||  \ \
              ^     . . . . . . .     possible realisations -- recognise all!
Being universal and going backwards are important but neglected abilities for a robust and flexible infrastructure. They seem necessary for any solution to Licklider’s “communicating with aliens” problem (a name actually due to paraphrasing by Alan Kay, I suspect). Debuggers communicate with an alien language implementation.
(Admittedly, communication with debuggers is bootstrapped by a shared meta-level convention, such as DWARF. With true aliens this would not be realistic; we should expect to share almost nothing at all. Still, with debuggers we have eliminated the need to share base-level knowledge, i.e. knowledge of specific language implementations.)
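To give a flavour of what that meta-level convention carries, here is a hedged sketch (exact commands and output shapes vary by toolchain): compiling a small C file with debugging information makes the compiler describe, in DWARF records, the very layout decisions it would otherwise keep to itself.

    /* layout.c -- build with:   cc -g -c layout.c
     * Then something like   readelf --debug-dump=info layout.o   dumps the
     * DWARF debugging information entries (DIEs) describing the realisation
     * the compiler chose: roughly, a DW_TAG_structure_type for struct point
     * carrying DW_AT_byte_size, with DW_TAG_member children whose
     * DW_AT_data_member_location attributes give the byte offsets of x and y.
     * A debugger needs neither this source nor the compiler's internals;
     * the metadata alone lets it map concrete bytes back to "p.x" and "p.y". */
    struct point {
        int x;
        int y;
    };

    struct point origin = { 0, 0 };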
I’d go so far as to say the orthodox neglect of the universal and the backwards, in favour of the forwards and existential, is part of the reason why software is not a soft material. Programming artifacts remain brittle because they are balanced on propositions that are both fixed and hidden—opaque to other artifacts that would build on them, or interact with them. These ideals go back to the dream of programming as a purely mathematical entity. Traditional viewpoints aspire to software that is perfectly formed at all times, rather like a crystal “grown” by logical inference. We should “never need” to change it, because it’s perfect by definition! But push it out of its intended alignment, and it will shatter.
Contrast this with biological organisms, which are chock-full of mechanisms for probing and adapting to their environment. Almost no software is like this. (Notable exception: GNU autoconf. Stay with me on this.) Similarly, phenotropic computing remains a distant and unfamiliar idea. (The idea is Jaron Lanier’s, but Clayton Lewis’s analysis, linked, is the best introduction I know.)
As a field, “programming languages” research seems particularly averse to this “backwards” direction. Its orthodox approaches are wedded to existentialism, and pay no regard to universality. On floating the above concerns, of adaptability and flexibility, the usual reaction is: who needs all that “complexity”? Isn’t it all so “unnecessary”? Whether it be a debugger consuming large amounts of descriptive metadata, or autoconf performing large amounts of probing of the build environment, and so on, the urge is to eliminate it all. Can’t we instead just make everything really really uniform? Universality, or working backwards from an open space of concretions, feels like the wrong thing. We should “just” work forward from “the right thing” and be left with our beautiful crystalline product.
- With a logical hat on: classical logic’s insistence on monotonicity, i.e. a baked-in “forward progress” of reasoning (albeit “forward” in a slightly different sense to mine here), was explored in a much-neglected ECOOP 2011 paper by Klaus Ostermann and others.
“Modern” programming language implementations continue to adopt this existential, “enforced uniformity” spirit, to an increasing degree: not only by de-prioritising cross-language use cases, but also by trying to “own” more and more of the tooling. The expected scope of a single language implementation now includes tools for linking, build-scripting, debugging, profiling, package management, and possibly more. I put “modern” in scorn-quotes because although this approach brings some short-term wins, it is a losing trajectory: it embeds monoculture, drives up fragmentation and (in my view) works against innovation and quality in all these tools, viewed globally. (The latter is a property common to what seem like locally productive ways forward.)
Going “backwards” is not just about debugging. Consider link-time optimisation (LTO). How does it work? Conceptually, the compiler generates some .o files “as usual” and then a link-time optimiser comes in and somehow contextually optimises them further, in their particular combination. But doing exactly that would require “going backwards”: lifting the .o files back up to some kind of intermediate representation (IR) for the optimiser to work with. So how does it really work, at present? It substitutes going backwards with delaying going forwards. The .o files basically contain “fake object code” that will be discarded by the link-time optimiser. The same files also secretly contain IR, in compiler-specific additional sections, that will be picked up by the link-time optimiser (provided it belongs to the same compiler that generated the .o files). Rather than lifting, we do various contortions that let us instead delay lowering. Going backwards would feel like “doing things wrong”.
- The presence of this “fake data” can be very confusing to a developer who is debugging compiler problems. Also, the multiplicity of optimisation times, with their subtly different semantics around admissible optimisations, easily leads to subtle compiler bugs. In the era of GCC 5.x and 6.x, one of my main projects was broken by a GCC symbol interposition bug caused by the wrong logic being applied during link-time optimisations. I should mention that my mooted approach to LTO wouldn’t necessarily help with this problem (although it might).
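For anyone who wants to see these contortions first-hand, here is a hedged sketch using GCC; the flags and section names are GCC-specific and may differ across versions.

    /* lto_demo.c -- a trivial translation unit to inspect.
     *
     *     gcc -O2 -flto -ffat-lto-objects -c lto_demo.c
     *     objdump -h lto_demo.o
     *
     * With a GCC along these lines, the section listing shows both the usual
     * .text (the "fake" code path: usable by a plain link, discarded by an
     * LTO-aware one) and a family of .gnu.lto_* sections holding GIMPLE
     * bytecode that only GCC's own link-time machinery knows how to consume.
     * A "slim" LTO object (the default, without -ffat-lto-objects) drops the
     * real code path almost entirely: lowering is delayed, not reversed.   */
    int add(int a, int b)
    {
        return a + b;
    }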
Delayed lowering is easier to implement than lossless lifting. So maybe I am too quick to attribute to cultural sensibilities what could instead be attributed to expedience. But to finish this post, here are two anecdotes where I really did get pushback for aesthetic or even purportedly “scientific” reasons on the “backwards” and “universal” approaches. In one, I’m forbidden from saying who I was talking to (Chatham House rules) so I’ll keep it anonymous for both.
In conversation with a famous language implementer I floated the idea that to my mind, a clean approach to link-time optimisation would use debugging information to lift object code back to IR. I got a polite but insistent response that this was not the way. “Really?! Why not some intermediate representation? Surely that is the right thing?” I could only say that yes, I understand why monotonicity feels cleaner, and that it might seem like wasted work for the compiler to lower, only for the link-time optimiser to lift again. However, that work is not wasted because lifting object code (or its behaviour) up to a higher level is something that needs to work—for example, it is essential for debugging. Debugging should work well (currently it doesn’t). So should binary analysis: binaries should be tractable, and this requires exactly the “reverse” direction of abstraction. Moreover, we could do this much better if our tools and languages were to take a universal not existential approach: analysis tools should work with any binary from some wide universe of discourse, not just those produced by a toolchain which shares secret knowledge with the analysis tool.
Doing link-time optimisation in this way would be one more way of proving out the same ideas.
- Indeed we could imagine avoiding the lifting via less-fake .o files that contain only IR—if we could be sure a well-matched link-time optimiser would be available later to work with that IR. That makes such a solution less flexible than one that can work back from arbitrary object code and standard debugging information. However, in either case, the bottom line is that going backwards still needs to work!
- It’s no coincidence that similar observations lie behind the current DARPA E-BOSS programme that I’m peripherally involved with. It’s not practical (nor has it ever been) for deployed binaries to be opaque, “perfect” artifacts. The need for rolling analysis of security vulnerabilities is one motivation for this. Although a verified binary would be “better”, security is a leading edge and verification a trailing one.
In another conversation, with a different and also very well-known figure in programming language research, the adherence to existentialism was even more pronounced. I even had to explain that it’s not necessary to recompile your debugger when your code’s binary interface changes! I think that was just because I’d let my interlocutor get their thoughts tangled, from my advocacy of this “surprising” universal approach. But all this just makes my point. The existential approach is ingrained, while the universal one is unfamiliar at best or reviled at worst. Working “forwards” is the “right thing” and going “backwards” feels wrong.
One neighbouring-ish field where existentialism does not rule supreme is networking. We know that internetworking is a thing. We know that IP is, by design, “universal”. In fact it’s universal in two directions: we can embed IP datagrams into “any” underlying network’s framing abstraction (for some value of “any”), and we can encode (m)any higher-layer protocols into IP datagrams that describe them (at least nominally, through the “protocol” field). That is the hourglass design. IP is not without its limitations. It cannot encapsulate all alien protocols well, nor does it embed equally well into all underlying alien networks. But the valuable idea is that of supporting diversity, by supporting an open set of concretions rather than insisting on existential uniformity where a single implementation choice is hidden. In both cases there is an abstraction at work! But IP’s abstraction is radically different from a classical, existential approach.
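To put numbers on those two directions, here is a hedged C sketch using the standard IANA and IEEE assignments: a link-layer frame’s EtherType announces “an IP datagram follows”, and the IP header’s protocol field in turn names which of an open-ended registry of higher-layer protocols follows, without the network layer having to understand any of them.

    #include <stdint.h>
    #include <stdio.h>

    /* Downwards: a link layer announces "IP inside" via a registered code. */
    #define ETHERTYPE_IPV4 0x0800     /* IEEE-assigned EtherType for IPv4   */

    /* Upwards: the IP header's 8-bit protocol field names its payload,
     * drawn from an open registry rather than a single hidden choice.      */
    #define PROTO_ICMP 1              /* IANA protocol number for ICMP      */
    #define PROTO_TCP 6               /* IANA protocol number for TCP       */
    #define PROTO_UDP 17              /* IANA protocol number for UDP       */

    int main(void)
    {
        struct { const char *name; uint8_t proto; } carried[] = {
            { "ICMP", PROTO_ICMP }, { "TCP", PROTO_TCP }, { "UDP", PROTO_UDP },
        };
        /* IP only labels its payload; interpreting it is someone else's job. */
        for (unsigned i = 0; i < sizeof carried / sizeof carried[0]; i++)
            printf("EtherType 0x%04x -> IP -> protocol %u (%s)\n",
                   ETHERTYPE_IPV4, carried[i].proto, carried[i].name);
        return 0;
    }

A new upper-layer protocol simply takes a new number; nothing in the network layer itself has to change.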
The fact that such existential uniformity doesn’t scale—doesn’t scale to the human diversity of networks, applications and their rate of change—was the underlying motivation for the Internet and its famous hourglass architecture. Uniformity and “forward-only abstraction” eventually reach their social and cultural scaling limit. That’s why it’s past time that we thought seriously about the backwards and the universal, not least in the world of programming language implementation.
(To make things concrete: as one tiny indication of a less uniform, more universal approach to language implementation, witness this modified implementation of CPython, mostly the work of Guillaume Bertholon, which can interface with any compiled native code, up to available metadata. It’s a tiny step along a very long journey of course, and as yet only at the workshop stage. Meanwhile for a more leftfield take on universality in programming languages, also enabled by an unusual application of debugging information, there is my thesis work on the Cake language.)
It’s also not a coincidence that whereas “scale” is normally taken to mean scaling of homogeneous application deployments to ever larger hardware sizes, my use of it above is in a social sense. It should be obvious to anyone with perspective that software performance is now limited by human factors, primarily social and cultural, far more than it is limited by machine performance. So I’d argue that any performance argument against the potential overheads of approaches I’ve written about here—universality, backward reasoning and so on—should not be given much weight. I may write more about that in a future post.