Friday, 16 January 2026
This is a follow-up to my recent post on abstraction.
In it, I explain that abstraction is not an unalloyed good and that it also risks:
- Concentrating power
- Making us blind to diversity
- Software bloat
Ironically, that post was written mostly in the abstract. And, unless you wrestle with these decisions daily, it might not be obvious where those risks emerge.
It’s an awareness I’ve developed over time. And just recently, I found myself weighing the risks of an abstraction and choosing to avoid it. So, perhaps a concrete example will prove more informative?
I was considering the API to my tiny bloom filter library.
Commonly, these libraries encapsulate their data behind a public API. Here’s how the popular ‘bloom-filters’ package does it:
const {BloomFilter} = require('bloom-filters')
// create a Bloom Filter with a size of 10 and 4 hash functions
let filter = new BloomFilter(10, 4)
// insert data
filter.add('alice')
filter.add('bob')
You specify a size to the constructor and it allocates a suitable data array internally. But a common use for a bloom filter is to ship a filter built on one machine (which holds the full dataset) to another machine (which does not). This requires serializing the filter data.
But because the ‘bloom-filters’ package has abstracted away the raw data, the responsibility for import/export now rests on its shoulders. And, sure enough, they’ve added code to serialize and deserialize the bloom filter as JSON:
const serialized = filter.saveAsJSON();
const deserialized = BloomFilter.fromJSON(serialized);
Bloat +1. But what if JSON is not an appropriate format for our use case and we need a more compact encoding like Protobuf? Well, maybe we could ask the maintainer nicely to merge a patch? Centralised power +1.
Let’s consider an alternative: making the data array a parameter. Here’s how my tiny ‘bloom-filter’ library does it:
import BloomFilter from "./bloom-filter.js";
const data = new Uint8Array(120);
const hashes = 7;
const seed = 1;
const bloom = new BloomFilter(data, hashes, seed);
The internals of the bloom filter are now explicit arguments to the constructor. That’s more boilerplate but makes it immediately obvious how to export the data: just serialize the parameters.
// Serialize data
const serialized = data.toBase64();
// Recover bloom filter from serialized data
const deserialized = Uint8Array.fromBase64(serialized);
const bloom2 = new BloomFilter(deserialized, hashes, seed);
By not abstracting over the bloom filter’s storage, I can avoid adding import/export methods to the library. Bloat -1.
Another advantage is that I don’t have to make (possibly imperfect) decisions about encoding. Users of the library are free to choose any format that suits them without deferring to me. Self-reliance +1.
Let’s explore one more abstraction, this time a more nuanced one. Some libraries, like the ‘bloom-filters’ package, allow you to choose your own hash function:
const {BloomFilter, Hashing} = require('bloom-filters')
class CustomHashing extends Hashing {
  serialize(_element, _seed) {
    return BigInt(1)
  }
}
const bl = BloomFilter.create(2, 0.01)
// override just your structure locally
bl._hashing = new CustomHashing()
bl.add('a')
Using the abstract Hashing class, I could swap xxhash for murmurhash3 without asking for permission. Self-reliance +1, right? But it also increases the size and complexity of the library, so maybe not.
In considering this abstraction, I looked at the size and complexity of my library. By avoiding earlier abstractions I had been able to keep it very small. Just 39 lines of code.
Since most people are unlikely to swap the default hashing algorithm, it seems reasonable to ask the rare few who do to fork the tiny module and swap it themselves.
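To give a sense of how little core logic a filter like this needs, here is a hedged sketch of the whole thing, assuming the (data, hashes, seed) constructor above. The FNV-1a-style hash is my own stand-in for illustration, not the library's actual algorithm:

```javascript
// Illustrative sketch only, not the library's actual source.
class BloomFilter {
  constructor(data, hashes, seed) {
    this.data = data;     // Uint8Array backing the bit array
    this.hashes = hashes; // number of hash functions
    this.seed = seed;     // base seed, varied per hash function
  }

  // FNV-1a-style hash over the item's code points, returning a bit index.
  hash(item, i) {
    let h = (0x811c9dc5 ^ (this.seed + i)) >>> 0;
    for (const ch of item) {
      h = (h ^ ch.codePointAt(0)) >>> 0;
      h = Math.imul(h, 0x01000193) >>> 0;
    }
    return h % (this.data.length * 8);
  }

  add(item) {
    for (let i = 0; i < this.hashes; i++) {
      const bit = this.hash(item, i);
      this.data[bit >> 3] |= 1 << (bit & 7);
    }
  }

  has(item) {
    for (let i = 0; i < this.hashes; i++) {
      const bit = this.hash(item, i);
      if (!(this.data[bit >> 3] & (1 << (bit & 7)))) return false;
    }
    return true;
  }
}

const bloom = new BloomFilter(new Uint8Array(120), 7, 1);
bloom.add("alice");
```

Everything a user might want to change, the storage, the hash count, the seed, is already in their hands, so forking to swap the hash function is a small ask.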
This is the hunger of abstraction in evidence. By avoiding earlier abstractions, it became reasonable to avoid a later one. The authors of the ‘bloom-filters’ library were less able to make that choice, given the path they had followed.
It’s not really important whether you think I made the right choices here. All I want to point out is that choices were available. And in this case, a conscious decision was made to avoid further abstraction. If you design APIs, you will be presented with these trade-offs constantly.