Christoph Breitkopf: Interval Tables in Common Lisp

Recently, I've been getting back to parensful programming. I started with Scheme in the 1980s after reading SICP, but for most of my programming, I've preferred statically typed languages. However, for some reason, interacting with Lisp code always gives me that warm, fuzzy feeling, so in the intervening years, I sometimes tried to get back to Scheme, but was always put off by the fractured ecosystem and incompatibilities between implementations. I remember trying Common Lisp too, but the fact that it's a Lisp-2, coupled with the ugly #'function syntax, drove me away before I had a chance to see the positives.

But last time I had a strong urge to write Lisp, I just sat down to prototype something larger in Common Lisp, and parts of it started to click. I grew accustomed to the les...

But last time I had a strong urge to write Lisp, I just sat down to prototype something larger in Common Lisp, and parts of it started to click. I grew accustomed to the less-than-ideal aspects (quoting from the CLtL2 index: "kludges, 1-971") and began to appreciate the scope of the language, its type system, the quality and compatibility of implementations, and the surprisingly stable library ecosystem. I've been using Common Lisp regularly for about two years now, and I felt it's time to port some libraries I've been using in other languages.

So I started writing a Lisp version of my Haskell IntervalMap library. When writing the Haskell version, I started out with a simple API using a concrete type for intervals, and later added a version using type classes. For the functions to provide in addition to those for interval queries, the Data.Map API was a good guideline. (And a source of much work - it's a large API with almost 100 functions, even if many are just variants of others. And since Haskell also has Data.Set there are IntervalSets, too.)

Common Lisp does not have sorted collections in the standard, and there's no widely accepted library either. As for other tables, the standard has property lists, association lists, and hash tables. The first is rather specialized; association lists are, well, lists; so only hash tables could serve as inspiration for the API. In comparison to Haskell's Data.Map, Lisps hash table API is small - just about 10 functions. Unlike most data structures in Haskell, and the pure subset of Lisp lists, hash-tables are not persistent, but are mutated when adding, changing, or deleting elements. It seemed advisable, if only for efficiency reasons, to make the interval table API use destructive operations like hash tables, and perhaps later offer a persistent version as an alternative.

Efficiency considerations also played a role in the API design. In Haskell, there's a lower barrier to returning, say, a list of tuples from a function, because the assumption is that the compiler will transform intermediate data structures away. In practice, that's more often an unfulfilled hope than a realistic assumption, since it requires coding things in a certain way when producing the result and sufficient inlining, which is problematic given the recursive structure of binary trees. In Lisp, consing (Lisp slang for "allocating on the heap") intermediate data structures will most certainly not be optimized away, so the API should avoid that as far as possible. Instead, it takes a function argument that is called with each key-value pair.

The most important decision, however, was how to handle ordering. Common Lisp lacks comparison predicates that work across all comparable types. So there are two options: pass an ordering predicate to the table constructor, or use CLOS generic methods to implement the necessary operations on intervals, like the Interval type class in the Haskell version. Not having used CLOS extensively so far, I decided to start with the seemingly simpler and more functional-style predicate version. I might add an alternative CLOS-based API later on. Using just a single predicate leaves the question of how to get at the lower and upper bounds of the intervals themselves. No problem with CLOS - just add generic methods. The critical realization was to pass the lower and upper interval bounds as separate values, obviating the need for an actual interval type.

Taken together, this led to the following basic API, choosing names to avoid clashes with standard functions:

make-interval-table predicate [bounds-type]
interval-table-count table
get-interval lower-bound upper-bound [default] (setf-able, of course)
delete-interval lower-bound upper-bound table
clear-intervals table
map-intervals function table

But where are the interval-lookup functions, like containing and intersecting? They turned out to fit quite nicely into the map paradigm, since most of the time, you want to process their results further. Thus, map-intervals has a result-type parameter just like the standard map function, and keyword arguments like :containing or :intersecting for interval queries. And there are some functions related to the table being ordered, such as get-min, get-max, delete-min, ... There's certainly room for improvement, and some experimentation with the API would be good, which is why I have not yet requested addition to Quicklisp.

Here is the code and documentation on GitHub: https://github.com/bokesan/interval-tables

Similar Posts