Fluent Visitors: revisiting a classic design pattern

It’s been a while since I’ve written a pure programming post. I was recently implementing a specialist collection class that contained items of a number of different types. I needed to be able to iterate over the collection performing different actions depending on the specific type. There are lots of different ways to do this, depending on the school of programming you prefer. In this article, I’m going to take a look at a classic “Gang of Four” design pattern: The Visitor Pattern. I’ll describe how it works, provide some modern spins on it, and compare it to other ways of implementing the same functionality. Hopefully even the most die-hard anti-OO/patterns reader will come away thinki…

(Design Patterns? In this economy?)

A simple example

The example I’ll use in this post is a simple arithmetic expression language. It’s the kind of boring and not very realistic example you see all the time in textbooks, but the more realistic examples I have to hand have too many weird details, so this’ll do. I’m going to write everything in Java 25. Java because, after Smalltalk, it’s probably the language most associated with design patterns. And Java 25 specifically because it makes this example really nice to write.

OK, our expression language just has floating-point numbers, addition, and multiplication. So we start by defining datatypes to represent these:

public sealed interface Expression {
record Num(double value) implements Expression {}
record Add(Expression lhs, Expression rhs) implements Expression {}
record Mul(Expression lhs, Expression rhs) implements Expression {}
}

If you’re familiar with a functional programming language, this is effectively the same as a datatype definition like the following:

data Expression = Num Double
| Add Expression Expression
| Mul Expression Expression

Now we want to define a bunch of different operations over these expressions: evaluation, pretty-printing, maybe type-checking or some other kinds of static analysis. We could just directly expose the Expression sub-classes and let each operation directly traverse the structure using pattern matching. For example, we can add an eval method directly to the expression class that evaluates the expression:

default double eval() {
return switch (this) {
case Num(var value) -> value;
case Add(var lhs, var rhs) -> lhs.eval() + rhs.eval();
case Mul(var lhs, var rhs) -> lhs.eval() * rhs.eval();
};
}

(Incidentally, isn’t this great? It’s taken a long time, but I really like how clean this is in modern Java).

We can then try out an example:

static void main() {
var expr = new Add(
new Mul(new Num(3.14159), new Num(4)),
new Num(2.5));
System.out.printf("%s = %.2f%n", expr, expr.eval());
}

Which gives us:

Add[lhs=Mul[lhs=Num[value=3.14159], rhs=Num[value=4.0]], rhs=Num[value=2.5]] = 15.07

There are some issues with this though. Firstly, there’s no encapsulation. If we want to change the way expressions are represented then we have to change eval() and any other function that’s been defined in this way. Secondly, although it’s straightforward for this small expression language, there can be a lot of duplication in operations over a complex structure dealing with details of traversing that structure. The Visitor Pattern solves both of these issues, as we’ll show now.

The Basic Visitor

The basic Visitor Pattern involves creating an interface with callback methods for each type of object you might encounter when traversing a structure. For our example, it looks like the following:

interface Visitor<T> {
T num(Number value);
T add(T lhs, T rhs);
T mul(T lhs, T rhs);
}

A few things to note here:

We use a generic type parameter <T> to allow operations to return different types of results depending on what they do. We’ll see how this works in a bit.
In keeping with the idea of encapsulating details, we use the more abstract Number type rather than the concrete double type we’re using under the hood. (We could also have done this before, but I’m doing it here to illustrate that the Visitor interface doesn’t have to exactly represent the underlying data structures).

The next part of the pattern is to add an accept(Visitor) method to the Expression class, which then traverses the data structure invoking the callbacks as appropriate. In the traditional implementation, this method is implemented on each concrete sub-class using a technique known as “double-dispatch”. For example, we could add an implementation of accept to the Add class that calls visitor.add(lhs, rhs). This technique is still sometimes useful, but I find it’s often clearer to just inline all that into the top-level Expression implementation (as a default method implementation, because Expression is an interface):

default <T> T accept(Visitor<T> visitor) {
return switch (this) {
case Num(var value) -> visitor.num(value);
case Add(var lhs, var rhs) ->
visitor.add(lhs.accept(visitor), rhs.accept(visitor));
case Mul(var lhs, var rhs) ->
visitor.mul(lhs.accept(visitor), rhs.accept(visitor));
};
}

What’s going on here? Firstly, the method is parameterised to accept any type of return value. Again, we’ll see why in a moment. It then inspects the specific type of expression of this object and calls the appropriate callback on the visitor. Note that in the Add/Mul cases we also recursively visit the left-hand-side and right-hand-side expressions first, similarly to how we called .eval() on those in the earlier listing.

We can then re-implement our expression evaluator in terms of the visitor:

default double visitorEval() {
return accept(new Visitor<>() {
@Override
public Double num(Number value) {
return value.doubleValue();
}

@Override
public Double add(Double lhs, Double rhs) {
return lhs + rhs;
}

@Override
public Double mul(Double lhs, Double rhs) {
return lhs * rhs;
}
});
}

OK, that works. But it’s kinda ugly compared to what we had. Can we improve it? Yes, we can.

Fluent Visitors

The Visitor is really just a set of callback functions, one for each type of object in our data structure. Rather than defining these callbacks as an implementation of the interface, we could instead define them as three separate lambda functions. We can then invoke these instead:

class FluentVisitor<T> implements Visitor<T> {
private Function<Number, T> onNum;
private BiFunction<T, T, T> onAdd;
private BiFunction<T, T, T> onMul;

public FluentVisitor<T> onNum(Function<Number, T> f) {
onNum = f;
return this;
}

public FluentVisitor<T> onAdd(BiFunction<T, T, T> f) {
onAdd = f;
return this;
}

public FluentVisitor<T> onMul(BiFunction<T, T, T> f) {
onMul = f;
return this;
}

@Override
public T num(Number value) {
return onNum.apply(value);
}

@Override
public T add(T lhs, T rhs) {
return onAdd.apply(lhs, rhs);
}

@Override
public T mul(T lhs, T rhs) {
return onMul.apply(lhs, rhs);
}
}

// On Expression:
static <T> FluentVisitor<T> visitor() {
return new FluentVisitor<>();
}

We can then use this to reimplement our expression evaluator again:

default double fluentEval() {
return accept(Expression.<Double>visitor()
.onNum(Number::doubleValue)
.onAdd((a, b) -> a + b)
.onMul((a, b) -> a * b));
}

That’s a lot nicer to look at. We can then call it as before, and we can also use the fluent visitor to define operations on the fly, such as printing a nicer string representation:

static void main() {
var expr = new Add(
new Mul(new Num(3.14159), new Num(4)),
new Num(2.5));

var result = expr.fluentEval();
var str = expr.accept(visitor()
.onNum(value -> String.format("%.2f", value.doubleValue()))
.onAdd((a, b) -> String.format("(%s + %s)", a, b))
.onMul((a, b) -> String.format("(%s * %s)", a, b)));
System.out.printf("%s = %.2f%n", str, result);
}

There are some potential drawbacks to this approach, but overall I think it’s really clean and nice. One drawback is that you lose compile-time checking that all the cases have been handled: if you forget to register one of the callbacks you’ll get a runtime NullPointerException instead. There are ways around this, such as using multiple FluentVisitor types that incrementally construct the callbacks, but that’s more work:

record FluentVisitor0<T>() {
FluentVisitor1<T> onNum(Function<Number, T> f) {
return new FluentVisitor1<>(f);
}
}
record FluentVisitor1<T>(Function<Number, T> onNum) {
FluentVisitor2<T> onAdd(...) ...
}

That ensures that every callback has to be provided before you can call visit, at the cost of needing many more classes. This is the sort of thing where good IDE support would really help (IntelliJ plugin anyone?).

Another easy-to-fix nit is that, if you don’t care about the result, it is easy to forget to call visit and thus not actually do anything at all. This can be fixed by changing the visitor method to accept a function rather than returning a FluentVisitor:

default <T> T visitor(Function<FluentVisitor<T>, FluentVisitor<T>> builder) {
return accept(builder.apply(new FluentVisitor<>()));
}

Benefits of encapsulation

The encapsulation that the Visitor provides allows us to quite radically change the underlying representation, while still preserving the same logical view of the data. For example, here is an alternative implementation that works only on positive integers and stores expressions in a compact reverse Polish notation (RPN). The exact same visitors we defined for the previous expression evaluator will also work for this one:

public record RpnExpression(int... expr) {
static final int ADD = -1;
static final int MUL = -2;

public <T> T accept(Expression.Visitor<T> visitor) {
var stack = new Stack<T>();
for (int op : expr) {
switch (op) {
case ADD -> stack.push(visitor.add(stack.pop(), stack.pop()));
case MUL -> stack.push(visitor.mul(stack.pop(), stack.pop()));
default -> stack.push(visitor.num(op));
}
}
return stack.pop();
}

static void main() {
var rpn = new RpnExpression(3, 4, MUL, 2, ADD);
var result = rpn.accept(Expression.<Double>visitor()
.onNum(num -> num.doubleValue())
.onAdd((a, b) -> a + b)
.onMul((a, b) -> a * b));
System.out.printf("%s = %.2f%n", rpn, result);
}
}

A functional programming view on the humble Visitor

Hopefully this article has shown you that there is still something interesting about old patterns like the Visitor, especially if you adapt them a bit to modern programming idioms. I often hear fans of functional programming stating that the Visitor pattern only exists to make up for the lack of pattern matching in OO languages like Java. In my opinion, this is the wrong way to think about things. Even when you have pattern matching (as Java now does) the Visitor pattern is still useful due to the increased encapsulation it provides, hiding details of the underlying representation.

The correct way to think about the Visitor pattern is as a natural generalisation of the reduce/fold operation common in functional programming languages. Consider the following (imperative) implementation of a left-fold operation over a list:

static <T> T reduce(List<T> xs, T z, BiFunction<T, T, T> f) {
for (var x : xs) {
z = f.apply(z, x);
}
return z;
}

static void main() {
var xs = List.of(1, 2, 3, 4);
var sum = reduce(xs, 0, Integer::sum);
}

We can think of a linked list as a data structure with two constructors: Nil (the empty list), and Cons(List, List). In this case, the reduce operation is essentially a Visitor pattern where z corresponds to the onNil case and f corresponds to the onCons case.

So, far from being a poor man’s pattern matching, the true essence of the Visitor is a generalised fold operation, which is why it’s so useful. Maybe this old dog still has some nice tricks, eh?

A simple example

The Basic Visitor

Fluent Visitors

Benefits of encapsulation

A functional programming view on the humble Visitor

Similar Posts