The Easiest Way to Build a...
The Easiest Way to Build a Type Checker
Type checkers are a piece of software that feel incredibly simple, yet incredibly complex. Seeing Hindley-Milner written in a logic programming language is almost magical, but it never helped me understand how it was implemented. Nor does actually trying to read anything about Algorithm W or any academic paper explaining a type system. But thanks to David Christiansen, I have discovered a setup for type checking that is so conceptually simple it demystified the whole thing for me. It goes by the name Bidirectional Type Checking.
Bidirectional Type Checking
The two directions in this type checker are inferring types and checking types. Unlike Hindley-Milner, we do need some type annotations, but these are typically at function definitions. So code like the sillyExample below is completely valid and fully type checks despite lacking annotations. How far can we take this? I'm not a type theory person. Reading papers in type theory takes me a while, and my comprehension is always lacking, but this paper seems like a good starting point for answering that question.
function sillyExample(x: number): number {
let a = 10;
let b = 20;
let e = a;
let f = b;
let q = a + e;
let g = "hello";
let h = "world";
let i = 100 + q;
return x;
}
So, how do we actually create a bidirectional type checker? I think the easiest way to understand it is to see a full working implementation. So that's what I have below for a very simple language. To understand it, start by looking at the types to figure out what the language supports, then look at each of the infer cases. But don't worry, if it doesn't make sense, I will explain in more detail below.
export type Type =
| { kind: "number" }
| { kind: "string" }
| { kind: "function"; arg: Type; returnType: Type };
export type Expr =
| { kind: "number"; value: number }
| { kind: "string"; value: string }
| { kind: "varLookup"; name: string }
| { kind: "function"; param: string; body: Expr }
| { kind: "call"; fn: Expr; arg: Expr }
| { kind: "let"; name: string; value: Expr; type?: Type }
| { kind: "block"; statements: Expr[]; return: Expr };
export type Context = Map<string, Type>;
export function infer(ctx: Context, expr: Expr): Type {
switch (expr.kind) {
case "number":
return { kind: "number" };
<span class="token" style="color:#859900">case</span> <span class="token" style="color:#2aa198">"string"</span><span class="token" style="color:#586e75">:</span>
<span class="token" style="color:#859900">return</span> <span class="token" style="color:#586e75">{</span> kind<span class="token" style="color:#586e75">:</span> <span class="token" style="color:#2aa198">"string"</span> <span class="token" style="color:#586e75">}</span><span class="token" style="color:#586e75">;</span>
<span class="token" style="color:#859900">case</span> <span class="token" style="color:#2aa198">"varLookup"</span><span class="token" style="color:#586e75">:</span>
<span class="token" style="color:#859900">const</span> <span class="token" style="color:#859900">type</span> <span class="token" style="color:#cb4b16">=</span> ctx<span class="token" style="color:#586e75">.</span><span class="token" style="color:#859900">get</span><span class="token" style="color:#586e75">(</span>expr<span class="token" style="color:#586e75">.</span>name<span class="token" style="color:#586e75">)</span><span class="token" style="color:#586e75">;</span>
<span class="token" style="color:#859900">if</span> <span class="token" style="color:#586e75">(</span><span class="token" style="color:#cb4b16">!</span><span class="token" style="color:#859900">type</span><span class="token" style="color:#586e75">)</span> <span class="token" style="color:#586e75">{</span>
<span class="token" style="color:#859900">throw</span> <span class="token" style="color:#859900">new</span> <span class="token class-name">Error</span><span class="token" style="color:#586e75">(</span><span class="token template-string"><span class="token template-punctuation" style="color:#2aa198">`</span><span class="token" style="color:#2aa198">Unbound variable: </span><span class="token interpolation"><span class="token interpolation-punctuation" style="color:#586e75">${</span>expr<span class="token" style="color:#586e75">.</span>name<span class="token interpolation-punctuation" style="color:#586e75">}</span></span><span class="token template-punctuation" style="color:#2aa198">`</span></span><span class="token" style="color:#586e75">)</span><span class="token" style="color:#586e75">;</span>
<span class="token" style="color:#586e75">}</span>
<span class="token" style="color:#859900">return</span> <span class="token" style="color:#859900">type</span><span class="token" style="color:#586e75">;</span>
<span class="token" style="color:#859900">case</span> <span class="token" style="color:#2aa198">"call"</span><span class="token" style="color:#586e75">:</span>
<span class="token" style="color:#859900">const</span> fnType <span class="token" style="color:#cb4b16">=</span> <span class="token" style="color:#b58900">infer</span><span class="token" style="color:#586e75">(</span>ctx<span class="token" style="color:#586e75">,</span> expr<span class="token" style="color:#586e75">.</span>fn<span class="token" style="color:#586e75">)</span><span class="token" style="color:#586e75">;</span>
<span class="token" style="color:#859900">if</span> <span class="token" style="color:#586e75">(</span>fnType<span class="token" style="color:#586e75">.</span>kind <span class="token" style="color:#cb4b16">!==</span> <span class="token" style="color:#2aa198">"function"</span><span class="token" style="color:#586e75">)</span> <span class="token" style="color:#586e75">{</span>
<span class="token" style="color:#859900">throw</span> <span class="token" style="color:#859900">new</span> <span class="token class-name">Error</span><span class="token" style="color:#586e75">(</span><span class="token" style="color:#2aa198">"Cannot call non-function"</span><span class="token" style="color:#586e75">)</span><span class="token" style="color:#586e75">;</span>
<span class="token" style="color:#586e75">}</span>
<span class="token" style="color:#b58900">check</span><span class="token" style="color:#586e75">(</span>ctx<span class="token" style="color:#586e75">,</span> expr<span class="token" style="color:#586e75">.</span>arg<span class="token" style="color:#586e75">,</span> fnType<span class="token" style="color:#586e75">.</span>arg<span class="token" style="color:#586e75">)</span><span class="token" style="color:#586e75">;</span>
<span class="token" style="color:#859900">return</span> fnType<span class="token" style="color:#586e75">.</span>returnType<span class="token" style="color:#586e75">;</span>
<span class="token" style="color:#859900">case</span> <span class="token" style="color:#2aa198">"function"</span><span class="token" style="color:#586e75">:</span>
<span class="token" style="color:#859900">throw</span> <span class="token" style="color:#859900">new</span> <span class="token class-name">Error</span><span class="token" style="color:#586e75">(</span><span class="token" style="color:#2aa198">"Cannot infer type for function without annotation"</span><span class="token" style="color:#586e75">)</span><span class="token" style="color:#586e75">;</span>
<span class="token" style="color:#859900">case</span> <span class="token" style="color:#2aa198">"let"</span><span class="token" style="color:#586e75">:</span>
<span class="token" style="color:#859900">const</span> valueType <span class="token" style="color:#cb4b16">=</span> <span class="token" style="color:#b58900">infer</span><span class="token" style="color:#586e75">(</span>ctx<span class="token" style="color:#586e75">,</span> expr<span class="token" style="color:#586e75">.</span>value<span class="token" style="color:#586e75">)</span><span class="token" style="color:#586e75">;</span>
<span class="token" style="color:#859900">if</span> <span class="token" style="color:#586e75">(</span>expr<span class="token" style="color:#586e75">.</span><span class="token" style="color:#859900">type</span><span class="token" style="color:#586e75">)</span> <span class="token" style="color:#586e75">{</span>
<span class="token" style="color:#859900">if</span> <span class="token" style="color:#586e75">(</span><span class="token" style="color:#cb4b16">!</span><span class="token" style="color:#b58900">typesEqual</span><span class="token" style="color:#586e75">(</span>valueType<span class="token" style="color:#586e75">,</span> expr<span class="token" style="color:#586e75">.</span><span class="token" style="color:#859900">type</span><span class="token" style="color:#586e75">)</span><span class="token" style="color:#586e75">)</span> <span class="token" style="color:#586e75">{</span>
<span class="token" style="color:#859900">let</span> expected <span class="token" style="color:#cb4b16">=</span> <span class="token" style="color:#268bd2">JSON</span><span class="token" style="color:#586e75">.</span><span class="token" style="color:#b58900">stringify</span><span class="token" style="color:#586e75">(</span>expr<span class="token" style="color:#586e75">.</span><span class="token" style="color:#859900">type</span><span class="token" style="color:#586e75">)</span><span class="token" style="color:#586e75">;</span>
<span class="token" style="color:#859900">let</span> actual <span class="token" style="color:#cb4b16">=</span> <span class="token" style="color:#268bd2">JSON</span><span class="token" style="color:#586e75">.</span><span class="token" style="color:#b58900">stringify</span><span class="token" style="color:#586e75">(</span>valueType<span class="token" style="color:#586e75">)</span><span class="token" style="color:#586e75">;</span>
<span class="token" style="color:#859900">throw</span> <span class="token" style="color:#859900">new</span> <span class="token class-name">Error</span><span class="token" style="color:#586e75">(</span><span class="token template-string"><span class="token template-punctuation" style="color:#2aa198">`</span><span class="token" style="color:#2aa198">expected </span><span class="token interpolation"><span class="token interpolation-punctuation" style="color:#586e75">${</span>expected<span class="token interpolation-punctuation" style="color:#586e75">}</span></span><span class="token" style="color:#2aa198">, got </span><span class="token interpolation"><span class="token interpolation-punctuation" style="color:#586e75">${</span>actual<span class="token interpolation-punctuation" style="color:#586e75">}</span></span><span class="token template-punctuation" style="color:#2aa198">`</span></span><span class="token" style="color:#586e75">)</span><span class="token" style="color:#586e75">;</span>
<span class="token" style="color:#586e75">}</span>
<span class="token" style="color:#586e75">}</span>
ctx<span class="token" style="color:#586e75">.</span><span class="token" style="color:#859900">set</span><span class="token" style="color:#586e75">(</span>expr<span class="token" style="color:#586e75">.</span>name<span class="token" style="color:#586e75">,</span> valueType<span class="token" style="color:#586e75">)</span><span class="token" style="color:#586e75">;</span>
<span class="token" style="color:#859900">return</span> valueType<span class="token" style="color:#586e75">;</span>
<span class="token" style="color:#859900">case</span> <span class="token" style="color:#2aa198">"block"</span><span class="token" style="color:#586e75">:</span>
<span class="token" style="color:#859900">let</span> blockCtx <span class="token" style="color:#cb4b16">=</span> <span class="token" style="color:#859900">new</span> <span class="token class-name">Map</span><span class="token" style="color:#586e75">(</span>ctx<span class="token" style="color:#586e75">)</span><span class="token" style="color:#586e75">;</span>
<span class="token" style="color:#859900">for</span> <span class="token" style="color:#586e75">(</span><span class="token" style="color:#859900">const</span> stmt <span class="token" style="color:#859900">of</span> expr<span class="token" style="color:#586e75">.</span>statements<span class="token" style="color:#586e75">)</span> <span class="token" style="color:#586e75">{</span>
<span class="token" style="color:#b58900">infer</span><span class="token" style="color:#586e75">(</span>blockCtx<span class="token" style="color:#586e75">,</span> stmt<span class="token" style="color:#586e75">)</span><span class="token" style="color:#586e75">;</span>
<span class="token" style="color:#586e75">}</span>
<span class="token" style="color:#859900">return</span> <span class="token" style="color:#b58900">infer</span><span class="token" style="color:#586e75">(</span>blockCtx<span class="token" style="color:#586e75">,</span> expr<span class="token" style="color:#586e75">.</span><span class="token" style="color:#859900">return</span><span class="token" style="color:#586e75">)</span><span class="token" style="color:#586e75">;</span>
}
}
export function check(ctx: Context, expr: Expr, expected: Type): void {
switch (expr.kind) {
case "function":
if (expected.kind !== "function") {
throw new Error("Function must have function type");
}
const newCtx = new Map(ctx);
newCtx.set(expr.param, expected.arg);
check(newCtx, expr.body, expected.returnType);
break;
<span class="token" style="color:#859900">case</span> <span class="token" style="color:#2aa198">"block"</span><span class="token" style="color:#586e75">:</span>
<span class="token" style="color:#859900">let</span> blockCtx <span class="token" style="color:#cb4b16">=</span> <span class="token" style="color:#859900">new</span> <span class="token class-name">Map</span><span class="token" style="color:#586e75">(</span>ctx<span class="token" style="color:#586e75">)</span><span class="token" style="color:#586e75">;</span>
<span class="token" style="color:#859900">for</span> <span class="token" style="color:#586e75">(</span><span class="token" style="color:#859900">const</span> stmt <span class="token" style="color:#859900">of</span> expr<span class="token" style="color:#586e75">.</span>statements<span class="token" style="color:#586e75">)</span> <span class="token" style="color:#586e75">{</span>
<span class="token" style="color:#b58900">infer</span><span class="token" style="color:#586e75">(</span>blockCtx<span class="token" style="color:#586e75">,</span> stmt<span class="token" style="color:#586e75">)</span><span class="token" style="color:#586e75">;</span>
<span class="token" style="color:#586e75">}</span>
<span class="token" style="color:#b58900">check</span><span class="token" style="color:#586e75">(</span>blockCtx<span class="token" style="color:#586e75">,</span> expr<span class="token" style="color:#586e75">.</span><span class="token" style="color:#859900">return</span><span class="token" style="color:#586e75">,</span> expected<span class="token" style="color:#586e75">)</span><span class="token" style="color:#586e75">;</span>
<span class="token" style="color:#859900">break</span><span class="token" style="color:#586e75">;</span>
<span class="token" style="color:#859900">default</span><span class="token" style="color:#586e75">:</span>
<span class="token" style="color:#859900">const</span> actual <span class="token" style="color:#cb4b16">=</span> <span class="token" style="color:#b58900">infer</span><span class="token" style="color:#586e75">(</span>ctx<span class="token" style="color:#586e75">,</span> expr<span class="token" style="color:#586e75">)</span><span class="token" style="color:#586e75">;</span>
<span class="token" style="color:#859900">if</span> <span class="token" style="color:#586e75">(</span><span class="token" style="color:#cb4b16">!</span><span class="token" style="color:#b58900">typesEqual</span><span class="token" style="color:#586e75">(</span>actual<span class="token" style="color:#586e75">,</span> expected<span class="token" style="color:#586e75">)</span><span class="token" style="color:#586e75">)</span> <span class="token" style="color:#586e75">{</span>
<span class="token" style="color:#859900">throw</span> <span class="token" style="color:#859900">new</span> <span class="token class-name">Error</span><span class="token" style="color:#586e75">(</span><span class="token template-string"><span class="token template-punctuation" style="color:#2aa198">`</span><span class="token" style="color:#2aa198">Type mismatch: expected </span><span class="token interpolation"><span class="token interpolation-punctuation" style="color:#586e75">${</span>expected<span class="token interpolation-punctuation" style="color:#586e75">}</span></span><span class="token" style="color:#2aa198">, got </span><span class="token interpolation"><span class="token interpolation-punctuation" style="color:#586e75">${</span>actual<span class="token interpolation-punctuation" style="color:#586e75">}</span></span><span class="token template-punctuation" style="color:#2aa198">`</span></span><span class="token" style="color:#586e75">)</span><span class="token" style="color:#586e75">;</span>
<span class="token" style="color:#586e75">}</span>
}
}
export function typesEqual(a: Type, b: Type): boolean {
if (a.kind !== b.kind) return false;
if (a.kind === "function" && b.kind === "function") {
return typesEqual(a.arg, b.arg) && typesEqual(a.returnType, b.returnType);
}
return true;
}
Here we have, in ~100 lines, a fully functional type checker for a small language. Is it without flaw? Is it feature complete? Not at all. In a real type checker, you might not want to know only if something typechecks, but you might want to decorate the various parts with their type; we don't do that here. We don't do a lot of things. But I've found that this tiny bit of code is enough to start extending to much larger, more complicated code examples.
Explanation
If you aren't super familiar with the implementation of programming languages, some of this code might strike you as a bit odd, so let me very quickly walk through the implementation. First, we have our data structures for representing our code:
export type Type =
| { kind: 'number' }
| { kind: 'string' }
| { kind: 'function', arg: Type, returnType: Type }
export type Expr =
| { kind: 'number', value: number }
| { kind: 'string', value: string }
| { kind: 'varLookup', name: string }
| { kind: 'function', param: string, body: Expr }
| { kind: 'call', fn: Expr, arg: Expr }
| { kind: 'let', name: string, value: Expr, type?: Type }
| { kind: 'block', statements: Expr[], return: Expr }
Using this data structure, we can write code in a way that is much easier to work with than the actual string that we use to represent code. This kind of structure is called an "abstract syntax tree". For example
// double(5)
{
kind: 'call',
fn: { kind: 'varLookup', name: 'double' },
arg: { kind: 'number', value: 5 }
}
This structure makes it easy to walk through our program and check things bit by bit.
Context
export type Context = Map<string, Type>
This simple line of code is the key to how all variables, all functions, etc, work. When we enter a function or a block, we make a new Map that will let us hold the local variables and their types. We pass this map around, and now we know the types of things that came before it. If we wanted to let you define functions out of order, we'd simply need to do two passes over the tree. The first to gather up the top-level functions, and the next to type-check the whole program. (This code gets more complicated with nested function definitions, but we'll ignore that here.)
Inference
Each little bit of infer may seem a bit trivial. So, to explain it, let's add a new feature, addition.
// add this into our Expr type
| { kind: 'add', left: Expr, right: Expr }
Now we have something just a bit more complicated, so how would we write our inference for this? Well, we are going to do the simple case; we are only allowed to add numbers together. Given that our code would look something like this:
case 'add':
const leftType = check(ctx, expr.left, {kind: "number"})
const rightType = check(ctx, expr.right,<