Patterns All the Way Down: A Generalization for Graph-Like Things

Or: how trying to “say something” about paths led to a recursive data structure that subsumes graphs entirely

“The limits of my language mean the limits of my world.” Wittgenstein wrote that about natural language, but it applies just as well to data notation. For years, I’ve been bumping into the limits of how we talk about graphs.

The problem with paths

Property graphs are elegant. A node “says something” about itself — labels, properties, a little bundle of information. A relationship “says something” about two nodes — direction, type, properties again. Clean. But paths?

Paths only exist at runtime. They’re ephemeral query results, not first-class citizens. You can’t annotate a path. You can’t give it an identity. You can’t say “this sequence of hops through my data represents this particular concept” and have that meaning persist.

I wanted to explore that.

From GEOFF to gram

The starting point was GEOFF, a clever notation by Nigel Small (author of py2neo) that extended Cypher’s visual syntax for data interchange. GEOFF got me thinking about the shape of graph notation — those parentheses for nodes, arrows for relationships, the visual grammar we’d inherited from ASCII art.

My first instinct was to wrap a path in square brackets. Separate the “what this path is” from “what this path contains”:

[ myPath:ImportantRoute | (a)-[:CONNECTS]->(b)-[:CONNECTS]->(c)]

The left side names and describes the path. The right side (after the `|`) lists its elements. Simple enough.

Then I realized the obvious: nodes and relationships are also paths. A node is a path with zero hops. A relationship is a path with exactly one hop. The notation generalizes:

[ a:Person {name: "Alice"} ]   // node
[ r1:KNOWS | a, b ]            // relationship
[ route:Path | r1, r2, r3 ]    // path

Same syntax, different arities. The brackets contain a subject (the thing’s identity, labels, properties) and elements (what it contains).

Generalizing too far (or not far enough?)

What happens when elements are mixed? Or nested? Or when the “path-ness” breaks down? Are these even paths any more?

I did what any reasonable person would do: I generalized further. (This is almost always a mistake. Except when it isn’t.)

Drop the path semantics entirely. What remains is a recursive structure that I’ve called a Pattern:

type Pattern<V> = {
  value: V;
  elements: Pattern<V>[];
}

A pattern pairs a value with a sequence of element patterns. That’s it. No constraints on what elements mean, how many there are, or how they relate. The structure is pure — interpretation comes later.
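To make the "same syntax, different arities" idea concrete, here is a minimal sketch built directly on the type above. The helpers `pattern`, `arity`, and `depth` are hypothetical names chosen for illustration; they are not taken from gram-hs or gram-rs.

```typescript
// The Pattern type from the text: a value plus an ordered list of element patterns.
type Pattern<V> = {
  value: V;
  elements: Pattern<V>[];
};

// Hypothetical helper: build a pattern from a value and zero or more elements.
function pattern<V>(value: V, ...elements: Pattern<V>[]): Pattern<V> {
  return { value, elements };
}

// Number of direct elements. Nothing in the structure says what this means;
// reading 0 as "node" and 2 as "relationship" is an interpretation.
function arity<V>(p: Pattern<V>): number {
  return p.elements.length;
}

// Nesting depth: a pattern with no elements has depth 0.
function depth<V>(p: Pattern<V>): number {
  return p.elements.reduce<number>((d, e) => Math.max(d, 1 + depth(e)), 0);
}

const a = pattern<string>("a");
const b = pattern<string>("b");
const c = pattern<string>("c");
const r1 = pattern<string>("r1", a, b);      // node-like elements, so relationship-like
const r2 = pattern<string>("r2", b, c);
const route = pattern<string>("route", r1, r2); // elements are themselves patterns

console.log(arity(a), arity(r1), arity(route)); // 0 2 2
console.log(depth(a), depth(r1), depth(route)); // 0 1 2
```

Note that `r1` and `route` have the same arity; only the nesting depth differs. The structure alone does not distinguish a relationship from a path.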

For graph work, V becomes Subject: an identifier, labels, and a property record. This gives us Pattern<Subject>, which serializes to gram (lowercase, like json).

Gram is the notation; Pattern<Subject> is the data structure.
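As a rough illustration of that split, the sketch below assumes one possible shape for `Subject` (the field names `id`, `labels`, and `properties` are guesses based on the description above) and a toy `render` function that prints the bracket form used earlier in this post. It is not the real gram serializer and does not cover the full notation.

```typescript
type Pattern<V> = { value: V; elements: Pattern<V>[] };

// Assumed shape: an identifier, labels, and a property record.
type Subject = {
  id: string;
  labels: string[];
  properties: Record<string, string | number | boolean>;
};

// Render the subject part: identifier, then :Label for each label, then {properties}.
function renderSubject(s: Subject): string {
  const labels = s.labels.map((l) => ":" + l).join("");
  const props = Object.keys(s.properties)
    .map((k) => k + ": " + JSON.stringify(s.properties[k]))
    .join(", ");
  return s.id + labels + (props ? " {" + props + "}" : "");
}

// Toy renderer: elements are listed by identifier after the `|`, as in the examples above.
function render(p: Pattern<Subject>): string {
  const head = renderSubject(p.value);
  const refs = p.elements.map((e) => e.value.id).join(", ");
  return refs ? "[ " + head + " | " + refs + " ]" : "[ " + head + " ]";
}

const alice: Pattern<Subject> = {
  value: { id: "a", labels: ["Person"], properties: { name: "Alice" } },
  elements: [],
};
const bob: Pattern<Subject> = {
  value: { id: "b", labels: ["Person"], properties: {} },
  elements: [],
};
const knows: Pattern<Subject> = {
  value: { id: "r1", labels: ["KNOWS"], properties: {} },
  elements: [alice, bob],
};

console.log(render(alice)); // [ a:Person {name: "Alice"} ]
console.log(render(knows)); // [ r1:KNOWS | a, b ]
```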

OK, cool. But why?

A reasonable question. We have property graphs. We have RDF. We have adjacency lists and edge lists and incidence matrices. Why another representation?

The answer, for me, is interpretation over enforcement.

Property graphs enforce graph semantics at the structural level. A relationship must connect two nodes. Patterns don’t enforce anything. You can write something that looks like a graph, then view it as a graph through an interpretation layer — a predicate that says “these patterns are nodes, therefore those are edges.” Change the predicate, get a different graph. Or don’t interpret it as a graph at all.
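One way to picture an interpretation layer is a function that takes a flat list of patterns plus a node predicate and returns a graph view. The sketch below is an assumption about how such a layer could look; `viewAsGraph`, `GraphView`, and the rule "a non-node with exactly two node elements is an edge" are all invented here for illustration.

```typescript
type Pattern<V> = { value: V; elements: Pattern<V>[] };
type Subject = { id: string; labels: string[] };

type GraphView = { nodes: string[]; edges: [string, string][] };

// Hypothetical interpretation layer: `isNode` decides which patterns are nodes.
// Any other pattern whose two elements are both nodes is then read as an edge.
function viewAsGraph(
  patterns: Pattern<Subject>[],
  isNode: (p: Pattern<Subject>) => boolean
): GraphView {
  const nodes = patterns.filter((p) => isNode(p)).map((p) => p.value.id);
  const edges = patterns
    .filter(
      (p) =>
        !isNode(p) &&
        p.elements.length === 2 &&
        p.elements.every((e) => isNode(e))
    )
    .map((p) => [p.elements[0].value.id, p.elements[1].value.id] as [string, string]);
  return { nodes, edges };
}

const leaf = (id: string, label: string): Pattern<Subject> => ({
  value: { id, labels: [label] },
  elements: [],
});

const alice = leaf("a", "Person");
const bob = leaf("b", "Person");
const acme = leaf("acme", "Company");
const knows: Pattern<Subject> = { value: { id: "r1", labels: ["KNOWS"] }, elements: [alice, bob] };
const worksAt: Pattern<Subject> = { value: { id: "r2", labels: ["WORKS_AT"] }, elements: [alice, acme] };
const all = [alice, bob, acme, knows, worksAt];

// Interpretation 1: every pattern without elements is a node.
const everything = viewAsGraph(all, (p) => p.elements.length === 0);
// nodes: a, b, acme; edges: a->b, a->acme

// Interpretation 2: only Person patterns are nodes, so the WORKS_AT edge disappears.
const peopleOnly = viewAsGraph(
  all,
  (p) => p.elements.length === 0 && p.value.labels.indexOf("Person") !== -1
);
// nodes: a, b; edges: a->b
```

The data never changes between the two calls; only the predicate does.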

Even better, with a uniform data structure for representing any graph element and more, you can apply the usual functional-programming abstractions (functors, monads, comonads, and more) to work with the data. I’ve found myself using patterns+gram for things that aren’t obviously graphs. The notation is uniform; the semantics are contextual.
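Here is a small sketch of what those abstractions can look like over `Pattern<V>`. The function names (`mapPattern`, `foldPattern`, `extendPattern`) are made up for this example and are not the gram-hs API; they show a functor-style map, a pre-order fold, and a comonad-style extend.

```typescript
type Pattern<V> = { value: V; elements: Pattern<V>[] };

// Functor-style map: transform every value, keep the shape.
function mapPattern<A, B>(p: Pattern<A>, f: (a: A) => B): Pattern<B> {
  return { value: f(p.value), elements: p.elements.map((e) => mapPattern(e, f)) };
}

// Pre-order fold: visit the value first, then each element left to right.
function foldPattern<A, R>(p: Pattern<A>, f: (acc: R, a: A) => R, init: R): R {
  return p.elements.reduce<R>((acc, e) => foldPattern(e, f, acc), f(init, p.value));
}

// Comonad-style extend: each value is replaced by a summary of the
// whole sub-pattern rooted at that position.
function extendPattern<A, B>(p: Pattern<A>, f: (sub: Pattern<A>) => B): Pattern<B> {
  return { value: f(p), elements: p.elements.map((e) => extendPattern(e, f)) };
}

const leaf = (v: string): Pattern<string> => ({ value: v, elements: [] });
const route: Pattern<string> = {
  value: "route",
  elements: [
    { value: "r1", elements: [leaf("a"), leaf("b")] },
    { value: "r2", elements: [leaf("b"), leaf("c")] },
  ],
};

const shouted = mapPattern(route, (v) => v.toUpperCase());
const values = foldPattern<string, string[]>(route, (acc, v) => acc.concat(v), []);
const sizes = extendPattern(route, (sub) => foldPattern<string, number>(sub, (n) => n + 1, 0));

console.log(shouted.value);        // ROUTE
console.log(values.join(" "));     // route r1 a b r2 b c
console.log(sizes.value);          // 7
console.log(sizes.elements[0].value); // 3
```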

Where it’s going

Gram started as a grammar: tree-sitter-gram. Then I built a Haskell implementation, gram-hs, that added the Pattern<V> data structure to get the semantics right (and to convince Cursor to write in a functional style). That’s been ported to Rust as gram-rs for practical deployment.

I started tinkering with an agentic framework called pattern-agent but got stuck at the tool boundary. If you squint, patterns resemble s-expressions, so of course that led to pattern-lisp: a tiny embeddable Lisp optimized for working with patterns. This frontier feeds back into the core gram-hs.

It’s patterns all the way down, and this seems like a useful depth at which to pop back up.

What’s the right pattern?

The notation is stable. The core libraries exist in Haskell and Rust. Pattern-lisp is taking shape. But there are quite a few open questions:

Compatibility with property graphs. I work at Neo4j, so of course I want to bring it all back home. Gram can represent property graphs, and property graphs can be viewed through gram. But is that relationship useful in practice? Can gram serve as an interchange format, or a higher-level abstraction layer, or does it just add complexity without benefit?

Scale. Everything I’ve described works fine with file-sized data in a single-user environment. What happens at large scale? Patterns are trees, and trees have known scaling characteristics — but the graph-view interpretation adds indirection.

Simpler proof of value. Pattern-agent with pattern-lisp is ambitious — a portable runtime, effect system, service injection. Way over-engineered, though fun to do. Is there something smaller that would be more immediately useful? A really great todo list app?

What do you think? Shall I go further down this rabbit hole?
