This article describes supporting Brainfuck as an embedded language within Forth using meta-programming. I had a lot of fun figuring this out. Like most Forth code, the resulting program is quite compact. My explanation of it is not.
I prefer to assume minimal prior knowledge from the reader and so have erred on the side of over-explanation, but for the sake of brevity I do assume that you’re at least somewhat comfortable with the common programming language concepts and the act of writing programs. If you’re already familiar with Forth and Brainfuck, feel free to skip the Introduction and head straight to The Cool Part. If you’re a Forth enthusiast and have a solid grasp on the internals of the language, you might consider also skipping the Concepts section. If you’re …
This article describes supporting Brainfuck as an embedded language within Forth using meta-programming. I had a lot of fun figuring this out. Like most Forth code, the resulting program is quite compact. My explanation of it is not.
I prefer to assume minimal prior knowledge from the reader and so have erred on the side of over-explanation, but for the sake of brevity I do assume that you’re at least somewhat comfortable with the common programming language concepts and the act of writing programs. If you’re already familiar with Forth and Brainfuck, feel free to skip the Introduction and head straight to The Cool Part. If you’re a Forth enthusiast and have a solid grasp on the internals of the language, you might consider also skipping the Concepts section. If you’re only interested in the final result, the full program code is included in its own section at the end of this article.
Background
Forth
Forth is a family of stack-oriented, concatenative programming languages invented and iterated on by Charles “Chuck” Moore in the 1960s.
The two dominant philosophical goals of Forth are simplicity through “highly factored code” and “direct communication with the machine”. References to these properties are found throughout Moore’s writing and lectures about Forth and in Forth literature by other authors like Rather and Brodie.
Forths are often bootstrapped directly from assembly without a host operating system, acting as a sort of supercharged macro language over the native machine code. Though language standards for Forth exist, most Forth implementations are highly idiosyncratic and contingent on their target hardware. Language amenities like conditional branching, looping, and so on are only defined if necessary for solving the problem at hand. The names of operations are chosen based on whatever mnemonics the user finds most convenient.
The lack of extraneous features and tight coupling with the machine makes Forths well suited for austere, resource constrained environments with strict performance requirements. For example, Forths have been deployed to good effect in space probes and satellites.
At the same time, their bootstrapped nature leaves all machine and language internals open to the programmer for modification, making Forths remarkably amenable to meta-programming. As a currently-relapsing Lisp programmer, I find Forth’s meta-programming ability highly compelling.
Brainfuck
Brainfuck is perhaps the most well known “esoteric” programming language. It was designed in the early 1990s by Urban Müller as an experiment in making as small of a compiler as possible. It is Turing-complete (that is, exactly as powerful as “real” languages like C) despite only supporting eight instructions. This mix of power and simplicity has made writing a Branfuck interpreter a common project for programming hobbyists.
Brainfuck is usually implemented as a very simple virtual machine. The machine comprises four elements:
-
Main memory - The data manipulated by the program. By convention, Brainfuck machines typically have 30,000 bytes of memory. Memory is accessed one byte at a time. Note that this memory does not include the program code. Code is stored in a different section (see below).
-
Data pointer - An index into main memory. Brainfuck machines are able to modify their memory one byte at a time by first shifting the data pointer to a location and then executing certain instructions.
-
Code pointer - Brainfuck machines execute code following the conventional Von Neumann cycle. The code pointer indexes the operation in the program code that will be executed in the upcoming iteration of the cycle.
-
Code - The program being executed by the Brainfuck machine. Brainfuck supports 8 operations:
-
>and<for incrementing and decrementing the data pointer, respectively. -
+and-for incrementing and decrementing the byte in memory addressed by the data pointer, respectively. -
.for printing the byte of memory addressed by the data pointer. Note that this operation assumes the byte in memory is valid ASCII. -
,for consuming a single character of input from the keyboard and writing it to the byte addressed by the data pointer. This operation also assumes an ASCII representation for input. -
[and]for performing conditional branching and looping. The[and]operations must appear in matched pairs and act as delimiters for blocks of code. The[word allows the program to enter its block if the bye of memory addressed by the data pointer is not equal to zero. Otherwise, the code pointer is shifted forward to the matching]. Similarly, when]is encountered, the code pointer is shifted backward to the corresponding[if the byte of memory addressed by the data pointer is equal to zero.
The Cool Part
By now you might have an idea of how you’d write your own Brainfuck interpreter. A couple variables for the pointers, a big byte array for the memory, some kind of string buffer to hold the code, and some functions for reading and executing instructions. Maybe an extra helper function so you can load code from files. However, this article isn’t about writing a Brainfuck interpreter. I want my Brainfuck code to itself be valid Forth code, able to be embedded straight into a Forth program. This article is about writing a Brainfuck DSL.
This brings us to The Cool Part: Not only is embedding a Brainfuck in Forth totally possible with Forth’s meta-programming facilities, it’s actually easier than doing it the regular way. In fact, our DSL won’t even model the code or code pointer of the Brainfuck machine. It doesn’t need to, the code is already in the interpreter.
Before writing any code, we need to dig into how Forth works under the hood a bit.
Concepts
Forth is highly factored code. I don’t know anything else to say except that Forth is definitions. If you have a lot of small definitions you are writing Forth. In order to write a lot of small definitions you have to have a stack.
-Chuck Moore, “1x Forth”, 1999
At the heart of every Forth is a pair of stacks – one for data, the other for control flow – and a dictionary for definitions. Values are pushed onto the data stack and consumed by words in the dictionary. Words are Forth’s name for subroutines. Traditionally, the dictionary is represented using a linked list. New word definitions are appended onto the dictionary as the program runs. When the programmer types in a word and hits Enter, the dictionary is searched from newest definition to oldest until an entry with the word’s name is found. The code of that word is then executed.
I’d like to dwell on this a moment: words are subroutines but shouldn’t be confused with the functions or procedures of most other languages. Rather than accepting a set of parameters explicitly passed by the caller, Forth words operate on whatever values happen to be on the stack at the time. They are free to pop and push however many values from and to the stack as they please. Any number of words can be written and executed in sequence successfully so long as each word’s assumptions about the data on the stack are true when that word executes. Likewise, words can be split internally into multiple separate words (with some exceptions). This is what makes Forths concatenative; the ability to splice and chunk sequences of words is what allows the programmer to build their “lot of small definitions”.
Forth basics speedrun any% [PB] [2025]
I didn’t initially write this with a plan to include a section on the fundamentals of Forth. The meat of the article is about a pretty advanced feature of the language, after all, but I would feel bad if I scared readers off just for refusing to write a few sentences. If you’re totally unfamiliar with Forth, please check out a proper introduction to the language like GForth’s tutorial or A Beginner’s Guide to Forth.
Forth programs are their own interactive runtime environment, that is, the Forth interpreter is just another Forth program. Forth code has almost no syntax. Any chunk of text delimited by whitespace is a valid lexical token. Numeric literals are recognized by the environment as values, and any other token is treated as the name of a piece of code to be executed. Like most stack-oriented languages, Forth code is written in Reverse Polish Notation.
So when you load up a Forth interpreter and type something like…
1 2 + .
… you are instructing Forth to do the following:
- Push the value 1 onto the stack.
- Push the value 2 onto the stack.
- Execute the
+word, which pops two values off of the stack sums them, and then pushes the result back onto the stack. - Execute the
.word, which pops a value off of the stack and writes it to the screen.
Values on the stack can be manipulated. A value on the stack can be duplicated with dup or discarded with drop. A pair of values can be transposed with swap. Trios of values can be spun around with rot and -rot. I always forget which direction is which.
If you find yourself writing the same chunk of code a lot, you can factor that code into a new word using the : and ; words.
: print-three 1 2 + . ;
The : word tells Forth that the next token is the name of a new word (print-three in this example). All code between the name and the ; is the definition of the new word. Words can be redefined at any time. Code depending on a prior definition of a word is not affected, only later definitions can see the redefined word.
Forth supports conditional branching inside word definitions.
: ?even 2 mod 0= ;
: print-parity
?even if ." Number is even!" else ." Number is odd!" then ;
It also supports iteration within words. Common looping constructs include begin ... while/until ... repeat for indefinite conditional looping and do ... loop for counted loops.
Here’s some code that writes the Collatz sequence for a number:
: ?even 2 mod 0= ;
: halve 2 / ;
: triple-plus-one 3 * 1 + ;
: collatz
begin
dup . ( print the number)
dup 1 >
while
dup ?even if halve
else triple-plus-one
then
repeat
drop ( the number)
;
Interpretation versus Compilation
Forth programs are executed as they run. Each time the user types code into the interpreter and presses Enter, the line of code is read and executed bit-by-bit. Perhaps counterintuitively, this includes code whose result is to define new words. If it’s not clear why this might be counter-intuitive, read on.
Imagine we’re writing a program that’s very preoccupied with taking quantities on the stack and doubling them. Doubling a value is as simple as duplicating it (pushing a copy of it onto the stack) so it can be added to itself. After a while, we get tired of writing dup + and decide that we want to factor that code into a word called dub.
We can define our new word like so:
: dub dup + ;
Each word in Forth has a textual definition denoting the word’s interpretation semantics, that is, the machine code or other Forth words which will be executed when the word is invoked. Therefore, whenever we encounter a dub in our code, we know that Forth will first perform a dup and then follow it with a +.
Now let’s pretend we’re Forth and we’re reading the definition of dub. We read input one white-space delimited word at a time. The first word we encounter is :, which is how Forth spells “define”. The interpretation semantics of : are to prepare the dictionary for a new definition corresponding to the next word in the input stream, dub. We initialize a new dictionary entry named dub. Next we encounter dup, so we should immediately duplicate whatever’s currently on the stack, right? After all, that’s what dup’s interpretation semantics are.
Obviously not. Words only having interpretation semantics would make defining new words impossible. Like other languages, Forth does something different when it encounters a word that’s being compiled into the definition of another. Unlike other languages, compilation is transient; Forth can toggle between its interpretation mode and compilation mode at any time. When in compilation mode, each word encountered executes according to its compilation semantics. By default the compilation semantics of a word are to find the word’s address in the dictionary and insert code for invoking it into the current definition.
So in our definition of dub we know that dup and + are simply being compiled and won’t run until dub is invoked. But the next word is ;, which is supposed to indicate the end of a word definition. When Forth encounters ; it’s supposed to do all the bookkeeping required to finalize the definition of a new word, but if ;’s compilation semantics are the same as other words, then we’d expect Forth to attempt to compile it into the current definition. How does ; avoid that and just tell Forth we’re done defining?
Clearly there’s something more interesting going on in the definition of ;. Forth is an interactive language that doesn’t believe in keeping secrets. Let’s hop into an interpreter and ask it what ; means.
see ; \ <- Only type this line
: ;
;-hook [COMPILE] EXIT flush-code ;-hook2 ?colon-sys reveal [COMPILE] [ ; immediate compile-only ok
That’s weird, ; appears in its own definition. My chosen Forth dialect is GForth, the GNU Forth. GForth is “bootstrapped”, meaning it defines itself in itself. Let’s put that detail aside.
I’m going to gloss over some details here to avoid ending up in the weeds. The definition of ; does a few interesting things. First, it compiles an EXIT (Forth’s name for returning by popping the return stack) into whatever the current definition is. Then it does some bookkeeping work in the dictionary. Finally it compiles the word [ into the current definition. The [ word toggles Forth from compilation mode back into interpretation mode. (If you look at the definition for :, you’ll find a corresponding ] that toggles compilation mode).
Note that I mentioned the “current definition” a couple times just now. You may be thinking, “Isn’t the ‘current definition’ in question the one for ;?”. It is not. Take a look at the pair of words right after the end of ;’s definition: immediate compile-only. Working backwards, compile-only is an indication that the most recently defined word, ; in this case, should only ever be used while in compilation mode. In other words, ; only has compilation semantics. There’s no sensible usage of ; other than terminating a new word’s definition, which necessitates being in compilation mode.
But what about immediate?
The most important part of Forth
The immediate word indicates that the most recently defined word is immediate. Immediate words execute the moment they are encountered in the input stream, even while the system is in compilation mode.
Therefore, the “current definition” I mentioned before isn’t the definition of ;, but is instead the definition of whatever new word is being compiled when Forth executes ;. It has the effect of writing instructions directly into the new word’s definition.
In his article “The Forth Programming Language - Why YOU should learn it”, Doug Hoyte (author of the delightful Let Over Lambda) writes:
Immediacy is a very powerful feature in forth. Once you understand how immediate words are used to build up everything in the language, you will have understood the most important part of forth.
Immediacy is the most important part of Forth because it allows the programmer to write code that executes at compile-time. This seems like an advanced feature reserved for Lisp hackers (and indeed Lisp’s reader macros and Forth’s immediate words are two solutions to basically the same problem), but compile-time code is actually ubiquitous in Forth. The definition terminating word ; is immediate. Conditional branching works because if, else, and then are immediate. The same is true for repetition with begin/repeat, conditional repetition with while and until, and counted repetition with do/loop, etc. Even comments use immediate words.
Thank you, five
There’s still a bit more to cover but this is a good place to pause, take a break, and review.
Forth lets the user break code into named chunks called words by compiling their definitions into a dynamic data structure called the dictionary. To support this, the Forth runtime can be toggled between interpretation and compilation modes on demand. Words usually have two sets of semantics, one for what happens when the word is interpreted and another for when the word is compiled. Some special words only have the latter. Words that are “immediate” execute the moment they are read, regardless of whether the Forth is currently interpreting or compiling.
Alright, back to it.
Postponed words
Assuming you’ve understood everything so far, there’s just one final piece to make our Brainfuck DSL possible: postponed words. The postpone word, which is compile-only, parses the next word in the input stream and appends the compilation semantics of that word into the current definition, even if that word is marked immediate. In other words, postpone compiles the compilation semantics of one word into the interpretation semantics of another.
Immediate words and postponed words work in opposite directions. Immediate words allow the programmer to execute code at compile time. Postponed words let the programmer defer what would normally be done at compile-time until run-time.
I lack Doug Hoyte’s bona fides but if immediate words are the most important part of Forth, then I think postponed words are in the running for second. Let’s start building the program to see why.
Building the program
With that background out of the way we’re ready to start writing some code. One of the best parts of Forth is that it’s interactive. I’ve chosen GForth as my dialect for this article. If you have it on your machine, feel free to paste code from this section straight into the interpreter. As you read, recall that the \ word begins an until-end-of-line comment and ( comments everything on a line until the comment block is terminated with ).
Setting up
To start things off, let’s get our Brainfuck machine defined.
variable dp
create memory 30000 chars allot
This just gives us an integer for our data pointer and a block of 30,000 bytes of memory. Note the lack of a code pointer variable or an allocation for the program code.
Next, let’s write some helper words.
: reset
0 dp !
memory 30000 chars erase ;
: show
cr ." dp: " dp @ .
cr ." memory[dp]: " m[dp]@ .
cr ." memory: " 10 0 ?do i memory + c@ . loop ;
: inc! 1 swap +! ;
: dec! -1 swap +! ;
: m[dp] dp @ memory + ;
: m[dp]@ dp @ memory + c@ ;
: m[dp]! dp @ memory + c! ;
These words let us reset the state of the Brainfuck machine, dump debug information about it to the screen, and do some basic manipulation of the data pointer and the byte it indexes.
Defining the DSL
Each of the words defined in this section follow an interesting pattern:
: <word> ]] <instructions...> [[ ; immediate
Recall that postpone defers the following word’s compilation semantics until interpretation. To save users from writing postpone repeatedly, GForth provides the words ]] and [[. The first word puts Forth into a mode where every word encountered is automatically postpone’d until the corresponding [[ is found. The following two definitions are equivalent:
: a postpone foo postpone bar postpone baz ;
: b ]] foo bar baz [[ ;
The first six Brainfuck operations are straightforward. A little bit of familiarity with Forth should be enough for you to understand them:
: > ]] dp inc! [[ ; immediate
: < ]] dp dec! [[ ; immediate
: + ]] m[dp] inc! [[ ; immediate
: - ]] m[dp] dec! [[ ; immediate
: . ]] m[dp]@ emit [[ ; immediate
: , ]] key m[dp]! [[ ; immediate
Branches and loops
The final two commands are where the real magic happens. Let’s begin with their definitions.
: [ ]] m[dp]@ 0<> if begin [[ ; immediate
: ] ]] m[dp]@ 0= until then [[ ; immediate
Note that the if inside of [ lacks a terminating then. The begin also lacks a terminating repeat/until. The ] word has the same issue. Without the power of postpone, these words would fail to compile. But why?
Recall that conditional branching and looping in Forth is achieved with immediate words. The reason for this is that Forth does not perform a separate lexing/parsing step when processing code inputs. Words are consumed one at a time and trigger some kind of action by the Forth runtime as they’re encountered. Branches and loops are achieved in Forth by defining words like if and begin to leave marker values inside the currently compiled definition. When a corresponding terminating word like then or repeat is encountered, the runtime scans back to find the marker value and replaces it with a jump instruction.
Scanning through a definition one instruction at a time? Jumping by some number of instructions? This sounds like what [ and ] do in Brainfuck! Indeed, the reason why our Brainfuck machine doesn’t need an allocation for the program code or a code pointer variable to index into it is because Forth already does the exactly equivalent thing when compiling branches and loops. All we have to do is ensure the branching and looping semantics match [ and ].
Trying it out
I’ve intentionally avoided explaining exactly what each of the DSL words above really do because I think a visual explanation is much more illustrative (and exciting). But for that, we’ll need Brainfuck code to test out. Thankfully we don’t need a large program to showcase each of Brainfuck’s operations. This one is adapted from an example in Brainfuck’s Wikipedia article.
: seven
+ + > + + + + +
[ < + > - ]
+ + + + + + + + [ < + + + + + + > - ] < .
;
The article describes the program in detail, but to summarize: the program places the value 2 in one memory cell, the value 5 in another, sums the two cells together to produce 7, and then encodes 7 as an ASCII value and prints it to the screen.
Go on, type seven into your interpreter and see it work:
seven 7 ok
Now for the big reveal. Let’s see seven’s real definition.
see seven
: seven
m[dp] inc! m[dp] inc! dp inc! m[dp] inc! m[dp] inc! m[dp] inc! m[dp] inc! m[dp] inc! m[dp]@ 0<>
IF BEGIN dp dec! m[dp] inc! dp inc! m[dp] dec! m[dp]@ 0=
UNTIL
THEN
m[dp] inc! m[dp] inc! m[dp] inc! m[dp] inc! m[dp] inc! m[dp] inc! m[dp] inc! m[dp] inc! m[dp]@ 0<>
IF BEGIN dp dec! m[dp] inc! m[dp] inc! m[dp] inc! m[dp] inc! m[dp] inc! m[dp] inc! dp inc! m[dp] dec! m[dp]@ 0=
UNTIL
THEN dp
dec! m[dp]@ emit ; ok
Here we see the magic of postponed words. Defining each Brainfuck operation as an immediate word full of postponed words means that the actions applied to the Brainfuck machine are compiled straight into a word’s definition in-line. Immediate postponed words act as code-generating macros that emit plain old Forth code.
This means we can write parts of a program in Brainfuck and parts of it in Forth, even within the same word definition! Is seven feeling a bit too unreadable? Let’s just make some helper words and redefine it:
: add-left-to-right ]]
[ < + > - ]
[[ ; immediate
: digit->ascii ]]
+ + + + + + + + [ < + + + + + + > - ] <
[[ ; immediate
: seven
+ + > + + + + +
add-left-to-right
digit->ascii
.
;
Conclusion
What I’ve tried to showcase here is just a taste of the kinds of meta-programming Forth lets you do. Consider, for instance, that so-called “parsing words” can take direct control of the Forth input stream, or that code-emitting words can operate conditionally based on a larger context. My DSL is simple and useless in roughly equal measure. I hope that you’re inspired by seeing how it was able to be written using such an austere language with so few moving parts.
What I can promise is the perspective from a simple, passionate programmer of a couple decades, and 3 years of those belong to Forth. Never has my heart been completely enveloped by an idea. The allure and compulsion has practically caused me to fall in love with a programming language.
Lee, “An Attempt at a Compelling Articulation of Forth’s Practical Strengths and Eternal Usefulness”, 2025
I have a lot of thoughts about programming languages in general, and Forth in particular. I delight in learning new langauges and paradigms. Feeling my brain bend and stretch into new shapes as it gradually learns to think about problems in new ways is the thrill that keeps me hooked on writing programs. Few languages have challenged and rewarded me to the extent that Forth has. Indeed, one of the most challenging aspects of Forth is articulating exactly what makes the language so special. I hope to expand on those themes in future articles.
The full program
variable dp ( data pointer)
create memory 30000 chars allot
\ Helper words for manipulating memory at the data pointer
: inc! 1 swap +! ;
: dec! -1 swap +! ;
: m[dp] dp @ memory + ;
: m[dp]@ dp @ memory + c@ ;
: m[dp]! dp @ memory + c! ;
\ Brainfuck commands
: > ]] dp inc! [[ ; immediate
: < ]] dp dec! [[ ; immediate
: + ]] m[dp] inc! [[ ; immediate
: - ]] m[dp] dec! [[ ; immediate
: . ]] m[dp]@ emit [[ ; immediate
: , ]] key m[dp]! [[ ; immediate
: [ ]] m[dp]@ 0<> if begin [[ ; immediate
: ] ]] m[dp]@ 0= until then [[ ; immediate
\ ========== Demos ==========
\ Plain Brainfuck definition for adding two numbers
: seven
+ + > + + + + +
[ < + > - ]
+ + + + + + + + [ < + + + + + + > - ] < .
;
\ Helper words over Brainfuck code
: add-left-to-right ]]
[ < + > - ]
[[ ; immediate
: digit->ascii ]]
+ + + + + + + +
[ < + + + + + + > - ] <
[[ ; immediate
: seven
+ + > + + + + +
add-left-to-right
digit->ascii
.
;
\ Another Brainfuck example, just for you :)
: hello-world
reset
+ + + + + + + + [ > + + + + [ > + + > + + + > + + + > + < < < < - ] > + > +
> - > > + [ < ] < - ] > > . > - - - . + + + + + + + . . + + + . > > . < - .
< . + + + . - - - - - - . - - - - - - - - . > > + . > + + .
;