An Attempt at a Compelling Articulation
of Forth's Practical Strengths
and Eternal Usefulness
PRELUDE
Problem Statement
It's done! We have that crucial ingredient giving life to that little
Forth, allowing it to walk by itself! Unfortunately, you'll soon see
that after a few steps, this little creature stumbles and falls.
What is it lacking, a little balance maybe?
-- Virgil Dupras
Explaining why Forth is still relevant has become a bittersweet, ironic comedy
routine that enthusiasts and operators find themselves performing more often.
We usually begin with lavish statements of simplicity, and ease the
reader into a fog of what Forth is. Finally we slam back to Earth with examples.
Examples so contrived that they have the complete opposite effect, a tint and a
dulling of any allure that may have existed as a spark in the prospective
Forther's mind. Each time a speech of grandeur falling flat.
Virgil's quote is a perfect specimen of this phenomenon. The miracle of popping
into existence - walking nonetheless - is a direct testament to Forth's
evolutionary nature. It's such a strong positive to begin with, yet immediately
follows up with "it's useless".
Virgil needs no defense: he's right. It is useless. The point is, that statement
is a negative wrinkle for what is a positive feature: a foundational, minimal
core to build on. It's these faltering moments that contribute to Forth's
continual uncertainty in today's context.
On Saturday I saw the final act, the one that made me want to write this piece.
Here Use This Honda Civic Made of Lambos
A long-running project called eforth was linked on lobste.rs, a link aggregation
site, attempting to sell Forth as still being relevant:
https://github.com/chochain/eforth.
The author details their history, how they worked with a prolific Forth user and
contributor, Dr. Chen Hanson Ting (the creator of eforth, who passed away in 2022),
and why their particular Forth project is a successor to his work.
But it's the first paragraph that takes the cake.
The first paragraph unironically reads:
With all the advantages, it is unfortunate that Forth lost out to C
language over the years and have been reduced to a niche. Per ChatGPT:
due to C's broader appeal, standardization, and support ecosystem
likely contributed to its greater adoption and use in mainstream
computing.
The project's README then proceeds into lengthy detail about how to implement
Forth in C.
Yep.
It is the fastest open-and-shut case I've seen to date of someone explaining why
you'd use Forth and then miraculously shooting both their feet at once.
It seriously left me empty-minded after reading.
The only hint of relevancy revolves around compilation and being able to build
the system for many architectures, thanks to C compilers and ease of
implementation.
Why Do My Words Have Weight And Why They Don't
I'm not a Forth guru; a great sage like those of LISP, C, or Haskell. No
universal promise can be made to you that what I say is unyielding truth. You
will have to verify that for yourself. What I can promise is the perspective
from a simple, passionate programmer of a couple decades, and 3 years of those
belong to Forth. Never before has my heart been so completely enveloped by an
idea. The allure and compulsion have practically caused me to fall in love with a
programming language. Maybe this is how Pythagoreans felt about their triangles.
With this great fire, I will tell you why Forth is forever.
WHAT IS FORTH
It's very possible many readers will be hearing about Forth for the first time
when they encounter this essay. There are two recommended readings for all
Forth first timers by Leo Brodie: Starting Forth, and Thinking Forth. The former
focuses more on concrete code, and the latter is a philosophical read. They go
hand-in-hand and I can't really recommend one over the other: it really depends
on the kind of person you are.
I will do my best to give an extremely short description of what Forth is in
this essay to remain self-contained.
Explain Like I'm 5: Forth Edition
Forth is a pancake stacking machine with superpowers.
top     pancake5
        pancake4
        pancake3
        pancake2
bottom  pancake1
You eat these pancakes like you normally would: from top to bottom.
What do you normally eat pancakes with? A fork and a knife.
In Forth you can make your own super special fork and knife, maybe they shoot
lasers, but you can also make robots that help you.
For example, the Gigafork-o-tron 9000 lets you eat 3 pancakes at once, by
smushing them into 1 big pancake for you to eat:
smushed_pancake
pancake2
pancake1
With 100s of robots you can do some really cool stuff when you start feeding
them things other than pancakes, like build video games.
Why use Forth and not something else? Because all you need are some elastic
bands, pencils and paper to build this stacking machine with superpowers! These
are all items you've got lying around the house! Get going!
Explain Like I'm Familiar With Programming
Ok let's try again: so Forth is a stack-based programming language.
On the surface that statement means nothing except that the stack data
structure is somehow involved.
In Forth it's core to data passing (arguments) and your memory pool (allocated
memory).
To put arguments on the stack, list them out before the functions (words):
1 2 3 4 + + +
-> 1 2 7 + +
-> 1 9 +
-> 10
We're left with 10 by the end of the reduction.
You're intended to create 100s of words (functions), which are inserted into
the dictionary (a global stack), which make up a vocabulary. Yes, each Forth
program is a domain specific language (DSL).
\ A comment describing what's going on.
\ ( stuff ) is an inline comment but it is used commonly to denote expected
\ function arguments coming in on the stack. They do nothing.
: my-func ( int int -- int ) * 10 + ;
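As a quick check of the definition above, here is how a call plays out at the
prompt. This is a minimal sketch assuming a standard Forth such as Gforth; the
inputs 3 and 4 are arbitrary values picked for illustration.

```forth
: my-func ( int int -- int ) * 10 + ;

3 4 my-func .   \ 3 * 4 = 12, then 12 + 10, prints 22
```

The arguments sit on the stack before the word runs, and the single result is
left behind for whatever word comes next.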
Forth can manipulate its own interpreter at "compile time". You can leverage
this when needing to parse formats, or create new syntax.
s" something-to-parse-until" parse
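To make "new syntax" a little more concrete, here is a small sketch using only
standard words (create, does>, parse). The names unit:, km, minutes and note|
are hypothetical and exist only for this illustration.

```forth
\ UNIT: is a hypothetical defining word: it reads the next name from the
\ input and creates a new word that multiplies by the stored factor.
: unit: ( n "name" -- )  create ,  does> ( x -- x*n ) @ * ;

1000 unit: km
60   unit: minutes

3 km .        \ prints 3000
2 minutes .   \ prints 120

\ NOTE| is also hypothetical: it takes over the input stream and reads
\ raw text up to the next | character, then prints it.
: note| ( "text|" -- )  [char] | parse type ;

note| free-form text, read straight from the source|
```

Both words change how the text that follows them is read, which is the whole
trick: the "parser" is just more Forth.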
Doing common work like splitting strings or concatenating lists is all provided
as words too.
That's Forth. It's so simple the idea is you can bootstrap it from assembly
relatively quickly.
SUPERNATURAL BOOTSTRAPPING
Programming Language By Synthesis
Programmers will often use other programming languages to implement their new
programming language. The choice of parent language is usually determined by
what problems the child language is intended to solve. Choosing a suitable
parent reduces time to implementation, keeping the programmer motivated and
actually achieving a finished program.
A side effect of using a parent language is the child will inherit its traits.
The performance ceiling of the parent becomes the ceiling of the child, unless
it's able to evolve a code generator. All available parent language
packages and libraries are available to the child for use toward its own
development and extensions. Any memory management and access faults are
transferred.
It's understandable why using an existing language to bootstrap your own is very
reasonable.
A familiar case of the preceding is Python. Python has been, and continues to
be, written in the C programming language. Its C core is hardly minimal,
clocking in at 35% of the whole codebase as of writing.
The Glasgow Haskell Compiler, the most prolific functional programming language
compiler, does the same, but with an exceptionally smaller, desirable core
size: 8% is C. Unfortunately it uses Python as well, pulling in 952 C source
files if we are to require it to build. So is it really small then, or simply
abstracted away into other software packages?
Clojure is a case of being a far descendant of C. It is bootstrapped from Java,
which is itself written in C++, which in turn evolved from C. Clojure benefits
from Java's garbage collection and being able to interoperate with other Java
source files, while offering unparalleled expressiveness. CPUs that implement
JVM bytecode compatible hardware (such as the "Jazelle ARM extension") are an
interesting case of removing ancestral ties, simplifying lineage. The same
can be said of CPU architectures that favor C-like code.
C is Not Free
What's the problem then? C is the lingua-franca of the programming world. It
exists for nearly every practical architecture imaginable. It's free!
At the risk of getting absolutely roasted by two (or more) very strongly
opinionated communities, I will say that pretty much every (not every!)
mainstream development in programming languages, Rust and Zig included, is
guilty of this fallacy. They are built around the illusion of their parent
language having low to zero cost.
LLVM is the typical basis of these languages: a heavy technology largely funded
by mega corporations. Watching Zig and Rust enthusiasts argue over whose
language is less complex is off-putting. The reality is the complexity
comparison is surface level. The shapes of two fruits are compared but the
genetic structure of the fruit is 90% the same. Their arguments revolve around
particular expressions of logic in various situations but ignore everything
else.
Zig and Rust experts will try to reason about this cost, that "they only make
up 1% of the code base". Uh, no, without that 1% there is nothing; the code will
not compile, and cannot be invoked. That 1% is made up of millions of dollars
and lines of code. It is not free, far from it for these LLVM-based languages!
Let me make an evidentiary statement:
A C compiler takes a non-trivial amount of energy to create.
Following are projects that I believe cover the range from full-featured to
smallest C compilers. I ran `sloccount` on each of them. The evidence leans
toward a good estimate, as the timelines match up with the commit histories and
the number of contributors. *1 The results are organized from most to least
expensive.
clang *2 *3
Total Physical Source Lines of Code (SLOC) = 11,200,038
Development Effort Estimate, Person-Years = 3,570.35
Total Estimated Cost to Develop = $ 482,305,367
gcc *2 *3
Total Physical Source Lines of Code (SLOC) = 8,208,908
Development Effort Estimate, Person-Years = 2,576.50
Total Estimated Cost to Develop = $ 348,049,714
tcc
Total Physical Source Lines of Code (SLOC) = 108,038
Development Effort Estimate, Person-Years = 27.31
Total Estimated Cost to Develop = $ 3,688,901
chibicc
Total Physical Source Lines of Code (SLOC) = 9,487
Development Effort Estimate, Person-Years = 2.12
Total Estimated Cost to Develop = $ 286,832
*1 Of course you can call bullshit and verify yourself. The `sloccount`
website has further testimonials to its usefulness in estimates. Even
if the margin of error is a whole magnitude, the cost remains high.
*2 Fun fact these took like 8+ minutes to crunch, the others were instant.
*3 It must be made clear for transparency that clang and gcc SLOC counts
include GIMPLE, LLVM, and other tooling. Generally speaking both
compilers require them to work, but they include additional code for
non-C compilers, such as ALGOL, Fortran, and Go, as well.
Now add on top the person-hours required to build the child language. Bonkers.
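The person-year and dollar figures above can be reproduced by hand. The sketch
below assumes `sloccount` is running with its default Basic COCOMO "organic"
model: effort in person-months is 2.4 * KSLOC^1.05, with an annual salary of
$56,286 and an overhead factor of 2.4. Those constants are my reading of the
tool's defaults, not something stated in the output above, and the word names
are hypothetical. It needs a Forth with the floating-point word set (f** is in
the floating extension words; Gforth provides it).

```forth
\ Basic COCOMO, organic mode; constants assumed to match sloccount's defaults.
: person-months ( F: ksloc -- pm )   1.05e f**  2.4e f* ;
: person-years  ( F: ksloc -- py )   person-months 12e f/ ;
: cost-usd      ( F: ksloc -- usd )  person-years 56286e f*  2.4e f* ;

9.487e person-years f.   \ chibicc at 9,487 SLOC: about 2.12
9.487e cost-usd f.       \ roughly 286,800
```

Feeding in 108.038 for tcc lands near the 27.31 person-years listed above, which
suggests the assumed constants are the right ones.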
Programming Language By Evolution
New programming languages are also created through more natural, evolutionary
means. Common examples are C++ and TypeScript. C++ came to be from C programmers
needing to handle the ever-growing requirements and complexity of systems
software. Web developers turned to type systems to deal with their own
ecosystem and social complexity, structuring JavaScript code written by people
(and now robots) from all walks of life. What could such an evolution look like
when we go back to the roots of programming: assembly?
Assembly is the lowest level of abstraction we can get from the computer. The
IBM 1401 from the 1960s forced humans to write code in literal 1s and 0s.
Quickly we learned mnemonics are easier on the brain than monotonous numerals.
Textual code is a one-to-one abstraction that arose from naturally using the
technology of the time. As programs grew in complexity, humans, being
incredibly good pattern matchers, realized a lot of code could be
deduplicated into reusable chunks. At first this took the form of subroutines,
but it was not enough. The macro was created to resolve this complexity.
Once again a natural solution to a natural problem. Now assembly programmers
could create systems beyond their normal comprehension - literally!
This is where Charles Moore (aka Chuck) enters the scene. The idea of Forth is
not particularly genius or crazy, but his resolve toward simplicity is something
to really celebrate. The story goes Chuck worked at various astronomical
observatories throughout the years, and needed to program the telescopes. He
found himself needing to interact with all the different hardware frequently.
He brought a set of macros to each new telescope, and this is the system that
became Forth. An evolution of assembly macros, Forth is a true example of
holistic design.
While it took years to refine the idea of what Forth was, the implementation of
a Forth-like takes place all the time, to this day. The first thing language
implementation learners do: implement a stack machine. When a technology
accidentally comes into existence time and again, the cost is even better than
free. It just is. For me this is equivalent to discovering how to create your
own molecules. A true lingua-franca of the Turing machine.
Masterclass Example of a Forth Bootstrap
SmithForth is a golden textbook example of bootstrapping a Forth from nothing.
SmithForth literally constructs executable headers using binary, and each line
of binary maps to an instruction, with each commented in great detail.
Here is an excerpt:
################ Interpreter subroutines #######################################
99 05 43 4F 4D 50 4C    #### COMPL Forth's COMPILE, B9 ( ebx=xt -- )
B0 FF AA                # compile >>>>>>>>>>>>>>>>>     call r/m64      FF /2
B0 14 AA                #     al = _                    mov r8, imm8    B0+rb ib
B0 25 AA                #     [rdi++] = al              stos m8         AA
93                      # eax = ebx                     xchg eax, r32   90+rd
AB                      # [rdi(++4)] = eax              stos m32        AB
C3                      # return                        ret             C3
Absolutely psycho. And it goes on for ~500 lines. I am envious of the mind that
yields the mental model to parse this fine.
A C programmer in comparison would have to write a minimal C compiler this way,
to then write some C to bootstrap other C features. All while on that same
computer. Sounds painful. The person who wrote the smallest useful C compiler,
chibicc, shown in the previous section, also wrote 8cc. Both took over a year
to write.
Meanwhile the Forth programmer is off to solving their original problem in a
week. A full working environment. Their power continues to build as the
dictionary fills out with useful words. Maybe they go a little overboard and
develop a full operating system. That's the principle behind Virgil's incredible
Dusk OS. It's a testament to the practicality of doing that in Forth.
EXTENSION THROUGH SELF-MODIFICATION
Programming languages rarely expose methods to extend the language syntax
itself, or modify existing syntax. The latter is more common in interpreted
languages like Python or JavaScript, where an object property can be overridden
with a new one. Forget about this in systems languages. C++ has operator
overloading but it's not the same: you cannot change the behavior of how
addition works on two built-in 64-bit integers (such as treating them both as
fixed-point numbers).
At most these languages expose complex macro or template systems, and it's a
discouraged practice depending on where, or how, you work. Adding more syntax to
already complex languages literally creates more complexity.
The opposite approach is taken in Forth: syntax is non-existent, so adding it
to simplify reasoning about the solution to a problem is encouraged. Each
program then is a narrow, well-focused expression of a solution.
For example, there's a complex tax situation, and you want to record and
calculate everything with some sort of special text record file. The playbook
for this problem is to write a parser. Design decisions such as number precision
and operations are made. Finally the actual computations take place. Then
there's the output format.
In Forth, we leverage the language itself. Your program code doubles as the
record file. There is no separation because the Forth programmer can leverage
the Forth compiler, morphing it to their needs at run-time. Changing addition
to work on "bignum"s (numbers that have arbitrarily large precision) is a good
use case of overriding the addition operation.
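A minimal sketch of what overriding addition can look like. A real bignum
library is out of scope here, so double-cell numbers (the standard d+ word)
stand in for "wider than normal" arithmetic; that substitution and the name
forget-bignum are assumptions made for illustration. The marker word is from
the standard core extension set and is used to undo the override afterwards.

```forth
marker forget-bignum        \ remember the dictionary as it is right now

\ Shadow + for everything defined or typed after this line.
\ Words compiled earlier keep the original single-cell + .
: + ( d1 d2 -- d3 )  d+ ;

1. 2. + d.                  \ double-cell literals end in a dot, prints 3

forget-bignum               \ roll the dictionary back, the old + returns
1 2 + .                     \ prints 3
```

Nothing about + was special: it is a dictionary entry like any other, so the
newest definition simply wins until it is forgotten.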
As you may have guessed, this is a real situation I found myself in. Encoding
tax brackets and having a system that calculated tax became trivial once I
defined the "tax bracket" word. Cute how it's a literal bracket:
...
\ Calculate a bracket.
\ ac:accumulated capital, rb:remaining balance
\ r:tax rate, l: upper limit
: ]> ( ac rb r l -- ac rb )
  ( ac rb r l )   frot
  ( ac r l rb )   f2dup f< if
                    fremaining frot frot f*
  ( ac rb tax )     frot
  ( rb tax ac )     f+ fswap
  ( ac rb )       else
                    fswap fdrop f* f+ 0e \ rb is zero
                  then
;
: tax-cra-income ( income-to-tax -- collected )
                            0e fswap
  ( collected remaining )   0.150e  55867e ]>
                            0.205e 111733e ]>
                            0.260e 173205e ]>
                            0.290e 246752e ]>
                            0.330e fover   ]>
                            fdrop
;
...
accounts LeeBusiness
accounts ACompany BCompany
-5674.84e tax-sales ACompany
remaining tax-income LeeBusiness
-2145.66e BCompany
remaining tax-income LeeBusiness
...
Then we arrive at the point of divergence: most languages stop at a parser for a
new file format. Little Forth goes beyond: you can also modify and extend the
"outer interpreter".
This is the REPL most Forths offer to developers so they can poke and prod at
hardware. In this case the hardware is literally the computer you're working at.
It is common for Forth developers to create their own rudimentary text editors
as an exercise. Quickly it becomes obvious how such a holistic system gives a
natural extension power very few other languages have.
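To show how little stands between a program and its own REPL, here is a
hedged sketch of a tiny outer interpreter built from standard words (accept,
evaluate, catch). The names eval-line and mini-repl are hypothetical, the
prompt text is arbitrary, and a real outer interpreter does more than this.

```forth
\ Evaluate one line of text; print " ?" instead of aborting on an error.
: eval-line ( i*x c-addr u -- j*x )
  ['] evaluate catch  if  2drop ."  ?"  then ;

\ Read lines from the keyboard until an empty line is entered.
: mini-repl ( -- )
  begin
    cr ." calc> "  pad 80 accept   \ leaves the number of characters read
    dup
  while
    pad swap eval-line
  repeat
  drop ;

s" 2 3 + ." eval-line   \ prints 5
```

Because the loop is an ordinary word, changing the prompt, the error handling,
or even the source of the text is a matter of editing a few lines.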
Some people have taken this strength to the extreme: Virgil Dupras's Dusk OS and
mschwartz's mykesforth both build entire operating systems off these principles.
Dusk OS aims to bootstrap other C-based operating systems from a Forth core
that implements a C compiler.
mykesforth is a complete pure Forth OS.
EXECUTABLE COMPRESSION
Texas Instruments' introduction of the MSPM0C1104 this year (2025) has ushered
us into the world of nano-controllers. A computer that can sit on a pin-head.
With only 16 KiB of storage and 1 KiB of memory. The 1980s were more generous:
most home computers had 64 KiB of memory and much more storage. It's no surprise
Forth found itself popular in that era. The ability to create essentially
compressed executable code was and still is unmatched and greatly desired. This
is achieved through a particular implementation called Indirect Threaded Code
(ITC). Combined with heavy code refactoring into small functions, programs
become executable Lempel–Ziv–Welch compressions, which astonishingly also have a
concept of a "dictionary" and reusable parts. The parallels are remarkable.
There is literally no other language I'm aware of that does this kind of
optimization to this degree.
Briefly I will explain what ITC is.
Here is a pseudo example of some program code (& means address, : is a label):
my_func:
    &other_func1
    &other_func2
    &other_func3
other_func1:
    &deeper1
other_func2:
    &deeper2
other_func3:
    &deeper3
deeper1:
    push a
    ld a, 7
    ...
deeper2:
    ...
Each function, kept extremely small by the correct practice of refactoring,
resides somewhere in memory like usual. Compilers typically
will then use a CPU's `call` instruction to jump to the address and set the
return register. Forth instead opts to place these addresses to functions one
after the other. A function call may do tens to hundreds of lookups - so yes,
the trade-off is slower code... except you can decide to switch out that model
of function calling! It's a win-win. There is an even more compact variant
called Huffman Threading and it's exactly what it describes: using Huffman codes
to create indexes to functions. On the opposite end a developer can invoke
particular Forth compilers, such as Gforth, to use familiar code optimizations
found in LLVM and other code generators, to generate fast code.
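The "list of addresses" idea can be imitated in Forth itself. This is a
high-level sketch only: it is not how any particular Forth implements its inner
interpreter, and run-thread, double, inc and my-thread are hypothetical names.
It stores three execution tokens in a row and walks them one by one.

```forth
\ Walk n cells starting at addr and execute the word each cell points to.
: run-thread ( i*x addr n -- j*x )
  0 ?do
    dup >r  @ execute      \ fetch the next address and run that word
    r> cell+               \ advance to the following cell
  loop
  drop ;

: double ( n -- 2n )   2* ;
: inc    ( n -- n+1 )  1+ ;

\ The "body" of the routine is nothing but three addresses in a row.
create my-thread  ' double ,  ' inc ,  ' double ,

5 my-thread 3 run-thread .   \ ((5 * 2) + 1) * 2, prints 22
```

Each step costs one cell of storage no matter how large the called word is,
which is where the compression comes from.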
The best part of the above is you get the advantages by doing good practice.
Refactoring code into small composable functions is what everyone strives for in
the first place!
ENLIGHTENMENT GAINED FROM EXPLICIT DATA-FLOW
Time for a little detour into the higher level topic of data-flow.
Being a stack-based language, Forth passes arguments to functions using the
stack. Yes, pretty much every other language in existence also does this, but
usually only after registers are exhausted, and it's implicit. Never does a
programmer explicitly need to say "this goes on the stack", it just sort of
happens, and is placed in whatever position is deemed most optimal by the
compiler.
In Forth once again the opposite is true: argument passing on the stack is
explicit! The most compelling reason for this is how natural it is to pass an
arbitrary number of arguments to a function this way, and how portable that is.
As long as the target machine has memory that can be linearly accessed, it can
express a stack machine. The original Turing machine is essentially a stack
machine. There is something inherently natural about these machines.
After you use Forth for a little while you begin to realize there is no logical
sense in having it reversed: a computer cannot be told to do something,
expecting input to act on, without the input!
It makes way more sense to have that data precede the operation: 4 7 +.
Personally seeing and taking this fact in for the first time kind of unlocked
something in my head. I'm led to try algebra this way. I'm led to doing
everyday quick math this way. No more do order of operations or parentheses
matter, the order is baked into the expression. And there's a trick to reading
expressions written this way: you can read them like operations consuming their
operands, sweeping your eyes from right to left.
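A few lines at the prompt show the order being baked in. This is a small
sketch with arbitrary numbers; the infix form is given in each comment.

```forth
4 7 + .            \ 4 + 7, prints 11
4 7 + 2 * .        \ (4 + 7) * 2, prints 22
4 7 2 * + .        \ 4 + (7 * 2), prints 18
```

The second and third lines use the same numbers and the same operations; only
the position of each operation decides what it consumes.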
With explicit data-flow, value copies and reuses become extremely obvious. They
become painful to see in code, similar to when encountering `.unwrap`s in Rust.
There is a new drive to remove any copies, because it causes the stack to thrash
more. You literally feel it in your brain when trying to reason about the stack
manipulations. These new experiences teach that we've had the blinders on this
whole time on data-flow.
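As a small illustration of copies being visible, here is a sketch with
hypothetical word names. Every dup is a copy you had to type, and every swap
is a shuffle you had to think about.

```forth
: square         ( n -- n*n )      dup * ;
: sum-of-squares ( a b -- a2+b2 )  square swap square + ;

\ The same computation without factoring: the copying and shuffling show.
: sum-of-squares-raw ( a b -- a2+b2 )  dup * swap dup * + ;

3 4 sum-of-squares .       \ 9 + 16, prints 25
3 4 sum-of-squares-raw .   \ prints 25
```

In a language with named variables the two copies would be invisible; here
they are the first thing you notice when reading the raw version.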
AN ALTERNATIVE TO
An alternative to C
Interpreted languages like Python, JavaScript, and Ruby are incapable of
expressing low-level code that pokes at memory registers, uses pointers, or
inlines assembly.
For years these needs have been fulfilled by C and C++, followed by some niche
languages like D. Within the last 10 years there has been an explosion of
contenders for a better systems / low-level programming language, with Rust and
Zig being the most well-known.
Meanwhile Forth has been busy benchwarming. Never do I see it recommended
as an alternative to C in programming discussions. It's surprising, because
Forth has the same low-level mechanisms as C and can express the same high
level constructs. It has pointers. It can poke arbitrary memory. The default
memory allocation model, the dictionary, can grow arbitrarily big, unlike C's
`malloc`. Forth has `allocate` too, but it's heavily discouraged. Sticking to
growing the dictionary forces the programmer to construct their program in such
a way that it only uses the memory it absolutely needs at any point in
execution.
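Growing the dictionary looks like this in practice. A minimal sketch using
standard words (create, the comma word, allot, here); the names scores and
scores-sum and the values stored are hypothetical.

```forth
create scores  10 , 20 , 30 ,      \ three cells laid down in the dictionary

: scores-sum ( -- n )
  0  3 0 do  scores i cells + @  +  loop ;

scores-sum .                       \ prints 60

here                               \ remember where the dictionary ends
64 allot                           \ grow it by 64 address units of scratch space
here swap - .                      \ prints 64
```

There is no free list and no allocator state: the dictionary pointer moves
forward, and that is the entire memory model.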
All these C alternatives are also missing another Forth superpower: the REPL.
(Just like LISP!)
Forth is designed to play with hardware at run-time, being able to poke memory
with the keyboard as the program runs. The closest thing we've ever had to that
I think is Terry Davis's HolyC, that provides a C shell (lol), and could allow
the same technique.
And there's all the benefits that have been mentioned in previous sections:
cheap, fast, compact, simple, portable.
An alternative to LISP
If you've gotten this far, and love LISP, you have probably seen many parallels
to LISP throughout the essay, especially after the mention of REPL in the
previous section. The evidence is strong enough for me to believe that Forth
is some sort of LISP sibling.
LISP's birth took place in the sterile environment of academia at MIT, as a
language to further develop artificial intelligence. Forth was born in the
messy work world, with ever changing requirements and environments. I tell
people that Forth is a "working man's LISP", because of how closely it looks
like a LISP but born out of completely different circumstances.
Here are some direct code comparisons for a quick visual:
(+ 1 2 3 4 5) vs 5 4 3 2 1 + + + +
(defun my_func (a b) ...) vs : my_func ( a b -- ) ... ;
'my_func vs ' my_func
(LOOP (PRINT (EVAL (READ)))) vs BEGIN ACCEPT EVALUATE TYPE AGAIN
Is that not just the coolest thing to see in a while? A simple, fast,
low-level LISP is possible! It's called Forth! With a skip and a hop you can get
LISP-like list operations very easily:
: E+ 1 - 0 do + loop ;
1 2 3 4 5 5 E+
Now take into account how there is no garbage collection. Inline assembly is
possible. It is even more syntax-less than LISP. These are aspects I find myself
desiring more than hygienic macros, or data structures (lists) that also define
the language. Don't interpret that as me saying LISP sucks. It's me saying
I'm so incredibly glad there is a LISP-like with trade-offs I'm happy with.
Forth is a chimera of the Turing Machine and the Lambda Calculus.
UNSUITABLE CLASSES OF USE
The purpose of this section is to trick the reader into thinking of their own
reasons why they would use Forth.
Imagine a Venn diagram with these aspects:
    Against        |  Both    |  For
                   |          |
    security       |          |
    memory safety  |  cost    |  ???
    logical rigor  |  no-ast  |
    teamwork       |          |
Since file formats parsed by Forth are Forth programs themselves, we can't
really say Forth is suitable for security, since someone could simply craft a
file that is also some malicious Forth code. On the opposite end, if you work
in a secure environment, this is no longer a problem. So Forth is suitable to
use in all secure environments.
Working in a large team can be difficult with a growing list of vocabulary in
a custom Forth DSL. In contrast, Forth is an excellent language to work as a
solo developer, allowing you to work in your own comfy mental models.
Type systems and logical constructions are not concepts available to a Forth
programmer - which allows them to achieve some of the most unholy logic ever,
and get away with it! Of course, they can always develop their own type system
to work within.
While the grand total cost of Forth is significantly less than pretty much any
other mainstream language, the human costs can be much higher. A C or JavaScript
programmer costs significantly less because they can use abstractions that are
consistent across thousands of codebases. We can ignore the underlying
technology cost "just because". We have agreed as a society, or at least with
some sort of inherent bias, to using free compilers and interpreters for the
most part. This is why Forth is rarely, if ever, suitable for a
business, but thrives in individualism.
Those examples hopefully convey the duality of Forth's weaknesses.
CLOSING COMMENTARY
If there's one thing I'd like to drive home, it's the natural evolution of
Forth from the primordial assembly ooze, and all the power for the low cost
it brings along.
I look forward to what challenges the next generation of Forth programmers will
face, and the beautiful spawn that will rise from them.
An Attempt at a Compelling Articulation
of Forth's Practical Strengths
and Eternal Usefulness
Thank you for reading.
- Lee
Special thanks to paultag, austin, appledash, and jon, for peer reviewing!
2025-11-09T21:25-04:00