Recursive macros in C, demystified (once the ugly crying stops)

In which it becomes clear, the C Preprocessor was designed by a Kafka fan

So you have heard rumors whispered between peers, that a rare few people somehow manage to make compile-time recursion work in C? And you want to have some insight into how that might be possible??

I should warn you, you’re risking your sanity… but I’ll indulge you.

Wait, did I really just say that? I must be a glutton for punishment, because the macro system is, by far, the thing I like least about C.

C has many advantages that have led to its longevity (60 years as perhaps the most important language). It has many quirks due to its age, most of which are easy to look past. But despite 30 years writing C, I still bristle at C’s macro system.

That’s not just because there are many languages (including …

In which it becomes clear, the C Preprocessor was designed by a Kafka fan

So you have heard rumors whispered between peers, that a rare few people somehow manage to make compile-time recursion work in C? And you want to have some insight into how that might be possible??

I should warn you, you’re risking your sanity… but I’ll indulge you.

Wait, did I really just say that? I must be a glutton for punishment, because the macro system is, by far, the thing I like least about C.

That’s not just because there are many languages (including C++) with more modern takes on compile-time execution. Macros appear simple, but have subtleties that make them poorly suited for anything other than light wrappers.

Still, being C’s only compile-time execution capability (currently), it is still both critical and important. Critical, in that many venerable critical systems heavily depend on them, and wouldn’t compile without them. Important, in that it’s often the only way to abstract out complexity that would lead to safety or security issues if exposed, such as automatically adding sentinels or static type checks.

C macros being hard to use does discourage their overuse. It’s easy for too much abstraction to make it too difficult for other people to maintain the code, so in some ways, as painful as they are, I can find some things to appreciate, and do occasionally find reason to use them for something non-trivial.

But it doesn’t take much for a macro to be non-trivial, because, while C macros can look like functions, they cannot be called recursively (at least, not easily, as we will see).

I have never been able to find out why recursion is limited in C macros. It could have started off intentional, but compile time execution wasn’t really on people’s minds then; the challenge was abstracting over many platform differences as cheaply as possible.

I suspect the system evolved as needed in the early days, without really thinking about it as something that perhaps should support recursion. Certainly at some point, the question would get raised; but it’s easy to imagine:

The organic evolution of the macro system coupled with early success made it brittle, and hard to evolve.
People were worried about build issues like hanging compile times due to infinite loops, or crashing with no diagnostics, due to infinite recursion.

Depending on which of these two was more prominent, I could equally imagine the lack of recursion being an accident, or being an intentional choice.

However it happened, that die was cast in a completely different era—people have definitely woken up to the value of pushing as much into compile time as possible.

Either way, it all seems archaic and unnecessary now.

So, let’s roll up our sleeves and learn to cope with the issue!

Motivation

If you have anything interesting you need done with the preprocessor, you probably need to generalize over multiple items.

Maybe you want to add automatic sanity checks around parameters, or add automatic type checking. You might want to pre-fill arrays, or generate a list of functions based on some data. Generally, we should be able to do such things at compile time, and in cases like type checking, it is often incredibly challenging to defer the work till runtime. So the lack of a good compile time solution for such things is problematic.

I tend to reach for macros when they can remove the potential for human error; for example, calling an API wrong. That can indeed be adding automatic casts, enforcing that null terminators get added on variable argument arrays, etc.

In all of those scenarios, we would need something to move macros toward Turing completeness at compile time. But macros do not advertise support for either of the things we’d be looking for there: iteration, and recursion.

Thankfully, we can get there. But it’s not going to be easy.

To frame our discussion, let’s pick a simple, but highly valuable goal. We’re going to build a macro that counts the number of variable arguments in a function-like macro that accepts variable arguments.

Why that problem? Because from there it’s a short jump to dealing with some common issues, where we can remove large sources of human error:

Variadic functions (aka varargs) are error prone, because the implementation has to figure out where the arguments stop; the language doesn’t give you a way to know. The caller has to remember to follow your convention (like adding a null terminator). And that convention can often back fire, for example, when a null value is a valid argument . Being able to count the number of variadic arguments allows us to provide a single, general approach to dealing with this, and to not put the burden on the caller to have to count correctly.
There are plenty of cases where we’d want to apply a transformation to each argument of a variable argument function, like statically checking that parameters are all the same type, or automatically adding a layer of sanity checking to a third party API, so that the people calling your function can remain blissfully unaware. These don’t directly require counting, but once we can count recursively, it’s a small change to give ourselves a more general purpose map construct to make such transformations.

You might be surprised that the language doesn’t provide a way to count variable arguments in a macro, especially if you noticed the recent addition of a pre-defined macro called __COUNTER__ in the draft C2Y standard. The new __COUNTER__ macro is intended to make it easier to provide uniqueness for situations where macros need to generate identifiers and labels; it definitely isn’t for counting variadic macro arguments.

Apparently math is hard?

If there’s no primitive to count variadic macro arguments, well, we need to create one, right? And if we have to create one, that means it should be pretty easy, one would hope?

🦗🦗🦗

I see you’re skeptical. But an optimist who wanted to delve into compile-time coding in C for the first time might give it a go. But they will quickly find that the obvious approach below, that feels like it should work, absolutely does not work:

// The first macro... counts one argument, then we'd like it to recurse.
#define _COUNT_ONE(x, ...) + 1 _COUNT_TOP(__VA_ARGS__)
#define _COUNT_TOP(...)    __VA_OPT__(_COUNT_ONE(__VA_ARGS__))
#define COUNT(...)        (_COUNT_TOP(__VA_ARGS__) + 0)

Try to call this code, and…

🤮

Yup, it’ll barf.

What this doe-eyed attempt is trying to do, is generate an expression (at compile time) of the form (+ 1 + 1 + … + 0), by iterating over each argument at a call site (via recursion). It doesn’t care what the arguments are, it just wants to generate a + 1 for each valid argument, and a + 0 at the end, both to make it a valid expression and to handle the case where there are no arguments passed.

If we’re successful, the addition will all happen at compile time, and will be what C calls an integer constant expression. That means, the compiler will, at compile time, fold this into a single static integer. So, even if it feels inefficient, there is no run-time cost involved.

Why three macros? That seems a bit excessive, right?

Unfortunately, C currently doesn’t have an easy way to do the equivalent of an if() statement at compile time. It’s possible to create something with macros, but that’s just extra hackery.

Instead, we split the primary body into its own macro— _COUNT_ONE(), which adds the 1 + and then triggers recursion (we wish, anyway; again, the recursion part won’t work this easily).

The intent the behind _COUNT_TOP() macro is evaluating an exit condition for our recursion. Specifically, we want to stop when we have no more arguments left in the function. The builtin macro __VA_OPT__() allows us to do exactly that— the text inside the parentheses gets expanded only if there are arguments. And when there are no arguments, the text inside the parentheses is discarded.

This gives us a lightweight way to separate the one argument case from the two argument case, without need for a kind of if statement; the 0-case won’t generate a recursive call. To combine this with _COUNT_ONE(), we’d need a primitive that allowed us to specify a replacement that only expands when there are no arguments, which doesn’t come out of the box.

The outer COUNT() macro could be factored out trivially; I leave it because it keeps what’s going on a bit clearer. This macro is the actual entry point, adds the ‘0’ a single time the end, and wraps the whole thing in parentheses, which helps avoid running afoul of operator precedence rules.

If we directly combine it with _COUNT_TOP() in the obvious way, it would compute the right thing, but it would do it by adding the + 0 after EVERY term, and parenthesizing the expression in a way that would come off as odd if you were looking at the resulting C code. For instance, we’d be aiming for a three argument function to generate:

( + 1 ( + 1 ( + 1 ( + 0 ) + 0 ) + 0 ) + 0 ) If we were to generate this code, it’d be ugly, but most people would declare success and move on. But unfortunately, the above code does not work, and will never work, no matter how many iterations of the standard are released between now and the heat death of the universe— changing the behavior would be too likely to impact plenty of real code.

For example, let’s attempt to use the above implementation like so:

#include <stdio.h>

int
main()
{
printf("COUNT() = %d\n", COUNT(1, 2, 3));
return 0;
}

With clang, I get:

tmp.c:8:30: error: expected ')'
8 |     printf("COUNT() = %d\n", COUNT(1, 2, 3));
|                              ^
tmp.c:4:28: note: expanded from macro 'COUNT'
4 | #define COUNT(...)        (_COUNT_TOP(__VA_ARGS__) + 0)
|                            ^
tmp.c:3:39: note: expanded from macro '_COUNT_TOP'
3 | #define _COUNT_TOP(...)    __VA_OPT__(_COUNT_ONE(__VA_ARGS__))
|                                       ^
tmp.c:2:32: note: expanded from macro '_COUNT_ONE'
2 | #define _COUNT_ONE(x, ...) + 1 _COUNT_TOP(__VA_ARGS__)
|                                ^
tmp.c:39:30: note: to match this '('
tmp.c:4:27: note: expanded from macro 'COUNT'
4 | #define COUNT(...)        (_COUNT_TOP(__VA_ARGS__) + 0)

Wow, that’s a lot of error messages, saying little that makes sense.

I’m sure you can already get the feeling that, when you have a problem with your macros, it’s incredibly challenging to translate the resulting errors into what’s actually wrong. Here, it’s complaining about balancing parentheses, and with just a bit more complexity in our macros, it’d be incredibly easy for someone to spend 10 minutes trying to figure out where the parenthesis is missing, when no parenthesis is missing whatsoever.

Or, we could have taken advantage of the fact that C is perfectly happy to accept + + 0, and write 1 + instead of + 1.

Making that microscopic change, merely transposing two tokens, completely changes the error clang produces:

tmp.c:8:30: error: call to undeclared function '_COUNT_TOP'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
8 |     printf("COUNT() = %d\n", COUNT(1, 2, 3));

Hey, at least that message is concise. Never mind that it’s totally different, and also unrelated to the real issue.

Not counting on it

The C Preprocessor (which I will usually call CPP) is responsible for macro expansion and processing lines with a leading #. As we’ve said, it does not fully support recursion. As you might expect, that’s the core of the actual problem in our first attempt. Yet, the preprocessor happily thinks it did its job. We’ll see in more detail what’s going on, but the crux of this particular problem is how recursion is disallowed, not that it IS disallowed.

The problem here is that C macros are their own programming language, being used to generate C code. The macro language doesn’t model most of the interesting parts of the language, and it is quite easy to produce code that the preprocessor finds acceptable, that the compiler cannot understand (as we will see).

In both these cases, the preprocessor feels like it’s done its job, and passes off its work to the C compiler. The C compiler gets the generated code, and has no idea that macros were used. It calls the error as it sees it.

This disconnect between the preprocessor and the compiler is one of the things that makes macros in C so unfriendly.

If a macro expansion (basically the same as an ‘evaluation’) is recursive, the CPP decides “they can’t possibly have wanted recursion here, because that might loop forever, so this must be plain old text I have to substitute”. As a result, the emitted code will still contain the unexpanded macro.

How can we confirm this? If we can’t understand the resulting transformations, we’re going to end up stumbling around in the dark.

Many developers don’t know how to see what the C preprocessor actually produces. If the preprocessor successfully exits, we can see it’s output by stopping the compiler after the preprocessing phase, generally with the -E flag. If we don’t give a file name (via the -o flag), we should see the results on the terminal. And at least in the case of clang, we will even get output up to the point that we did something so wrong that the preprocessor gives us an error.

For the code above, running cc -E tmp.c works without errors, and dumps the output of CPP to my terminal.

That consists of a lot of stuff you might not expect to see. The output contains our code after the preprocessor has fully expanded it. But that full expansion includes the results of it preprocessing all of the header files we pulled in, which in our case was stdio.h, and any cascading dependencies it might have.

However, our code is easy to find in that noise. The last thing output will be our fully translated main() function, ready to be input into the C compiler. For the case where we add +1 at the beginning of the macro, we will see:

int
main()
{
printf("COUNT() = %d\n", (+1 _COUNT_TOP(2, 3) + 0));
return 0;
}

Here, the compiler doesn’t have to try to look up the symbol _COUNT_TOP; it knows that it doesn’t make sense to have a function call after a number with no operator in between.

When we reverse the + and the 1, the line of code is valid, as long as there’s a function C can resolve called _COUNT_TOP. Because there isn’t, the compiler bails.

That explains why we get two different errors, for such a minor change.

Because CPP and the compiler itself are oblivious to the execution of the other, and because we have to live with the fact that recursive macros aren’t errors to the CPP (silently passing them through unexpanded), it’s quite a bit of work for any compiler to even try to tell you that your problem here is attempting to use recursion in a macro. It could be done, and maybe it should be done, because otherwise, compilers are effectively trying to gaslight you into believing you have a syntax error of some sort.

Because the preprocessor is much more permissive than the compiler, without the compiler having any awareness of macros even existing is perhaps the most significant reason why writing non-trivial macros is so hard to do.

Again, we can hack our way around the recursion problem. The semantics are arcane and intricate enough that, even knowing the rules (and doing macro work with a copy of the standard at the ready), macro development is incredibly challenging the second you have any problem at all. Decades later, I often feel like I’m stumbling around in the dark when a macro I write blows up on me.

If this is all too intimidating, absolutely we can plagiarize our way to success. Though, personally, I really prefer not to cut-and-paste code from Stack Overflow, especially if it’s code I don’t understand. Similarly, while Claude and I are casual acquaintances, I do not tjava his code. It’s my unwillingness to using code I don’t understand in production that keeps me learning and growing. Instead, I avoid non-trivial macros, unless (as I said above), I make an exception when they will be a huge net positive for helping the developer, usually by removing potential failure modes, or with significant clarity improvements. Meaning, if you can use them to provide an abstraction that makes the code more robust, and is also not going to be hard to maintain if the need arises, then I’d consider it (even if you have to get Claude to write it). Here, automating size detection statically feels like a good enough use case for my tastes, because macros will not only make variadic functions easier to write, but also make it far easier to call them correctly.

So, I’d like to help those interested understand how to navigate through the pain, and shine a light on it, in the hopes that this is another area of the language that the modern standards committee can make massively better.

That doesn’t count

If we want to know how to circumvent the recursion restriction, we probably need to understand the detection mechanism we are attempting to evade.

It sure would be nice if we could debug by having our compiler give us intermediate expansions, up until the point that it breaks. This is not directly built into any compiler as far as I know. And my experience with macro debuggers has been that they have a hard time matching compiler semantics. I dusted one of them off when working on this article, and it was easy to get it to expand macros as valid that CPP barfed on, and vice versa. Despite the lack of tooling, I’ll walk through the expansion process in detail so we can all understand.

The rules for C macro evaluation are hard to explain in a way that’s simultaneously precise and clear. But for function-like macros, the main process of evaluating a macro boils down to:

Replace the macro text. Stashing aside the arguments used to call the macro, we replace the full macro with the textual body, within the larger token stream we’re processing.
Add placeholder tokens. Instances of ‘arguments’ in the body get replaced with placeholder tokens, to prevent them from being evaluated as macros in any nested argument expansion we might have. This includes __VA_ARGS__ and __VA_OPT__() invocations.
Evaluate preprocessor operators in the body that take operands , particularly # and ##. We don’t make good use of these operators in this article, so we won’t cover in too much depth. Note, however, that __VA_OPT__() is also a preprocessor operator that takes arguments. We inhibited expansion within the replacement text, when we were evaluating operators, but at the end of this phase, we put it back; it can get expanded in the next step.
Rescanning the body. The body is then scanned for more macros to expand, starting from our cursor in the token stream. The “input” head moves forward a token at a time, until we mind a macro to expand, or reach the end of the macro we’re evaluating. When we find a macro to expand, we recursively apply the algorithm.

There are some pretty large subtleties here. As long as a macro invocation starts in the scope of a rescan, the scanning head position can move past the end of the original macro. For example, consider this basic scenario:

#define CONCAT(X, Y)  X ## Y

int PRINT_INT = 100;

int
main()
{
printf("%d\n", CONCAT(PRINT_, INT));
}

The ## operator, seen used in the CONCAT macro, appends two tokens together, turning them into a single preprocessor token.

When the preprocessor evaluates CONCAT(), it will result in the token PRINT_INT. If PRINT_INT were a macro, the preprocessor would do further expansion. But it is not, so the preprocessor outputs PRINT_INT. The C compiler does not complain, because it sees a variable named PRINT_INT.

So far, depending on your background, the semantics may or may not being intuitive. But either way, what do you think should happen in this slightly more complex scenario?

#define CONCAT(X, Y)  X ## Y

#define PRINT_INT(N)  printf("look an integer %d\n", (N));

int PRINT_INT = 100;

int
main()
{
printf("%d\n", CONCAT(PRINT_, INT));
CONCAT(PRINT_,INT)(100);
}

It would be reasonable to think there’s an error here, but this code will compile and run. Why? In both cases, the preprocessor will generate the token PRINT_INT. In the first case, everything will happen the same way it did in our first example, and the compiler will see the variable PRINT_INT.

But, with the second use of CONCAT, the preprocessor will see that there is a parenthesis immediately following the token PRINT_INT. Since it has a function-like macro with that name, it will prefer the macro interpretation.

That’s true, even though we didn’t directly write PRINT_INT in the code, there. The effective result of the preprocessor’s expansion would look like this:

#define CONCAT(X, Y)  X ## Y

#define PRINT_INT(N)  printf("look an integer %d\n", (N));

int PRINT_INT = 100;

int
main()
{
printf("%d\n", 100);
printf("look an integer %d\n", 100);
}

That’s because C’s macros come in two flavors, with slightly different semantics:

Function-like macros, which take arguments, and syntactically LOOK like functions. As we see here, if the preprocessor sees a token with the same name as a function like macro, but it’s not used like a function like macro, it will pass it through, letting the C compiler resolve the token.
Object-like macros, the definitions of which look like variables, and do not take arguments. If the C preprocessor sees a left parenthesis after an object-like macro, that token will just be passed through directly to the compiler.

That is why the following code does NOT error. In fact, it runs quite happily:

#define OBJECT_LIKE_MACRO printf
#include <stdio.h>
int main()
{
OBJECT_LIKE_MACRO("Hello, world!\n");
}

Meaning, if we we have an object-like macro and it looks like we’re trying to call it, the C preprocessor isn’t going to call it. It’s just going to pass the token through, and let the compiler figure it out.

As you can see, the preprocessor’s philosophy is to do its job, and nothing else.

Another preprocessor subtlety that’s easy to miss, yet important to understand is:


ℹ️️	Arguments to function-like macros are expanded at the call site. For expansions that trigger inside the body where those arguments are used, the contents of the expanded arguments will not be available to be part of any additional expansion.

The preprocessor’s approach of substituting arguments with placeholder tokens during evaluation is pretty effective at stopping a bunch of accidental recursion that would be non-intuitive. Although, it’s not the problem for the recursion we’re trying to solve. The barrier we’re hitting is a subtlety that we haven’t discussed yet, but we’ll get to it soon enough.

Before that, let’s solidify our understanding by walking through the relevant steps with our invocation of COUNT(1) .

We’ve learned that, when COUNT() has an invocation of _COUNT_TOP(), the replacement text cannot lead to recursion. The expansion endures the rescan, and nothing attempts to expand it.

The rule that’s preventing the expansion is effectively an explicit anti-recursion rule. Formally, when we replace a macro, that macro is marked as currently being replaced. The mark stays, until that replacement is totally finished, including its rescan.

And, unfortunately, marked macros are ineligible for replacement. Note the rescan ineligibility is IN THE CALLING ENVIRONMENT. That’s going to cause us some grief in a few minutes.

For whatever reason, the original C89 standards committee referred to a macro marked as ineligible for expansion as painted blue.


❓	Why the term painted blue? Perhaps the standards committee at the time realized how miserable it was going to make future C developers? It’s not a term mentioned in the standard, but is often used when people try to explain the whole process.

I’m blue, because I can’t count on you

We can work around the problem, despite there being no way in C to opt out of your macro getting painted blue. But the work-around is going to be hard fought.

First, let’s give ourselves a cause to be optimistic: the restriction that’s preventing _COUNT_TOP() from recursively expanding is relaxed when the second _COUNT_TOP() gets replaced. If, instead, the restriction stayed in full force the entire time we’ve evaluating COUNT(), then you would not be able to do the following:

#include <stdio.h>
#define H4X(x) # x // convert to string
#define DUPE(token) H4X(token) H4X(token)

int
main() {
printf("%s\n", DUPE(h4x0r));
}

But that will absolutely work. The preprocessor will output exactly what I intended:

int
main()
{
printf("%s\n", "h4x0r" "h4x0r");
}

In C, two string literals next to each other are merged into a single literal at compile time, so this program prints:

h4x0rh4x0r

This leads us to believe that the blue paint wears off when the rescan moves past an expanded macro, after we’ve processed its replacement.

So our hypothesis right now might be that _COUNT_ONE() expanding inside COUNT() is fine, as long as it happens after the processing head moves past the start of where the macro was, after expansion.

If that’s the case, then all we need to do now is add another layer of indirection, right? Let the calling macro evaluate its recursive call on rescan!

That would make sense! We’ll rewrite our attempt at recursion to add a proxy layer to call _COUNT_TOP(), knowing it cannot re-expand.

#define _COUNT_ONE(x, ...) + 1 _COUNT_TOP(__VA_ARGS__)
#define _COUNT_TOP(...)    __VA_OPT__(_COUNT_ONE(__VA_ARGS__))
#define _COUNT_PROXY(...)  (_COUNT_TOP(__VA_ARGS__) + 0)
#define COUNT(...)         _COUNT_PROXY(__VA_ARGS__)

We’d now expect the replacement when our evaluation gets back up to COUNT(1, 2, 3) to look like:

(+1 _COUNT_TOP(2, 3) + 0))

So now, we’re thinking that escaped the paint, and _COUNT_TOP() will further expand, right? 🦗🦗🦗

It’s time to shatter our youthful optimism with the bitter pill of experience. Yes, we’ve added an extra indirection, but here’s what it expands to:

(+1 _COUNT_TOP(2, 3) + 0))

🤯

That’s the same as our intermediate expansion, but nothing further was done to the macro! I thought we made our way past the paint?

Clearly, there are subtleties to the rules somewhere. Clearly, the author is a jerk, and must have intentionally failed to mention it above.

It’s true I am a jerk, and it’s also true that I failed to mention the restriction that’s biting us right now. That’s because it’s a part of the journey— no tutorial or explanation I’ve ever seen made it clear to me that restrictions on expanded macros can survive past the call site.

So maybe I’m just obtuse, and torturing others because I once suffered long ago. Let’s go look at the relevant text in the C23 standard, shall we? It’ll either be enlightening… or further torture.


ℹ️️	After all parameters in the replacement list have been substituted and # and ## processing has taken place, all place-marker preprocessing tokens are removed. The resulting preprocessing token sequence is then rescanned, along with all subsequent preprocessing tokens of the source file, for more macro names to replace.If the name of the macro being replaced is found during this scan of the replacement list (not including the rest of the source file’s preprocessing tokens), it is not replaced. Furthermore, if any nested replacements encounter the name of the macro being replaced, it is not replaced. These non-replaced macro name preprocessing tokens are no longer available for further replacement even if they are later (re)examined in contexts in which that macro name preprocessing token would otherwise have been replaced.

ℹ️️

After all parameters in the replacement list have been substituted and # and ## processing has taken place, all place-marker preprocessing tokens are removed. The resulting preprocessing token sequence is then rescanned, along with all subsequent preprocessing tokens of the source file, for more macro names to replace.If the name of the macro being replaced is found during this scan of the replacement list (not including the rest of the source file’s preprocessing tokens), it is not replaced. Furthermore, if any nested replacements encounter the name of the macro being replaced, it is not replaced. These non-replaced macro name preprocessing tokens are no longer available for further replacement even if they are later (re)examined in contexts in which that macro name preprocessing token would otherwise have been replaced.

The sentence I bolded, I think we clearly understood; I did explain it above. The two sentences afterward, which I put in italics, is where the restriction that’s hurting us is specified.

For years, I took that to mean, “Even at the top level where the macro was called, no matter how many times subsequent replacement text gets rescanned, the called macro is still ineligible for expansion.”

What am I doing wrong? Why does the blue paint leave and then come back? This makes no sense. Right? Right?

😭😭

(I warned you, there’d be ugly crying. Better me than you though; I clearly deserve it).

Close n-counter

Perhaps it would be obvious to most readers that I had misinterpreted the standard. Perhaps, but even once I finally realized that my original interpretation couldn’t possibly be right, I still feel the above text from the standard is bit under-specified.

Specifically, what are the boundaries for “nested replacement”? Clearly it’s not the case that once a macro is called, a parent of the calling macro can never replace it again. So does it mean, “the resulting text can never, in any way be involved in an expansion which produces an expansion ever again?” requiring full taint tracking of the replacement through all future transformations as long as the text could possibly be rescanned, no matter what?

Or, does it only apply to any text with the name of the function we called, and the second that changes, it could change back?

Or maybe there are some different semantics?

Let’s roll up our sleeves, with another little experiment, to help us determine how we should interpret the above test. What we’d like to see, is, can _COUNT_ONE() produce a macro invocation with a different name, that we don’t try to expand until after leaving the context in which it and _COUNT_TOP() are painted, and then somehow replace that macro with _COUNT_TOP()?

If we can make that happen, then we’ll test to see if we can call the resulting function is callable. Let’s junk our previous experiment, and go back to where we were before:

#define _COUNT_ONE(x, ...) + 1 _COUNT_TOP(__VA_ARGS__)
#define _COUNT_TOP(...)    __VA_OPT__(_COUNT_ONE(__VA_ARGS__))
#define COUNT(...)        (_COUNT_TOP(__VA_ARGS__) + 0)

Now, we want to try to change the value of _COUNT_TOP to something else to escape detection. It’s got to be a valid macro, but one that we’re NOT going to end up expanding when rescanning _COUNT_ONE() or _COUNT_TOP(). If we call it _COUNT_INDIRECT, we don’t want _COUNT_INDIRECT(2,3) to evaluate until we pop all the way back up into _COUNT().

Sounds like a tall order, but there’s are a couple of facts we’ve already learned, that can help us:

We know that, if we have a function-like macro named X(), the preprocessor does not consider a bare X with no parenthesis next to it to be an invocation of X.
Rescans start at the replacement site, they don’t back up tokens.

So perhaps we can separate the function name and the arguments for a while, and somehow bring them together in a way where we could rescan it inside COUNT().

Having the function name separate from an argument list will both keep it from running, and will help us avoid a rescan the first time we get the left parenthesis to plop down in the right place.

We just need a way to add a spacer of some sort that goes away at rescan, to keep apart the name _COUNT_ONE from its argument list. That is, we’d conceptually like to do:

#define _COUNT_ONE(x, ...) \
+ 1 __VA_OPT__(_COUNT_ONE <<SPACER>> (__VA_ARGS__))

But we’re going to change the name of _COUNT_ONE to _COUNT_INDRECT. If we just do:

#define _COUNT_INDIRECT _COUNT_ONE

Then we’re going to re-generate _COUNT_ONE on the rescan, which is painted blue.

So it’s actually _COUNT_INDIRECT where we currently have the dire need to postpone evaluating it.

Therefore, we need to turn _COUNT_INDIRECT into a function, and keep THAT identifier separated from the parentheses that trigger it, via our to-be-written spacer. So this is the definition we want instead:

#define _COUNT_INDIRECT() _COUNT_ONE

We’ll also need to add the empty parameter list to invoke it on the other side of our spacer. So here’s what we really need _COUNT_ONE to look like:

#define _COUNT_ONE(x, ...) \
+ 1 __VA_OPT__(_COUNT_INDIRECT <<SPACER>> ()(__VA_ARGS__))

As we cascade up with replacements, we want the spacer to disappear, leaving:

_COUNT_INDIRECT()(2,3)

Remember, in the contents where we replace the spacer, we will have advanced the input head past _COUNT_INDIRECT. So if the spacer expands to nothing, the rescan will know that () isn’t a replaceable macro, and go on with its day. But it will leave _COUNT_INDIRECT next to the () so it can be called above, to generate the correct name.

You can count on me being empty inside

It’s not hard to get something to expand to the empty string.

And then, once we get back up to COUNT(), we will evaluate _COUNT_INDIRECT(), which will leave us with _COUNT_ONE(2,3).

Once we get that far, will tinker to find if there are any conditions where we can re-invoke _COUNT_ONE(). Because hey, we already know there are, we just might not know WHAT they are.

It’s not hard to create a spacer. We can define a macro named EMPTY like this:

#define EMPTY

And use that as our spacer. If you’ve ever peeked into someone’s recursive macros and been dumbfounded with what you saw, there’s a good chance you saw a macro named EMPTY and at the time were thinking, “what the heck could that possibly do?”

Now you know. But when you did see it, it was probably defined as a function-like macro instead:

#define EMPTY()

Using it to postpone evaluation is as easy as:

_COUNT_INDIRECT EMPTY() () (2, 3)

Why would we use a function-like macro for our spacer? Doing so makes it easy for us to control how long we want to wait before we’re able to evaluate what we’re separating.

Specifically, let’s say we nest our recursive call several levels down. Every level, we use the same trick recursively, separating EMPTY() apart… using another EMPTY() invocation.

For instance, if we need to postpone a total of three layers, we could write:

_COUNT_INDIRECT EMPTY EMPTY EMPTY() () () () (2, 3)

Honestly, EMPTY() is a confusing name that detracts from what’s happening. we can encapsulate this into a more readable macro… or macros, one for each level we might want to postpone evaluation:

#define POSTPONE1(macro_name, args) macro_name EMPTY() args
#define POSTPONE2(macro_name, args) macro_name EMPTY EMPTY()() args
#define POSTPONE3(macro_name, args) \
macro_name EMPTY EMPTY EMPTY()()() args

We’ll only need the first of these by the way; but now you know, if you never have a use case where you have deeper nesting (though if you do, maybe worry your macros are getting too complex to be readable?)

Our POSTPONE1 macro encapsulates the unintuitive EMPTY() madness for us, allowing us to instead write:

#define _COUNT_ONE(x, ...) \
+ 1 __VA_OPT__(POSTPONE1(_COUNT_INDIRECT, ())(__VA_ARGS__))

The EMPTY() invocation is hidden inside POSTPONE1() That makes the code we use here, it’s more explicit that we’re going to postpone expanding _COUNT_INDIRECT. We’ve even added the args we want to call it with as a second parameter, to make it more clear what we’re doing, instead of having a bunch of consecutive argument lists detached from their identifier, which many engineers find alien and incomprehensible.

So far, the rest of what we have is:

#define _COUNT_TOP(...)   __VA_OPT__(_COUNT_ONE(__VA_ARGS__))
#define COUNT(...)        (_COUNT_TOP(__VA_ARGS__) + 0)
#define _COUNT_INDIRECT() _COUNT_ONE

Let’s now see if it is getting the right text up to the top, even though we won’t yet try to get it to expand (and thus, we will get a compiler error). If we invoke as COUNT(1, 2) again, CPP will generate the following expansion:

(+1 _COUNT_INDIRECT ()(2) + 0)

That… looks exactly like what we were hoping to see. As we wanted, the pieces came together, and _COUNT_INDIRECT() did NOT get evaluated during the rescan process. If we had made a mistake, and the rescan had happened, it would have been replaced with _COUNT_ONE, which we already know we could not trigger for re-evaluation.

Okay, let’s now see what happens if we try to force the expansion of the above, to finally get an answer to our question as to the true scope of blue paint.

Let’s wrap the body of COUNT() with a call to a passthrough macro, which we’ll name EVAL():

#define EVAL(...)  __VA_ARGS__
#define COUNT(...) EVAL((_COUNT_TOP(__VA_ARGS__) + 0))

Based on our rules above, EVAL() will substitute, and then rescan. So this passthrough macro forces the rescan we want.

The test case for our macro should be:

#include <stdio.h>

int
main()
{
printf("%d\n", COUNT(1, 2));
}

You might notice, this actually compiles. And if you run it, it gives the right answer.

🥹

Wow, are we done?

🤣😂🤣😂🤣😂🤣😂🤣😂🤣😂🤣😂🤣😂🤣😂🤣😂🤣😂🤣😂🤣😂🤣

While we may be farther than expected, we don’t have something that will fully work. What’s important is that we’ve shown that blue paint FULLY gets removed from a macro when:

We have finished all rescans of that macro where it was called; and
EITHER the macro does not appear in the replacement text, or we completely exit the recursive expansion.

Unfortunately, we still have a problem. If you change your invocation to COUNT(1, 2, 3) then your code will no longer compile. Instead, it will expand to:

(+1 +1 _COUNT_INDIRECT ()(3) + 0))

What’s happening should be clear at this point: while we are iterating over arguments, we are stopping after the second iteration.

Great, that’s easy to fix. We can do so by…

Passing the output of EVAL() to another call of EVAL(), like so:

#define COUNT(...) EVAL(EVAL((_COUNT_TOP(__VA_ARGS__) + 0)))

Before we address the obvious complaint, I’d like to point out that the extra EVAL() doesn’t mess up the results if we change our test invocation to COUNT() or COUNT(1).

EVAL() will expand, but it doesn’t break anything when there are no macros in the text passed to it that are eligible for expansion. Once the last possible expansion happens, it will just keep copying its arguments to the replacement text, per the algorithm above; the associated rescans have nothing to do.

Given that, it’s common to say, “Nobody needs more than 10 arguments!” and do something like:

#define _E(...) __VA_ARGS__
#define EVAL(...) _E(_E(_E(_E(_E(_E(_E(_E(_E(_E(__VA_ARGS__))))))))))

Although, more commonly, you’d probably see people do more expansions, perhaps enough to accommodate 20 or 50 arguments. Even there, I’ve seen some Microsoft APIs that convince me, that’s too low a limit, so I’d suggest at least 100 iterations.

We can use a much smarter approach that can still fairly compactly get us as many expansions as we think we might ever need.

I’d guess that, if we saw a function with 1024 arguments, it would be explicitly TRYING to break our macro. So let’s do the base 2 version of the Spinal Tap “one more”, and get to 1025 expansions:

#define _E1(...)    __VA_ARGS__
#define _E8(...)    _E1(_E1(_E1(_E1(_E1(_E1(_E1(_E1(__VA_ARGS__))))))))
#define _E64(...)   _E8(_E8(_E8(_E8(_E8(_E8(_E8(_E8(__VA_ARGS__))))))))
#define _E256(...)  _E64(_E64(_E64(_E64(__VA_ARGS__))))
#define _E1024(...) _E256(_E256(_E256(_E256(__VA_ARGS__))))
#define EVAL(...)   _E1024(__VA_ARGS__)

The last expansion really is gratuitous; we could name _E1024() toEVAL() and stop on the power of two, but what fun is that?

Before writing this article, I generally stopped at 256 iterations, but I was curious as to whether the compiler’s CPP was smart enough to skip unnecessary evaluations, given it’s a common idiom. I ramped the number of iterations up to 65,636, and built a minimal program that would trigger 100 different calls to eval. On a Macbook Pro using clang, that many expansions took .13 seconds; when setting EVAL to 256 expansions, compiling took .06 seconds. When using gcc, both times were a bit more than twice as expensive.

So no, we definitely shouldn’t keep going until we get to 2^64; it’s unlikely to work. But, I’ve never noticed compile time impact due to 256 iterations, even in programs using recursive macros extensively, and can recommend it, but it also seems 2^16 expansions is totally acceptable for cases where you might need it (probably when you’re iterating over something other than call arguments).

NOW we can declare victory.

Hang on, I’m going to go cry again, but this time tears of joy. 😭😭

You can count me out

While the EVALapproach works, you may find it feels kludgy. Why ask the preprocessor to do all that additional work? maybe it can recognize the idiom and short-circuit a bunch of work with an EVAL()? Is there really no better way? There is a technique that facilitates the kind of compile-time recursion we’re trying to do here, without unnecessary layers of expansion. The basic idea is related to the concept of continuations; every macro passes a bag of state to the ‘next’ function-like macro that should get called, doing it in a way that allows us to BOTH dodge blue paint, AND terminate without oblivious expansions. However:

The approach is MUCH more complicated than what we’ve done so far (and I hope anyone reading should agree what we’ve done today is already much too complicated).
Having used the continuation approach, I find it too brittle (as implementable in CPP), and significantly harder to debug than more traditional recursion work-arounds.
The extra evals in the approach we’re using tend to be cheap enough, that the massive amount of extra complexity buys you virtually nothing.
I suspect the continuation technique cannot be done without relying on undefined behavior (specifically, how the compiler chooses to resolve cases where there are multiple possible ambiguous valid expansions). But, it’s still really cool, and if you’re interested (and really want to risk your head going 🤯), check out this brainf— interpreter written entirely in C macros.

I should note, there are still other ways we could avoid such deep compile-time recursion, or even all recursion if we aren’t trying to get close to an arbitrary number of arguments. It’s all far uglier (and way more challenging to understand), and what we’re doing is performant, enough so that added ugliness isn’t merited, IMHO.

Though, one thing I do recommend you do differently from what we’ve done today, is that you should add a common prefix to all your names, to remove the risk of name collisions — many libraries already define things like EVAL() and EMPTY()for themselves, and you never know what might make its way into your system, and cause chaos.

You can’t be bothered? Okay, I’m a pushover (and feel strongly about the issue). So I’ve provided a complete version for you at the end of this article.

Count your blessings

This journey has taken us further down the macro rabbit hole than anyone should ever have to go, yet we didn’t have to sacrifice ALL of our remaining sanity (it helps that I was already tapped out). As much as C actively worked to thwart us from our goal, we got there; now we have a good tool for better compile-time checking of C code in a number of cases, such as when we want to support variable argument functions. And, with incredibly minor changes, we can re-use the code to operate on each argument, to support other things we might want to automate to make our code more robust. For instance, we could add automatic casts or calls to runtime type checkers, or so on. All you really need to do is, take the text we insert (generating the addition that the compiler can easily fold into a count), and replace it with an invocation of a macro that the caller passes in, passing that macro the current argument. Yes, the C standard committee has good reason to disallow recursion— removing the restriction would undoubtedly break existing code. Yet, the difficulty of the whole exercise hopefully demonstrates the need for some quality-of-life improvements in C2Y, all of which can be done without significant backward compatibility risk.

Most of all, I’d love to be able to do far more meaningful work at compile time much more sanely, minimizing my use of macros (as I know many others would too). For that, the language should extend constexpr capabilities, giving us good, full-fledged constexpr functions, with as few limitations as possible. That may be a tall order for C2Y, given the complexity of that change. But even still, I’d want to improve the macro system too:

Add a builtin macro, __VA_COUNT__. We need it to avoid null truncation problems for varargs, and shouldn’t have to keep reinventing the wheel (or doling out __VA_ARGS__ in our code like it’s Halloween candy).
Add another builtin macro, __VA_EMPTY__(...), which would be the inverse of __VA_OPT__(...); its arguments would only get expanded only when the are NO variadic arguments. This would make it even easier to ensure people have good tools to easily terminate the kind of recursion we did today. Assuming constexpr functions take longer to do well, A __VA_EMPTY__ macro can also bridge the gap of not having a good compile-time IF available for complex use cases; it would be much easier to cobble together a reasonably robust one one in macros than it is today by pairing it with __VA_OPT__.
Add a __VA_EVAL__(...)which simply expands its contents, with two semantic changes to the process that other macro goes through:
When an expansion fully finishes, the entire macro gets rescanned again, as many times as necessary, until the full expansion reaches a fix-point (meaning, no more expansions are possible). That would allow us to worry about artificial limits on EVAL() macros; they’d just stop when they should stop.
Every top-level rescan should remove all blue paint generated during one expansion, before starting the next expansion (alternately, it could forego further painting macros in the first place).
A more general purpose __MAP__(body_macro, state, ...)would be useful (though, to be fair, much less of a problem to build robustly if the rest of the above were present).
While waiting for constexr functions, it would be valuable to have a preprocessor built-in __SHA256__ that… replaces the contents passed to it (after expansion) with the SHA256 hash of those contents. I’ve seen more than a few cases where this would be incredibly useful for saving both startup costs, and ongoing costs. No? Why can’t we have nice things??!! Interestingly, doing a compile-time only implementation of SHA-256 using only C macros may seem possible, but I’ve built it (for strings one block in length), and it completely overwhelms both GCC and Clang, even with significantly reduced rounds.

If the committee were to adopt most of the above, it sure would make one of the ugliest legacies the language has needed to carry forward far more tolerable.

We mere mortals would be able to get the gist of expansion rules more easily, if we could tell ourselves, “macro recursion is disallowed unless you use __EVAL__()”. That removes a huge source of confusion, but still leaves us with the problem of getting termination conditions right, which are a bit tricky. Adding __VA_EMPTY__() makes it pretty easy for someone to get the exit condition right in the common case where our recursion is being used to iterate over the arguments passed to function-like macros. __EVAL__() would then essentially be able to function as a “do what I mean” operator, for most of the things people bang their head against when trying to write useful, robust macros to hide C’s unnecessary complexity. With just these two things, you would be able to implement __VA_COUNT__(…)fairly simply:

#define __REST__(x, ...) __VA_ARGS__
#define __VA_COUNT__(...) \
__VA_EMPTY__(0) \
__VA_OPT__(1 + __EVAL__(__VA_COUNT__(__REST__(__VA_ARGS__))))

Sure, it’s still a little obtuse, but it’s far more sane than our final product.

Does this count?

I said there was no price for my full implementation, and there’s not. But, if you really feel obliged, then spend a minute indulging me.