Unswitching loops for fun and profit
xania.org·1d

Written by me, proof-read by an LLM. Details at end.

Sometimes the compiler decides the best way to optimise your loop is to… write it twice. Sounds counterintuitive? Let’s change our sum example from before to optionally return a sum-of-squares:
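A minimal sketch of what such a function might look like. The name sum_maybe_squared and the std::vector parameter are assumptions for illustration, not the post's actual listing:

```cpp
#include <vector>

// Hypothetical sketch: adds up the values, or their squares when
// `squared` is true. The flag is tested on every iteration.
int sum_maybe_squared(const std::vector<int> &values, bool squared) {
  int sum = 0;
  for (int value : values) {
    sum += squared ? value * value : value;
  }
  return sum;
}
```

Note that squared is a plain function argument, so nothing inside the loop can change it.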

At -O2 the compiler turns the ternary into the equivalent of sum += value * (squared ? value : 1); it uses a multiply-and-add (mla) instruction for the arithmetic, and conditionally picks either value or the constant 1, which avoids a branch inside the loop.
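Written back out as source, that transformation could be sketched roughly as follows. The function name sum_select_multiply and the std::vector parameter are assumptions for illustration, and the exact instructions emitted will depend on the compiler and target:

```cpp
#include <vector>

// Hypothetical sketch of the branch-free form: pick the multiplier first,
// then do a single multiply and add per element.
int sum_select_multiply(const std::vector<int> &values, bool squared) {
  int sum = 0;
  for (int value : values) {
    // Either `value` (to square it) or 1 (to leave it unchanged); a compiler
    // can typically lower this to a conditional select rather than a branch.
    const int factor = squared ? value : 1;
    // One multiply feeding one add: the shape a multiply-and-add
    // instruction can cover.
    sum += value * factor;
  }
  return sum;
}
```

The loop body is the same for both settings of the flag, at the cost of doing a multiply even when no squaring was asked for.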

However, if we turn the optimisation level up, the compiler uses a new approach:

Here the compiler realises the bool squared value doesn’t change throughout the loop, and decides to duplicate the loop: one copy that squ…
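Done by hand in source, loop unswitching could look something like the sketch below: the loop-invariant test is hoisted out, and each branch gets its own specialised copy of the loop. The name sum_unswitched and the std::vector parameter are assumptions for illustration, not the compiler's actual output:

```cpp
#include <vector>

// Hypothetical hand-unswitched version: `squared` is checked once, outside
// the loops, instead of once per element.
int sum_unswitched(const std::vector<int> &values, bool squared) {
  int sum = 0;
  if (squared) {
    // Copy of the loop specialised for squared == true.
    for (int value : values) {
      sum += value * value;
    }
  } else {
    // Copy of the loop specialised for squared == false: no multiply at all.
    for (int value : values) {
      sum += value;
    }
  }
  return sum;
}
```

The trade-off is code size: two loops instead of one, in exchange for simpler loop bodies.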
