Bit counting and geometric series

06 Nov, 2025 *

In this blogpost we make a connection between the operation of bit counting and a nice properties of unsigned binary numbers.

An identity

Consider x to be an unsigned strictly positive number. Let j be the first entry of our binary number that is unequal to 0, i.e. we can write it in binary notation as x=xn−1...xj+110...0.

Consider

x−1=∑j≤i≤n−1xi2i−1=2j+∑j<i≤n−1xi2i−1

We have the geometric series:

∑i∈{0,...,j−1}2i=1−2j1−2=2j−1

Rearranging the terms yields:

2j−1=∑i∈{0,...,j−1}2i

Plugging that into above equation we obtain:

x−1=∑i∈{0,...,n−1}yi2i

where yi=1 for i<j , yj=0 and yi=xi else, i.e. we can write as binary number x−1=xn−1...xj+101...1.

If we perform an bitwise AND of this number with x we will obtain

x∧x−1=xn−…

06 Nov, 2025 *

In this blogpost we make a connection between the operation of bit counting and a nice properties of unsigned binary numbers.

An identity

Consider x to be an unsigned strictly positive number. Let j be the first entry of our binary number that is unequal to 0, i.e. we can write it in binary notation as x=xn−1...xj+110...0.

Consider

x−1=∑j≤i≤n−1xi2i−1=2j+∑j<i≤n−1xi2i−1

We have the geometric series:

∑i∈{0,...,j−1}2i=1−2j1−2=2j−1

Rearranging the terms yields:

2j−1=∑i∈{0,...,j−1}2i

Plugging that into above equation we obtain:

x−1=∑i∈{0,...,n−1}yi2i

where yi=1 for i<j , yj=0 and yi=xi else, i.e. we can write as binary number x−1=xn−1...xj+101...1.

If we perform an bitwise AND of this number with x we will obtain

x∧x−1=xn−1...xj+10...0. which is simply x with the rightmost one removed. As we will see below this is a useful identity when one wants to count the number of one bits in a number.

Naive Code

A simple way to count the number of bits is simply to always extract the rightmost bit by performing bitwise and with 1. We’ll than right-shift the bits and continue until only 0 bits are left.

#include <assert.h>
#include <stdint.h>

#define uint uint32_t

int bitcount(uint x);

int main(void)
{
uint x = 10; // 2^3 + 2^1 -> 0b0...01010
assert(bitcount(x) == 2);
}

int bitcount(uint x)
{
int b;

for(b = 0; x != 0; x >>= 1) {
if(x & 01)
b++;
}
return b;
}

The corresponding assembly can be obtained by inputting the code into godbolt.

main:
xor    eax,eax
ret
data16 data16 data16 cs nop WORD PTR [rax+rax*1+0x0]
bitcount:
xor    eax,eax
test   edi,edi
je     1160 <bitcount+0x20>
mov    ecx,edi
nop    DWORD PTR [rax+rax*1+0x0]
mov    edx,edi
and    edx,0x1
add    eax,edx
shr    ecx,1
cmp    edi,0x1
mov    edi,ecx
ja     1150 <bitcount+0x10>
ret

Note that the above algorithm has obvious weakness because it runs as many times as the position we have the leftmost one bit at. That means a number like 0b10...0 will make 32 comparisons and 31 shifts.

Using the identity

Using above identity we can do better: In each round we simply delete the rightmost one. We are done once we have only zero bits left.

#include <assert.h>
#include <stdio.h>
#include <stdint.h>

#define uint uint32_t

int bitcount(uint x);

int main(void)
{
uint x = 10; // 2^3 + 2^1 -> 0b0...01010
assert(bitcount(x) == 2);
}

int bitcount(uint x)
{
int b;

b = 0;
for(b = 0; x != 0; x &= (x-1))
b++;
return b;
}

We can see that we need less instructions.

main:
xor    eax,eax
ret
data16 data16 data16 cs nop WORD PTR [rax+rax*1+0x0]
bitcount:
xor    eax,eax
test   edi,edi
je     115b <bitcount+0x1b>
cs nop WORD PTR [rax+rax*1+0x0]
inc    eax
lea    ecx,[rdi-0x1]
and    ecx,edi
mov    edi,ecx
jne    1150 <bitcount+0x10>
ret

Note that not only have we better looking assembly but also we perform less loops. For example in the same case like above we will only perform one loop. That means on sparse bit sequences the second algorithm will perform much better.

Popcount

Note that depending on our processor we may leverage popcnt instruction like so:

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define uint uint32_t

int bitcount(uint x);

int main(){
uint x = 10;
assert(bitcount(x) == 2);
return 0;
}

int bitcount(uint x)
{
return __builtin_popcount(x);
}

The assembly looks as follows:

main:
xor    eax,eax
ret
data16 data16 data16 cs nop WORD PTR [rax+rax*1+0x0]
bitcount:
popcnt eax,edi
ret

where we need to provide -mpopcnt to compiler to let the compiler know that popcnt instruction is available. This way we don’t need any bit hacks and have a single instruction to perform our operation. See here for corresponding godbolt link.

Conclusion

I hope this blogpost served as a good introduction to bit counting and how we can use elementary results from Math to improve the naive algorithm. If you like to exchange ideas on GPU programming or low level programming in general you can contact me on my Linkedin. The above observation is inspired by an exercise of the K&R book on C programming. I intend to solve a large portion of the K&R exercises in the next weeks and you can check out the solutions on my Github.

An identity

An identity

Naive Code

Using the identity

Popcount

Conclusion

Similar Posts