EPI - 4.1 Parity

4.1 Parity

Problem Statement

Count the number of set bits in an integer x.

Solution

1
2
3
4
5
6
7
8
def count_bits(x):
num_bits = 0
while x:
num_bits += x & 1
x >>= 1
return num_bits

Time: O(N) where N is the number of bits necessary to represent the integer x.

Explanation

Let’s say you give this function the integer 10. This is 1010 in binary. The function should return 2 bits because there are 2 set bits (i.e. bits equal to 1). In the algorithm, you’re looping until x is equal to 0. With each iteration, you are shifting x to the right by 1; 1010 -> 0101 -> 0010 -> 0001 -> 0000. In each iteration, you are also getting a cumulative value of num_bits as the current value of x masked with 1 or 0001.

Parity

According to the book, the parity of a binary word is 1 if the number of set bits is ODD and the parity of a binary word is 0 if the number of set bits is EVEN. Building on the previous section, if you have the binary word 1010 then the parity is 0 since there are an even number of set bits. The challenge the book gives is, “How do you compute the parity of a very large number of 64 bit words? If you think about count_bits above, couldn’t you just get the number of bits and execute return 0 if num_bits % 2 == 0 else 1? You can! But, the book presents an alternative, cleaner way below.

1
2
3
4
5
6
7
8
def parity(x):
result = 0
while x:
result ^= x & 1
x >>= 1
return result

Most of the code is the same except result ^= x & 1. Looking at the previous example of 12 or 1100 we’re going to bitwise xor assign to result. So for 3 & 1 result will go from 0 -> 1
and then on the next shift 1 & 1 will make result = 1 xor 1 which is 0. So the 12 has a parity of 0.

The book then describes performance improvements on this.

x & (x - 1) equals x with its lowest set bit erased.

Why is this beneficial to know? Because when you care about only the set bits, you can iterate through them faster by “skipping” the unset bits and getting x to 0 far quicker. Let’s see how that looks:

1
2
3
4
5
6
7
8
def parity(x):
result = 0
while x:
result ^= 1
x &= x - 1
return result

Going back to the existing example of 1100. In the first loop, result = 1 and x - 1 = 11 or 1011 so x = 1100 & 1011 which is x = 1000 or 8. In the final loop, we have result = 1 xor 1 which is 0 and then finally x - 1 = 7 or 0111 and then x = 1000 and 0111 which is 0 and this terminates the loop.

The time complexity of this algorithm is O(k) where k is the number of bits in x.

The book asks, “how can we improve this?” This can be improved in two ways: 1) process multiple bits at a time and/or 2) cache results in an array based lookup table.

Let’s say we have the following 64 bit word:

0000000000000000 0000000000000000 0000000000000000 0000000000000001

If you have a 64 bit word, you can cache each 16 bit subword in an array and reference that. Let’s try a smaller example, let’s say you have 10010101 which is 149. You can process this 2 bits at a time to have pre-computed parity values as such: 00 -> 0, 01 -> 1, 10 -> 1, 11 -> 0 or [0, 1, 1, 0]. Then if you go 2 bits at a time then you can look up the result as: 1 1 1 1 so this is of parity 0 because there are even numbers of 1s.

Here is the solution the book presents:

1
2
3
4
5
6
7
def parity(x):
MASK_SIZE = 16
BIT_MASK = 0xFFFF
return (PRECOMPUTED_PARITY[x >> (3 * MASK_SIZE)] ^
PRECOMPUTED_PARITY[(x >> (2 * MASK_SIZE)) & BIT_MASK] ^
PRECOMPUTED_PARITY[(x >> MASK_SIZE) & BIT_MASK] ^
PRECOMPUTED_PARITY[x & BIT_MASK])

Time complexity: O(n/L) where n is the word size and L is the width of the words for which the results are cached. That’s a lot to digest, so let’s break it down.

BIT_MASK is a varible that has all bits set to 1. The MASK_SIZE is the number of bits you are working with/want to shift. Here we have 16 but it could also be set to 2 for the 2 bit example earlier.

First, you shift by x >> 3 * MAX_SIZE to get the last MAX_SIZE bits. You lookup that value in the parity array and xor it with the next statment of x >> 2 * MASK_SIZE. Working with the 64 bit example from earlier, this will get you the first and second group of 16 bit subwords. You can then just bitwise-AND with the bitmask to get the second index and finally apply the same for the 3rd and 4th groups.


TIP: The XOR of two bits is defined to be 0 if both bits are 0 or both bits are 1; otherwise it is 1. XOR is commutative and associative.

This parity computation can be done in O(log n) by exploiting the tip above. Let’s say you have 10010101. The parity of this is the same as the parity of 1001 xor 0101 = 1100 which is then XORed again as 11 ^ 00 = 11 and 1 xor 1 = 0. Isn’t that cool? You can keep XORing half the word size with itself to ultimately arrive at the solution of:

1
2
3
4
5
6
7
8
9
def parity(x):
x ^= x >> 32
x ^= x >> 16
x ^= x >> 8
x ^= x >> 4
x ^= x >> 2
x ^= x >> 1
return x & 0x01

Time complexity: O(logn)

The last step is to extract the last bit of the result by using a mask of 1.