EPI  4.1 Parity
4.1 Parity
Problem Statement
Count the number of set bits in an integer x
.
Solution


Time: O(N) where N is the number of bits necessary to represent the integer x.
Explanation
Let’s say you give this function the integer 10
. This is 1010
in binary. The function should return 2 bits because there are 2 set bits (i.e. bits equal to 1). In the algorithm, you’re looping until x is equal to 0. With each iteration, you are shifting x to the right by 1; 1010
> 0101
> 0010
> 0001
> 0000
. In each iteration, you are also getting a cumulative value of num_bits as the current value of x masked with 1 or 0001
.
Parity
According to the book, the parity of a binary word is 1 if the number of set bits is ODD and the parity of a binary word is 0 if the number of set bits is EVEN. Building on the previous section, if you have the binary word 1010
then the parity is 0 since there are an even number of set bits. The challenge the book gives is, “How do you compute the parity of a very large number of 64 bit words? If you think about count_bits
above, couldn’t you just get the number of bits and execute return 0 if num_bits % 2 == 0 else 1
? You can! But, the book presents an alternative, cleaner way below.


Most of the code is the same except result ^= x & 1
. Looking at the previous example of 12 or 1100
we’re going to bitwise xor assign to result. So for 3 & 1
result will go from 0 > 1
and then on the next shift 1 & 1
will make result = 1 xor 1
which is 0. So the 12 has a parity of 0.
The book then describes performance improvements on this.
x & (x  1)
equalsx
with its lowest set bit erased.
Why is this beneficial to know? Because when you care about only the set bits, you can iterate through them faster by “skipping” the unset bits and getting x
to 0 far quicker. Let’s see how that looks:


Going back to the existing example of 1100
. In the first loop, result = 1
and x  1 = 11 or 1011
so x = 1100 & 1011
which is x = 1000 or 8
. In the final loop, we have result = 1 xor 1 which is 0 and then finally x  1 = 7 or 0111 and then x = 1000 and 0111 which is 0 and this terminates the loop.
The time complexity of this algorithm is O(k)
where k is the number of bits in x
.
The book asks, “how can we improve this?” This can be improved in two ways: 1) process multiple bits at a time and/or 2) cache results in an array based lookup table.
Let’s say we have the following 64 bit word:
0000000000000000 0000000000000000 0000000000000000 0000000000000001
If you have a 64 bit word, you can cache each 16 bit subword in an array and reference that. Let’s try a smaller example, let’s say you have 10010101
which is 149
. You can process this 2 bits at a time to have precomputed parity values as such: 00
> 0
, 01
> 1
, 10
> 1
, 11
> 0 or [0, 1, 1, 0]
. Then if you go 2 bits at a time then you can look up the result as: 1 1 1 1
so this is of parity 0 because there are even numbers of 1s.
Here is the solution the book presents:


Time complexity: O(n/L) where n is the word size and L is the width of the words for which the results are cached. That’s a lot to digest, so let’s break it down.
BIT_MASK
is a varible that has all bits set to 1. The MASK_SIZE is the number of bits you are working with/want to shift. Here we have 16 but it could also be set to 2 for the 2 bit example earlier.
First, you shift by x >> 3 * MAX_SIZE
to get the last MAX_SIZE bits. You lookup that value in the parity array and xor it with the next statment of x >> 2 * MASK_SIZE
. Working with the 64 bit example from earlier, this will get you the first and second group of 16 bit subwords. You can then just bitwiseAND with the bitmask to get the second index and finally apply the same for the 3rd and 4th groups.
TIP: The XOR of two bits is defined to be 0 if both bits are 0 or both bits are 1; otherwise it is 1. XOR is commutative and associative.
This parity computation can be done in O(log n)
by exploiting the tip above. Let’s say you have 10010101
. The parity of this is the same as the parity of 1001
xor 0101
= 1100
which is then XORed again as 11 ^ 00
= 11
and 1 xor 1
= 0
. Isn’t that cool? You can keep XORing half the word size with itself to ultimately arrive at the solution of:


Time complexity: O(logn)
The last step is to extract the last bit of the result by using a mask of 1.