Online Book Reader

Home Category

Beautiful Code [148]

By Root 5140 0
63 instructions plus the loop-control overhead. I leave it to the reader to figure out why this works.

The Quest for an Accelerated Population Count > Sum and Difference of Population Counts of Two Words

10.4. Sum and Difference of Population Counts of Two Words

To compute pop(x) + pop(y) (if your computer does not have the population count instruction), some time can be saved by using the first two executable lines of Example 10-1 on x and y separately, and then adding x and y and executing the last three stages of the algorithm on the sum. After the first two lines of Example 10-1 are executed, x and y consist of eight 4-bit fields, each containing a maximum value of 4. Thus x and y may safely be added, because the maximum value in any 4-bit field of the sum would be 8, so no overflow occurs. (In fact, three words may be combined in this way.)

This idea also applies to subtraction. To compute pop(x) – pop(y), use: [§]

[§]y denotes the one's-complement of y, which in C is written ~y.

pop(x) – pop(y) = pop(x) – (32 – pop(y))

= pop(x) + pop(y) – 32

Then, use the technique just described to compute pop(x) + pop(y). The code is shown in Example 10-2. It uses 32 instructions, versus 43 for two applications of the code of Example 10-1 followed by a subtraction.

Example 10-2. Computing pop(x) – pop(y)

int popDiff(unsigned x, unsigned y) {

x = x - ((x >> 1) & 0x55555555);

x = (x & 0x33333333) + ((x >> 2) & 0x33333333);

y = ~y;

y = y - ((y >> 1) & 0x55555555);

y = (y & 0x33333333) + ((y >> 2) & 0x33333333);

x = x + y;

x = (x & 0x0F0F0F0F) + ((x >> 4) & 0x0F0F0F0F);

x = x + (x >> 8);

x = x + (x >> 16);

return (x & 0x0000007F) - 32;

}

The Quest for an Accelerated Population Count > Comparing the Population Counts of Two Words

10.5. Comparing the Population Counts of Two Words

Sometimes one wants to know which of two words has the larger population count, without regard to the actual counts. Can this be determined without doing a population count of the two words? Computing the difference of two population counts, as in Example 10-2, and comparing the result to 0, is one way, but there is another way that is preferable if either the population counts are expected to be low, or if there is a strong correlation between the particular bits that are set in the two words.

The idea is to clear a single bit in each word until one of the words is all zero; the other word then has the larger population count. The process runs faster in its worst and average cases if the bits that are 1 at the same positions in each word are first cleared. The code is shown in Example 10-3. The procedure returns a negative number if pop(x) < pop(y), 0 if pop(x) = pop(y), and a positive number (1) if pop(x) > pop(y).

Example 10-3. Comparing pop(x) with pop(y)

int popCmpr(unsigned xp, unsigned yp) {

unsigned x, y;

x = xp & ~yp; // Clear bits where

y = yp & ~xp; // both are 1.

while (1){

if (x == 0) return y | -y;

if (y == 0) return 1;

x = x & (x - 1); // Clear one bit

y = y & (y - 1); // from each.

}

}

After clearing the common 1-bits in each 32-bit word, the maximum possible number of 1-bits in both words together is 32. Therefore the word with the smaller number of 1-bits can have at most 16, and the loop in Example 10-3 is executed a maximum of 16 times, which gives a worst case of 119 instructions executed on a basic RISC (16 x 7 + 7). A simulation using uniformly distributed random 32-bit numbers showed that the average population count of the word with the smaller population count is approximately 6.186, after clearing the common 1-bits. This gives an average execution time of about 50 instructions when executed on random 32-bit inputs, not as good as using Example 10-2. For this procedure to beat that of Example 10-2, the number of 1-bits in either x or y, after clearing the common 1-bits, would have to be three or less.

The Quest for an Accelerated Population Count > Counting the 1-Bits in an Array

10.6. Counting the 1-Bits in an Array

The simplest way to count the number of 1-bits

Return Main Page Previous Page Next Page

®Online Book Reader