Beautiful Code
Considering the gains that could be achieved by reducing code size, I finally asked myself a third variant of the question that began this chapter. "What is the most beautiful code that you never wrote?" How was I able to accomplish a great deal with very little? The answer was once again related to Quicksort, specifically, the analysis of its performance. The next section tells that tale.
3.2. More and More with Less and Less
Quicksort is an elegant algorithm that lends itself to subtle analysis. Around 1980, I had a wonderful discussion with Tony Hoare about the history of his algorithm. He told me that when he first developed Quicksort, he thought it was too simple to publish, and only wrote his classic "Quicksort" paper after he was able to analyze its expected runtime.
It is easy to see that in the worst case, Quicksort might take about n² time to sort an array of n elements. In the best case, it chooses the median value as a partitioning element, and therefore sorts an array in about n lg n comparisons. So, how many comparisons does it use on the average for a random array of n distinct values?
Hoare's analysis of this question is beautiful, but unfortunately over the mathematical heads of many programmers. When I taught Quicksort to undergraduates, I was frustrated that many just didn't "get" the proof, even after sincere effort. We'll now attack that problem experimentally. We'll start with Hoare's program, and eventually end up with an analysis close to his.
Our task is to modify Example 3-1 of the randomizing Quicksort code to analyze the average number of comparisons used to sort an array of distinct inputs. We will also attempt to gain maximum insight with minimal code, runtime, and space.
To determine the average number of comparisons, we first augment the program to count them. To do this, we increment the variable comps before the comparison in the inner loop (Example 3-2).
Example 3-2. Quicksort inner loop instrumented to count comparisons
for (i = l+1; i <= u; i++) {
comps++;
if (x[i] < x[l])
swap(++m, i);
}
If we run the program for one value of n, we'll see how many comparisons that particular run takes. If we repeat that for many runs over many values of n, and analyze the results statistically, we'll observe that, on average, Quicksort takes about 1.4 n lg n comparisons to sort n elements.
That isn't a bad way to gain insight into the behavior of a program. Thirteen lines of code and a few experiments can reveal a lot. A famous quote attributed to writers such as Blaise Pascal and T. S. Eliot states that, "If I had more time, I would have written you a shorter letter." We have the time, so let's experiment with the code to attempt to create a shorter (and better) program.
We'll play the game of speeding up that experiment, trying to increase statistical accuracy and programming insight. Because the inner loop always makes precisely u-l comparisons, we can make the program a tiny bit faster by counting those comparisons in a single operation outside the loop. This change yields the Quicksort shown in Example 3-3.
Example 3-3. Quicksort inner loop with increment moved out of loop
comps += u-l;
for (i = l+1; i <= u; i++)