Beautiful Code [16]
The general lesson that I take away from this bug is humility: It is hard to write even the smallest piece of code correctly, and our whole world runs on big, complex pieces of code.
Here is the implementation of binary search I want to test. In theory, the fix to the way the mid is calculated should resolve the final bug in a pesky piece of code that has eluded some of the best programmers for a few decades:
public static int binarySearch(int[] a, int target) {
int low = 0;
int high = a.length - 1;
while (low <= high) {
int mid = (low + high) >>> 1;
int midVal = a[mid];
if (midVal < target)
low = mid + 1;
else if (midVal > target)
high = mid - 1;
else
return mid;
}
return -1;
}
This version of binarySearch looks right, but there might still be problems with it. Perhaps not bugs, but things that can and should be changed. The changes will make the code not only more robust, but more readable, maintainable, and testable. Let's see whether we can discover some interesting and unexpected opportunities for improvement as we test it.
8. On-the-Fly Code Generation for Image Processing
Charles Petzold
Among the pearls of wisdom and wackiness chronicled in Steven Levy's classic history Hackers: Heroes of the Computer Revolution (Doubleday), my favorite is this one by Bill Gosper, who once said, "Data is just a dumb kind of programming." The corollary, of course, is that code is just a smart kind of data—data designed to trigger processors into performing certain useful or amusing acts.
The potential interplay of code and data tends to be discouraged in most conventional programming instruction. Code and data are usually severely segregated; even in object-oriented programming, code and data have their own special roles to play. Any intermingling of the two—such as data being executed as if it were machine code—is considered to be a violation of natural law.
Only occasionally is this barrier between code and data breached. Compiler authors write programs that read source code and generate machine code, but compilers do not really violate the separation of code and data. Where the input and output are code to the human programmers, they are just data to the compilers. Other odd jobs, such as those performed by disassemblers or simulators, also read machine code as if it were data.
As we all accept rationally, if not emotionally, code and data are ultimately just bytes, and there are only 256 of them in the entire universe. It's not the bytes themselves but their ordering that gives them meaning and purpose.
In some special cases, it can be advantageous for programs that are not compilers to generate code while they're running. This on-the-fly code generation is not easy, so it's usually restricted to very particular circumstances.
Throughout this chapter, we will use an example that embodies the most common reason for using on-the-fly code generation. In this example, a time-critical subroutine must perform many repetitive operations. A number of generalized parameters come into play during the execution of these repetitive operations, and the subroutine could run a lot faster if we replaced those generalized parameters with specific values. We can't replace those parameters while we're writing the subroutine because the parameters aren't known until runtime, and they can change from one invocation to the next. However, the subroutine itself could do the code generation while it's running. In other words, the subroutine can examine its own parameters at runtime, generate more efficient code, and then execute the resulting code.
I first stumbled upon this technique while writing assembly language. I had a subroutine that performed many repetitive operations. At a crucial point, the subroutine would execute either a bitwise AND operation or a bitwise OR operation, depending on some other value that remained constant during these operations.