Beautiful Code [313]
Again, this was a task that turned out to be achievable by delta debugging. However, we did not get there directly. We wanted to make a detour via program states—that is, the set of all program variables and their values. In this set, we wanted to determine failure causes automatically, as in "At the call to shell_sort(), variable size causes the failure." How would that work?
Let us recapitulate what we had done so far. We had done delta debugging on program versions—one that worked and one that failed—and isolated the minimal difference that caused the failure. We had done delta debugging on program inputs—again, one that worked and one that failed—and isolated minimal differences that caused the failure. If we applied delta debugging on program states, we would take one program state from a working run, and one program state from a failing run, and eventually obtain the minimal difference that caused the failure.
Now, there are three problems in here. Problem number one: How does one obtain a program state? Eventually, I would instrument the gdb debugger to query all named variables first, and then unfold all data structures. If I encountered an array or a structure, I would query its elements; if I found a pointer, I would query the variable it pointed to, and so on—until I reached a fix point, or the set of all accessible variables. This program state would be represented as a graph of variables (vertices) and references (edges), as shown in Figure 28-2, abstracting away concrete memory addresses.
Figure 28-2. A program state of the GNU compiler
Next problem: How does one compare two program states? This was rather easy: there are known algorithms for computing common subgraphs between two graphs—and anything that would not be part of the subgraph became a difference. With such an algorithm properly implemented, we now could actually extract and determine the difference between two program states.
Third and last problem: How does one apply differences between states? This was a real challenge, as it involved not only observing but actually manipulating program states. To apply a difference in program state, we would have to set variables to new values, but also to replicate entire complex data structures, including allocating and deleting elements. Once this was done, we could do something quite fun; we could arbitrarily transfer program states between running processes. And not only entire program states, but also partial pro-gram states—ranging from small changes to a single variable to large changes of, say, half of a symbol table.
This idea of transferring program states while the program is executing is something that one needs time getting used to. I remember one of my first presentations at IBM where I explained the algorithm, its application on states, and came to the ultimate example: "We now have 879 differences between these two states. We now let delta debugging narrow down the failure cause. To this end, the algorithm takes half of the differences, that is, 439 state differences, and applies them. This means that in the passing run, 439 variables are now set to the values found in the failing run…."
At this moment, a fellow from the audience stepped up and said: "But doesn't this sound like a very insane thing to do?"
Of course, he was right. Nothing meaningful came out of setting 439 variables to values found in another run; nor did it help setting the other 440 variables. But this is just the situation in which delta debugging comes up with the idea of making smaller changes—that is, it tries 220 variables, 110, and so on. Eventually, it isolates the variable that caused the failure: "The compiler crash was caused by a loop in the abstract syntax tree." And this end, of course, justifies the means—in particular, for the people at IBM, who were all pretty busy developing (and debugging) compilers.
Thus, the demonstration that it worked helped people forget it was a pretty weird approach. Still, my first publication on the topic had a hard time getting accepted.