Beautiful Code [119]
I wasn't willing to turn off verification completely, despite requests to do so. However, it was obvious that the verification process had to be sped up. The approach I took is an old optimization classic: table lookup. In a table lookup, you create a table that contains all the answers for all the known inputs. When given any input, the compiler can simply look up the answer in the table, without having to perform a calculation. This is an O(1) operation, and its performance speed is close to the theoretical maximum. Of course, the devil is in the details.
The simplest way to implement table lookup in Java is with a switch statement. javac com-piles this statement into a table of values stored in the byte code. Depending on the switch statement cases, the compiler creates one of two byte code instructions. If the cases are contiguous (e.g., 72–189 without skipping any values in between) the compiler uses a more efficient tableswitch. However, if any values are skipped, the compiler uses the more indirect and less efficient lookupswitch instruction instead.
* * *
Note: This behavior isn't absolutely guaranteed, and may perhaps not even be true in more recent virtual machines (VMs), but it certainly was true in the generation of VMs I tested and inspected.
* * *
For small tables (a few hundred cases or less), it was possible to fill in the intermediate values with the default value. For instance, a simple test can determine whether a character is a hexadecimal digit (Example 5-7). The test starts with the lowest possible true value, '0', and finishes with the highest possible true value, 'f'. Every character between 0 and f must be included as a case.
Example 5-7. switch statement character verification
Code View: Scroll / Show All
private static boolean isHexDigit(char c) {
switch(c) {
case '0': return true;
case '1': return true;
case '2': return true;
case '3': return true;
case '4': return true;
case '5': return true;
case '6': return true;
case '7': return true;
case '8': return true;
case '9': return true;
case ':': return false;
case ';': return false;
case '<': return false;
case '=': return false;
case '>': return false;
case '?': return false;
case '@': return true;
case 'A': return true;
case 'B': return true;
case 'C': return true;
case 'D': return true;
case 'E': return true;
case 'F': return true;
case 'G': return false;
case 'H': return false;
case 'I': return false;
case 'J': return false;
case 'K': return false;
case 'L': return false;
case 'M': return false;
case 'N': return false;
case 'O': return false;
case 'P': return false;
case 'Q': return false;
case 'R': return false;
case 'S': return false;
case 'T': return false;
case 'U': return false;
case 'V': return false;
case 'W': return false;
case 'X': return false;
case 'Y': return false;
case 'Z': return false;
case '[': return false;
case '\\': return false;
case ']': return false;
case '^': return false;
case '_': return false;
case '`': return false;
case 'a': return true;
case 'b': return true;
case 'c': return true;
case 'd': return true;
case 'e': return true;
case 'f': return true;
}
return false;
}
This is long but shallow. It is not complex. It is easy to see what's happening here, and that's good. However, although switch statements are shallow, they do run into problems for larger groups of cases. For instance, XML character verification checks tens of thousands of cases. I tried writing a switch statement to handle these larger groups and discovered that Java imposes a 64K maximum size on the byte code of a method. This situation required an alternate solution.
Although the compiler and runtime limited the size of the lookup table stored in the byte code, there were other places I could hide