The Information - James Gleick [158]
Claude Shannon, having moved from Bell Labs to MIT, reanalyzed the Turing machine in 1956. He stripped it down to the smallest possible skeleton, proving that the universal computer could be constructed with just two internal states, or with just two symbols, 0 and 1, or blank and nonblank. He wrote his proof in words more pragmatic than mathematical: he described exactly how the two-state Turing machine would step left and right, “bouncing” back and forth to keep track of the larger numbers of states in a more complex computer. It was all very intricate and specific, redolent of Babbage. For example:
When the reading head moves, the state information must be transferred to the next cell of the tape to be visited using only two internal states in machine B. If the next state in machine A is to be (say) state 17 (according to some arbitrary numbering system) this is transferred in machine B by “bouncing” the reading head back and forth between the old cell and the new one 17 times (actually 18 trips to the new cell and 17 back to the old one).
The “bouncing operation” carries the information from cell to cell, and the cells act as “transmitters” and “controllers.”
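The counting at the heart of the bouncing operation can be mimicked in a few lines. This is only an illustrative sketch of the tally idea, not Shannon's actual two-state construction; the function name and structure are mine:

```python
def transfer_state(target_state):
    """Sketch of the 'bouncing' tally: the two-state machine counts off
    the simulated machine's state number by shuttling the reading head
    between the old cell and the new one."""
    trips_to_new, trips_to_old = 0, 0
    for _ in range(target_state):
        trips_to_new += 1   # bounce forward to the new cell
        trips_to_old += 1   # bounce back to the old cell
    trips_to_new += 1       # final move: the head settles on the new cell
    return trips_to_new, trips_to_old

# Transferring state 17 takes 18 trips to the new cell and 17 back.
print(transfer_state(17))  # (18, 17)
```

The asymmetry in the counts simply reflects that the head ends its journey on the new cell rather than the old one.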
Turing had titled his great paper “On Computable Numbers,” but of course the real focus was on uncomputable numbers. Could uncomputable numbers and random numbers be related? In 1965 Chaitin was an undergraduate at the City College of New York, writing up a discovery he hoped to submit to a journal; it would be his first publication. He began, “In this paper the Turing machine is regarded as a general purpose computer and some practical questions are asked about programming it.” Chaitin, as a high-school student in the Columbia Science Honors Program, had the opportunity to practice programming in machine language on giant IBM mainframes, using decks of punched cards—one card for each line of a program. He would leave his card deck in the computer center and come back the next day for the program’s output. He could run Turing machines in his head, too: write 0, write 1, write blank, shift tape left, shift tape right.… The universal computer gave him a nice way to distinguish between numbers like Alice and Bob’s A and B. He could write a program to make a Turing machine print out “010101 …” a million times, and he could write down the length of that program—quite short. But given a million random digits—no pattern, no regularity, nothing special at all—there could be no shortcut. The computer program would have to incorporate the entire number. To make the IBM mainframe print out those million digits, he would have to put the whole million digits into the punched cards. To make the Turing machine do it, he would still need the million digits for input.
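Chaitin's distinction can be made concrete by comparing the sizes of the programs themselves. A sketch, assuming we take a literal one-line `print` statement as the "program" (the framing and variable names here are mine, not Chaitin's):

```python
import random

# A patterned string has a short description: this one-line program.
patterned_program = 'print("010101" * 1_000_000)'   # under 30 characters

# A million random digits admit no such shortcut: any program that prints
# them must, in effect, contain the whole number -- like Chaitin's punched
# cards carrying every digit.
rng = random.Random()
million_digits = "".join(rng.choice("0123456789") for _ in range(1_000_000))
literal_program = f'print("{million_digits}")'

print(len(patterned_program))   # a couple of dozen characters
print(len(literal_program))     # a little over a million
```

The patterned string's program stays the same size no matter how many repetitions it prints; the random string's program grows in lockstep with the string itself.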
Here is another number (in decimal this time):
C: 3.1415926535897932384626433832795028841971693993751…
This looks random. Statistically, each digit appears with the expected frequency (one in ten); likewise each pair of digits (one in a hundred), each triplet, and so on. A statistician would say it appears to be “normal,” as far as anyone can tell. The next digit is always a surprise. The works of Shakespeare will be in there, eventually. But someone might recognize this as a familiar number, π. So it is not random after all.
But why do we say π is not random? Chaitin proposed a clear answer: a number is not random if it is computable—if a definable computer program will generate it. Thus computability is a measure of randomness.
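In this sense π is computable because one short, fixed program generates its digits without end. Here is such a program, a sketch using Gibbons' unbounded spigot algorithm (a standard published method, chosen for illustration; it is not mentioned in the text):

```python
from itertools import islice

def pi_digits():
    """Yield the decimal digits of pi one at a time
    (Gibbons' unbounded spigot algorithm)."""
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4*q + r - t < n*t:
            yield n  # the next digit is settled; emit it
            q, r, n = 10*q, 10*(r - n*t), (10*(3*q + r)) // t - 10*n
        else:
            q, r, t, k, n, l = (q*k, (2*q + r)*l, t*l, k + 1,
                                (q*(7*k + 2) + r*l) // (t*l), l + 2)

print(list(islice(pi_digits(), 15)))  # [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9]
```

A dozen lines suffice for a billion digits or a trillion; the program's length never grows, which is exactly why π fails Chaitin's test for randomness.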
For Turing, computability was a yes-or-no quality—a given number either is or is not. But we would like to say that some numbers are more random than others—they are less patterned, less orderly. Chaitin said the patterns and the order express computability. Algorithms generate patterns. So we can gauge computability by looking at the size of the algorithm. Given a number—represented as a string of any length—we ask, what is the length of the shortest program that will generate it? Using the language of a Turing machine, that question can have a definite answer.