Complexity_ A Guided Tour - Melanie Mitchell [56]
This simple-sounding problem turns out to have echoes in the work of Kurt Gödel and Alan Turing, which I described in chapter 4. The solution also contains an essential means by which biological systems themselves get around the infinite regress. The solution was originally found, in the context of a more complicated problem, by the twentieth-century Hungarian mathematician John von Neumann.
Von Neumann was a pioneer in fields ranging from quantum mechanics to economics and a designer of one of the earliest electronic computers. His design consisted of a central processing unit that communicates with a random access memory in which both programs and data can be stored. It remains the basic design of all standard computers today. Von Neumann was also one of the first scientists who thought deeply about connections between computation and biology. He dedicated the last years of his life to solving the problem of how a machine might be able to reproduce itself; his solution was the first complete design for a self-reproducing machine. The self-copying computer program I will show you was inspired by his “self-reproducing automaton” and illustrates its fundamental principle in a simplified way.
FIGURE 8.1. A simplified picture of computer memory, with numbered locations 1–5 and beyond, four of which contain lines of a program. The instruction pointer points to the instruction currently being executed by the computer. The lines sometimes contain leading spaces, which are ignored when the instruction is executed.
Before showing you the self-copying program, I need to explain a few more things about the programming language I will be using.
Consider the picture of computer memory given in figure 8.1. Computer memory, in our highly simplified case, consists of numbered locations or “addresses,” here numbered 1–5 and beyond. Each location contains some text. These lines of text can be interpreted by the computer as commands in a program or as data to be used by a program. The program currently stored in memory, when executed, will print
Hello, world!
Goodbye.
To accomplish this, the computer has an “instruction pointer”—a number also stored in memory, which always is equal to the memory location of the instruction currently being executed by the computer. The instruction pointer—let’s call it ip for short—is initially set to the memory location containing the program’s first line. We say it “points to” that instruction. At each step in the computation the instruction pointed to by ip is carried out, and ip is increased by 1.
For example, in figure 8.1, the value of ip is 2, and we say that ip is pointing to the line print("Hello, world!").
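The fetch-and-advance cycle just described can be sketched in Python (the book's language is an informal pseudocode). Modeling memory as a dictionary from addresses to text is my own choice, and the memory contents below are an illustrative stand-in, not the four program lines of figure 8.1; I place the program's first line at location 2 so that ip = 2 points to print("Hello, world!"), consistent with the figure.

```python
# Sketch of the instruction-pointer cycle: fetch the line ip points to,
# then increase ip by 1. Memory is a dictionary from numbered
# locations ("addresses") to lines of text.
line = {
    2: 'print("Hello, world!")',
    3: 'print("Goodbye.")',
}

ip = 2                          # points at the program's first line
while ip in line:               # stop when ip points past the program
    instruction = line[ip]      # the instruction currently being executed
    print("step: executing", instruction)
    ip = ip + 1                 # advance the instruction pointer by 1
```

Running this prints one "step: executing …" line per instruction, mirroring the hand-simulation the chapter walks through.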
We call ip a variable since its value changes (“varies”) as the computation is carried out.
We also can define another variable, line[n], as equal to the string of characters in memory location n. For example, the command print(line[2]) will print
print("Hello, world!")
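The distinction here is the crucial one for self-copying: print(line[2]) treats memory location 2 as data, printing the text of the instruction rather than executing it. A two-line Python sketch of the same idea (the dictionary representation of memory is again my assumption):

```python
# line[n] read as data: this prints the TEXT stored at location 2,
# not the result of executing that text.
line = {2: 'print("Hello, world!")'}
print(line[2])
```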
Finally, our language contains a loop command. For example, the following lines of computer code,
x = 0
loop until x = 4
  {
  print("Hello, world!")
  x = x + 1
  }
will print
Hello, world!
Hello, world!
Hello, world!
Hello, world!
The commands inside the two curly brackets get repeated until the loop condition (here x = 4) is true. The variable x is used as a counter—it starts off at zero and is increased by 1 each time the loop is performed. When it gets to 4, the loop stops.
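The counter loop above can be rendered as an equivalent Python while loop. One reading of the semantics is assumed here: the condition "x = 4" is tested before each pass, so the body runs exactly four times.

```python
# The book's "loop until x = 4" as a Python while loop.
x = 0
count = 0                # extra counter, just to confirm four passes
while x != 4:            # repeat until the loop condition x = 4 is true
    print("Hello, world!")
    count = count + 1
    x = x + 1            # x is increased by 1 each time through the loop
```

When the loop exits, x equals 4 and "Hello, world!" has been printed four times.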
Now we are ready for the self-copying program, which appears in its entirety in figure 8.2. The best way to understand a computer program is to hand-simulate it; that is, to go through it line by line keeping track of what it does.
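Hand-simulation can itself be mechanized. Below is a toy interpreter in Python for the fetch-execute cycle described above, handling only print("…") instructions; it is a sketch under my own simplifications, and the selfcopy program of figure 8.2 is not reproduced here. The two-line "Hello/Goodbye" program stands in for it.

```python
# A toy interpreter for the simplified machine: memory is a dictionary
# of numbered locations, and execution repeatedly fetches the line the
# instruction pointer indicates, runs it, and advances the pointer.
def run(line, ip):
    output = []
    while ip in line:
        instruction = line[ip].strip()        # leading spaces are ignored
        if instruction.startswith('print("') and instruction.endswith('")'):
            output.append(instruction[7:-2])  # the text between the quotes
        ip = ip + 1                           # advance the instruction pointer
    return output

memory = {1: 'print("Hello, world!")', 2: '  print("Goodbye.")'}
print(run(memory, 1))   # ['Hello, world!', 'Goodbye.']
```

Note that the interpreter also ignores leading spaces, as the caption of figure 8.1 specifies; that detail turns out to matter for the self-copying program.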
Suppose this program has been loaded into computer memory as shown in figure 8.2, and suppose a user comes along and types selfcopy on the computer’s command line. This signals the computer to start interpreting the program called selfcopy. The interpreter—part of the computer