Reinventing Discovery - Michael Nielsen [75]
But there’s a problem with this neat story. Just because we know the DNA sequence for a protein doesn’t mean we can easily predict what shape the protein has, or what the protein will do. In fact, today we have only a very incomplete understanding of how proteins fold. Complete structures—the exact shapes—are known for only 60,000 proteins, despite the fact that we know the DNA sequences for millions of proteins. Most of those complete structures have been found using a technique called X-ray diffraction—basically, shining X-rays at a protein and figuring out its shape by looking carefully at the X-ray shadow it casts. It’s slow, expensive, painstaking work, and the techniques are only gradually getting better. What we’d really like is a fast and reliable way to predict the shape from the genetic description. If we could do that, cutting out the slow and expensive X-ray diffraction step, we’d go from knowing the shape of 60,000 proteins to knowing the shape of millions. Even more significantly, such a method would be a tremendously powerful tool for helping us design proteins with desired shapes. This would, for instance, help us engineer new antibodies to fight disease.
To solve the protein folding problem, biochemists have turned to computers in an attempt to predict protein shape from the genetic description. To make their predictions they use the idea that a protein will eventually fold into its lowest energy shape, much as a ball will roll to the bottom of a valley between two hills. All that’s needed is a good method for finding the lowest energy shape of a protein. This sounds promising, but in practice it’s hard to search through all the possible shapes, looking for the shape with the lowest energy. The difficulty is the number of different shapes a protein can potentially fold into. Proteins typically have hundreds or even thousands of amino acids. To determine the structure means knowing the exact position and orientation of every single one of those amino acids. With so many amino acids involved, the number of possible shapes is astronomical, far too many to search through even on a very powerful computer. Enormous effort has been put into finding clever algorithms that can be used to restrict the number of configurations that must be examined, and the algorithms are getting pretty good. But there’s still a long way to go before we can use computers to reliably predict protein shapes.
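To get a feel for why the search is so hard, here is a minimal sketch in Python of a toy model, not a real folding calculation. It assumes, purely for illustration, that each amino acid can sit in one of three states, and it uses a made-up energy function (`toy_energy`) that has nothing to do with real protein physics. The names `count_shapes` and `lowest_energy_shape` are likewise hypothetical.

```python
from itertools import product

# Assumption for illustration only: each amino acid can take one of
# three local states. Real proteins have far more freedom than this.
STATES_PER_RESIDUE = 3


def count_shapes(n_residues, states=STATES_PER_RESIDUE):
    """Number of distinct shapes in the toy model: states ** n_residues."""
    return states ** n_residues


def toy_energy(shape):
    """Made-up energy, not real physics.

    Each residue contributes its state value, and every adjacent pair
    of residues in different states adds 1.
    """
    mismatches = sum(1 for a, b in zip(shape, shape[1:]) if a != b)
    return sum(shape) + mismatches


def lowest_energy_shape(n_residues, states=STATES_PER_RESIDUE):
    """Try every possible shape and return the one with the lowest energy."""
    all_shapes = product(range(states), repeat=n_residues)
    return min(all_shapes, key=toy_energy)


if __name__ == "__main__":
    # A chain of 8 residues is small enough to search exhaustively.
    print(count_shapes(8))
    print(lowest_energy_shape(8))
    # A chain of 300 residues is not: the count has more than 140 digits.
    print(len(str(count_shapes(300))))
```

Even in this stripped-down model, an 8-residue chain has 6,561 shapes, which a laptop can check in a moment, while a 300-residue chain has more shapes than could ever be enumerated. Trying every shape is hopeless at realistic sizes, which is why so much effort goes into algorithms that cut down the number of configurations that must be examined.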
In 2007, a biochemist named David Baker and a computer graphics researcher named Zoran Popovic, both from the University of Washington, in Seattle, had an idea for a better way of solving the problem. Baker and Popovic’s idea was to create a computer game that shows a protein to the player, and gives them controls to change the shape, rotating the protein, moving amino acids around, and so on. Some of the controls built into the game are similar to the tools used by professional biochemists. The lower the energy of the shape the player comes up with, the higher their score, and so the highest scoring shapes are good candidates for the real shape of the protein. Baker and Popovic hoped that this might be a better approach to protein folding than the conventional approaches, combining state-of-the-art computational techniques with computer gamers’ persistence and abilities at pattern matching and 3-D problem solving.
I was skeptical when I first heard about