The statisticians trounced the experts. But the statistically trained machines they built, whether they were translating from Chinese or analyzing the ads that a Web surfer clicked, didn’t know anything. In that sense, they were like their question-answering cousins, the forerunners of the yet-to-be-conceived Jeopardy machine. They had no response to different types of questions, ones they weren’t programmed to answer. They were incapable of reasoning, much less coming up with ideas.
Machines were seemingly boxed in. When people taught them about the world, as in the Halo project, the process was too slow and expensive and the machines ended up “overfitted”—locked into single interpretations of facts and relationships. Yet when machines learned for themselves, they turned everything into statistics and remained, in their essence, ignorant.
How could computers get smarter about the world? Tom Mitchell, a computer science professor at Carnegie Mellon, had an idea. He would develop a system that, just like millions of other students, would learn by reading. As it read, it would map all the knowledge it could make sense of. It would learn that Buenos Aires appeared to be a city, and a capital too, and for that matter also a province, that it fit inside Argentina, which was a country, a South American country. The computer would perform the same analysis for billions of other entities. It would read twenty-four hours a day, seven days a week. It would be a perpetual reading machine, and by extracting information, it would slowly cobble together a network of knowledge: every president, continent, baseball team, volcano, endangered species, crime. Its curriculum was the World Wide Web.
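The knowledge network Mitchell envisioned can be pictured in a few lines of code. The sketch below is purely illustrative (the class and field names are my own, not NELL's actual internals): each entity carries a set of candidate categories, and relations link entities together, just as Buenos Aires is tagged a city, a capital, and a province, and placed inside Argentina.

```python
# Illustrative sketch of a reading machine's knowledge network.
# Names and structure are invented for this example, not taken from NELL.
from collections import defaultdict

class KnowledgeNetwork:
    def __init__(self):
        # entity -> set of candidate categories ("city", "capital", ...)
        self.categories = defaultdict(set)
        # (relation, subject) -> set of related entities
        self.relations = defaultdict(set)

    def add_category(self, entity, category):
        self.categories[entity].add(category)

    def add_relation(self, subject, relation, obj):
        self.relations[(relation, subject)].add(obj)

# Buenos Aires as described above: a city, a capital, and a province,
# located inside Argentina, a South American country.
kb = KnowledgeNetwork()
for cat in ("city", "capital", "province"):
    kb.add_category("Buenos Aires", cat)
kb.add_category("Argentina", "country")
kb.add_category("Argentina", "South American country")
kb.add_relation("Buenos Aires", "located_in", "Argentina")

print(sorted(kb.categories["Buenos Aires"]))  # ['capital', 'city', 'province']
```

Multiplied across billions of entities, a structure like this is the "network of knowledge" a perpetual reading machine would slowly cobble together.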
Mitchell’s goal was not to build a smart computer but to construct a body of knowledge—a corpus—that smart computers everywhere could turn to as a reference. This computer, he hoped, would be doing on a global scale what the human experts in chemistry had done, at considerable cost, for the Halo system. Like Watson, Mitchell’s Read-the-Web computer, later called NELL, would feature a broad range of analytical tools, each one making sense of the readings from its own perspective. Some would compare word groups, others would parse the grammar. “Learning method A might decide, with 80 percent probability, that Pittsburgh is a city,” Mitchell said. “Method C believes that Luke Ravenstahl is the mayor of Pittsburgh.” As the system processed these two beliefs, it would find them consistent and mutually reinforcing. If the entity called Pittsburgh had a mayor, there was a good chance it was a city. Confidence in that belief would rise. The computer would learn.
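The reinforcement Mitchell describes can be sketched numerically. In this toy example (the update rule and its numbers are my own invention, not NELL's algorithm), Method A's 80 percent belief that Pittsburgh is a city is nudged upward by Method C's belief that Pittsburgh has a mayor, on the assumption that having a mayor is strong evidence of being a city.

```python
# Toy model of mutually reinforcing beliefs. The update rule is an
# illustrative assumption, not NELL's actual confidence calculation.

def reinforce(p_belief, p_evidence, strength=0.9):
    """Raise p_belief toward certainty in proportion to how strongly
    the supporting evidence (held with probability p_evidence) implies it."""
    boost = p_evidence * strength
    return p_belief + (1.0 - p_belief) * boost

p_city = 0.80    # Method A: "Pittsburgh is a city"
p_mayor = 0.75   # Method C: "Luke Ravenstahl is the mayor of Pittsburgh"

p_city_updated = reinforce(p_city, p_mayor)
print(round(p_city_updated, 3))  # 0.935
```

The point of the sketch is the direction of movement, not the particular numbers: two consistent beliefs push each other's confidence up, and in that narrow sense the computer learns.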
Mitchell’s team turned on NELL in January 2010. It worked on a subsection of the Web, a cross section of two hundred million Web pages that had been culled and curated by Mitchell’s colleague Jamie Callan. (Operating with a fixed training set made it easier in the early days to diagnose troubles and carry out experiments.) Within six months, the machine had developed some four hundred thousand beliefs—a minute fraction of what it would need for a global knowledge base. But Mitchell saw NELL and other fact-hunting systems growing quickly. “Within ten years,” he predicted, “we’ll have computer programs that can read and extract 80 percent of the content of the Web, which itself will be much bigger and richer.” This, he said, would produce “a huge knowledge base that AI can work from.”
Much like Watson, however, this knowledge base would brim with beliefs, not facts. After all, statistical systems merely develop confidence in facts as a calculation of probability. They believe, to one degree or another, but are certain of nothing. Humans, by contrast, must often work from knowledge. Halo’s Friedland (who left Vulcan to set up his own shop in 2005) argues that AI systems