Online Book Reader

Home Category

The Information - James Gleick [203]

By Root 938 0
groupthink—all potentially magnified by network effects and studied under the rubric of information cascades. Collective judgment has appealing possibilities; collective self-deception and collective evil have already left a cataclysmic record. But knowledge in the network is different from group decision making based on copying and parroting. It seems to develop by accretion; it can give full weight to quirks and exceptions; the challenge is to recognize it and gain access to it. In 2008, Google created an early warning system for regional flu trends based on data no firmer than the incidence of Web searches for the word flu; the system apparently discovered outbreaks a week sooner than the Centers for Disease Control and Prevention. This was Google’s way: it approached classic hard problems of artificial intelligence—machine translation and voice recognition—not with human experts, not with dictionaries and linguists, but with its voracious data mining of trillions of words in more than three hundred languages. For that matter, its initial approach to searching the Internet relied on the harnessing of collective knowledge.

Here is how the state of search looked in 1994. Nicholson Baker—in a later decade a Wikipedia obsessive; back then the world’s leading advocate for the preservation of card catalogues, old newspapers, and other apparently obsolete paper—sat at a terminal in a University of California library and typed, BROWSE SU[BJECT] CENSORSHIP.♦ He received an error message,

LONG SEARCH: Your search consists of one or more very common words, which will retrieve over 800 headings and take a long time to complete,

and a knuckle rapping:

Long searches slow the system down for everyone on the catalog and often do not produce useful results. Please type HELP or see a reference librarian for assistance.

All too typical. Baker mastered the syntax needed for Boolean searches with complexes of ANDs and ORs and NOTs, to little avail. He cited research on screen fatigue and search failure and information overload and admired a theory that electronic catalogues were “in effect, conducting a program of ‘aversive operant conditioning’ ” against online search.

Here is how the state of search looked two years later, in 1996. The volume of Internet traffic had grown by a factor of ten each year, from 20 terabytes a month worldwide in 1994 to 200 terabytes a month in 1995, to 2 petabytes in 1996. Software engineers at the Digital Equipment Corporation’s research laboratory in Palo Alto, California, had just opened to the public a new kind of search engine, named AltaVista, continually building and revising an index to every page it could find on the Internet—at that point, tens of millions of them. A search for the phrase truth universally acknowledged and the name Darcy produced four thousand matches. Among them:

The complete if not reliable text of Pride and Prejudice, in several versions, stored on computers in Japan, Sweden, and elsewhere, downloadable free or, in one case, for a fee of $2.25.

More than one hundred answers to the question, “Why did the chicken cross the road?” including “Jane Austen: Because it is a truth universally acknowledged that a single chicken, being possessed of a good fortune and presented with a good road, must be desirous of crossing.”

The statement of purpose of the Princeton Pacific Asia Review: “The strategic importance of the Asia Pacific is a truth universally acknowledged …”

An article about barbecue from the Vegetarian Society UK: “It is a truth universally acknowledged among meat-eaters that …”

The home page of Kevin Darcy, Ireland. The home page of Darcy Cremer, Wisconsin. The home page and boating pictures of Darcy Morse. The vital statistics of Tim Darcy, Australian footballer. The résumé of Darcy Hughes, a fourteen-year-old yard worker and babysitter in British Columbia.

Trivia did not daunt the compilers of this ever-evolving index. They were acutely aware of the difference between making a library catalogue—its target fixed, known, and finite—and searching a world

Return Main Page Previous Page Next Page

®Online Book Reader