The Filter Bubble - Eli Pariser [78]
This phenomenon is called ambient intelligence. It’s based on a simple observation: The items you own, where you put them, and what you do with them are, after all, great signals about what kind of person you are and what kind of preferences you have. “In the near future,” writes a team of ambient intelligence experts led by David Wright, “every manufactured product—our clothes, money, appliances, the paint on our walls, the carpets on our floors, our cars, everything—will be embedded with intelligence, networks of tiny sensors and actuators, which some have termed ‘smart dust.’”
And there’s a third set of powerful signals that is getting cheaper and cheaper. In 1990, it cost about $10 to sequence a single base pair—one “letter”—of DNA. By 1999, that number had dropped to $.90. In 2004, it crossed the $.01 threshold, and now, as I write in 2010, it costs one ten-thousandth of $.01. By the time this book comes out, it’ll undoubtedly cost exponentially less. By some point mid-decade, we ought to be able to sequence any random whole human genome for less than the cost of a sandwich.
It seems like something out of Gattaca, but the allure of adding this data to our profiles will be strong. While it’s increasingly clear that our DNA doesn’t determine everything about us—other cellular information sets, hormones, and our environment play a large role—there are undoubtedly numerous correlations between genetic material and behavior to be made. It’s not just that we’ll be able to predict and avert upcoming health issues with far greater accuracy—though that alone will be enough to get many of us in the door. By adding together DNA and behavioral data—like the location information from iPhones or the text of Facebook status updates—an enterprising scientist could run statistical regression analysis on an entire society.
In all this data lie patterns yet undreamed of. Properly harnessed, it will fuel a level of filtering acuity that’s hard to imagine—a world in which nearly all of our objective experience is quantified, captured, and used to inform our environments. The biggest challenge, in fact, may be thinking of the right questions to ask of these enormous flows of binary digits. And increasingly, code will learn to ask these questions itself.
The End of Theory
In December 2010, researchers at Harvard, Google, Encyclopædia Britannica, and the American Heritage Dictionary announced the results of a four-year joint effort. The team had built a database spanning the entire contents of over five hundred years’ worth of books—5.2 million books in total, in English, French, Chinese, German, and other languages. Now any visitor to Google’s “N-Gram viewer” page can query it and watch how phrases rise and fall in popularity over time, from neologism to the long fade into obscurity. For the researchers, the tool suggested even grander possibilities—a “quantitative approach to the humanities,” in which cultural changes can be scientifically mapped and measured.
The initial findings suggest how powerful the tool can be. By looking at the references to previous dates, the team found that “humanity is forgetting its past faster with each passing year.” And, they argued, it could provide “a powerful tool for automatically identifying censorship and propaganda” by identifying countries and languages in which there was a statistically abnormal absence of certain ideas or phrases. Leon Trotsky, for example, shows up far less in midcentury Russian books than in English or French books from the same time.
The project is undoubtedly a great service to researchers and the casually curious public. But serving academia probably wasn’t Google’s only motive. Remember Larry Page’s declaration that he wanted to create a machine “that