The Filter Bubble - Eli Pariser
In some cases, algorithmic sorting based on personal data can be even more discriminatory than people would be. For example, software that helps companies sift through résumés for talent might “learn” by looking at which of its recommended employees are actually hired. If nine white candidates in a row are chosen, it might determine that the company isn’t interested in hiring black people and exclude them from future searches. “In many ways,” writes NYU sociologist Dalton Conley, “such network-based categorizations are more insidious than the hackneyed groupings based on race, class, gender, religion, or any other demographic characteristic.” Among programmers, this kind of error has a name. It’s called overfitting.
The online movie rental Web site Netflix is powered by an algorithm called CineMatch. To start, it was pretty simple. If I had rented the first movie in the Lord of the Rings trilogy, let’s say, Netflix could look up what other movies Lord of the Rings watchers had rented. If many of them had rented Star Wars, it’d be highly likely that I would want to rent it, too.
This technique is called kNN (k-nearest-neighbor), and using it CineMatch got pretty good at figuring out what movies people wanted to watch based on what movies they’d rented and how many stars (out of five) they’d given the movies they’d seen. By 2006, CineMatch could predict within one star how much a given user would like any movie from Netflix’s vast hundred-thousand-film emporium. Already CineMatch was better at making recommendations than most humans. A human video clerk would never think to suggest Silence of the Lambs to a fan of The Wizard of Oz, but CineMatch knew people who liked one usually liked the other.
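The idea behind a k-nearest-neighbor recommender can be sketched in a few lines. The following is a minimal illustration with made-up users and ratings, not Netflix's actual CineMatch (whose internals are proprietary): it finds the k users whose star ratings most resemble yours, then suggests movies they liked that you haven't rated.

```python
# Minimal k-nearest-neighbor recommender sketch.
# All user names, movies, and ratings below are hypothetical.
from math import sqrt

# Each user's star ratings (1-5) for the movies they've rated.
ratings = {
    "alice": {"LOTR": 5, "Star Wars": 5, "Oz": 2},
    "bob":   {"LOTR": 4, "Star Wars": 5},
    "carol": {"Oz": 5, "Silence of the Lambs": 4},
    "dave":  {"LOTR": 5, "Oz": 1},
}

def similarity(a, b):
    """Cosine similarity over the movies two users have both rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[m] * b[m] for m in shared)
    na = sqrt(sum(a[m] ** 2 for m in shared))
    nb = sqrt(sum(b[m] ** 2 for m in shared))
    return dot / (na * nb)

def recommend(user, k=2):
    """Score unseen movies using the k most similar users' ratings."""
    me = ratings[user]
    neighbors = sorted(
        (u for u in ratings if u != user),
        key=lambda u: similarity(me, ratings[u]),
        reverse=True,
    )[:k]
    scores = {}
    for u in neighbors:
        w = similarity(me, ratings[u])
        for movie, stars in ratings[u].items():
            if movie not in me:
                scores.setdefault(movie, []).append(w * stars)
    # Average the similarity-weighted votes; higher = stronger pick.
    return sorted(
        ((m, sum(v) / len(v)) for m, v in scores.items()),
        key=lambda t: t[1],
        reverse=True,
    )

print(recommend("dave"))
```

With only four users this is toy-sized, but the shape of the computation is the same at Netflix's scale: the accuracy of the predictions depends entirely on how much overlap exists between your rating history and your neighbors'.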
But Reed Hastings, Netflix’s CEO, wasn’t satisfied. “Right now, we’re driving the Model-T version of what’s possible,” he told a reporter in 2006. On October 2, 2006, an announcement went up on the Netflix Web site: “We’re interested, to the tune of $1 million.” Netflix had posted an enormous swath of data—reviews, rental records, and other information from its user database, scrubbed of anything that would obviously identify a specific user. And now the company was willing to give $1 million to the person or team who beat CineMatch by more than 10 percent. Like the longitude prize, the Netflix Challenge was open to everyone. “All you need is a PC and some great insight,” Hastings declared in the New York Times.
After nine months, about eighteen thousand teams from more than 150 countries were competing, using ideas from machine learning, neural networks, collaborative filtering, and data mining. Usually, contestants in high-stakes contests operate in secret. But Netflix encouraged the competing groups to communicate with one another and built a message board where they could coordinate around common obstacles. Read through the message board, and you get a visceral sense of the challenges that bedeviled the contestants during the three-year quest for a better algorithm. Overfitting comes up again and again.
There are two challenges in building pattern-finding algorithms. One is finding the patterns that are there in all the noise. The other problem is the opposite: not finding patterns in the data that aren’t really there. The pattern that describes “1, 2, 3” could be “add one to the previous number” or “add the two previous numbers together,” as in the Fibonacci sequence—rules that agree on those three numbers but predict different ones next. You don’t know for sure until you get more data. And if you leap to conclusions, you’re overfitting.
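The ambiguity is easy to make concrete. In this sketch (rule names are my own), two different rules both reproduce the observed data “1, 2, 3,” yet they diverge on the very next term—so a program that commits to either rule from three data points is overfitting:

```python
# Two rules that both "explain" the data 1, 2, 3 yet diverge on the
# next term: with only three points you can't tell which is real.

def add_one(n_terms):
    """Hypothesis A: add one to the previous number."""
    seq = [1]
    while len(seq) < n_terms:
        seq.append(seq[-1] + 1)
    return seq

def fibonacci(n_terms):
    """Hypothesis B: each number is the sum of the two before it."""
    seq = [1, 2]
    while len(seq) < n_terms:
        seq.append(seq[-1] + seq[-2])
    return seq[:n_terms]

print(add_one(3), fibonacci(3))   # both produce [1, 2, 3]
print(add_one(4), fibonacci(4))   # [1, 2, 3, 4] vs. [1, 2, 3, 5]
```

Only the fourth data point separates the hypotheses; before it arrives, preferring one over the other is a guess dressed up as a pattern.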
Where movies are concerned, the dangers of overfitting are relatively small—many analog movie watchers have been led to believe that because they liked The Godfather and The Godfather: Part II, they’ll like The Godfather: Part III. But the overfitting problem gets to one of the central, irreducible problems of the filter bubble: Overfitting and stereotyping are synonyms.
The term stereotyping (which in this sense