I'm Feeling Lucky: The Confessions of Google Employee Number 59 - Douglas Edwards

the same way as a server assembly line.

"It was actually good for us," said Urs, who believed contractual obligations to continually refresh the index and meet latency guarantees made Google "a more grown-up company" by forcing it to closely monitor its own progress. "We always wanted to be fast, but the contract said, 'The maximum one-hour latency is this.'* You had to start measuring it. Once you can measure it, it's much easier to set goals and say, 'Can we make this ten percent better by next month?' So a lot of these things started then."

Last Crawl

Along with speed and capacity, Google had promised Yahoo fresher results. A seemingly reasonable offer, except that in March 2000 Google's crawler was crippled and barely running. The Googlebot software would stumble around the web gathering URLs, lose its equilibrium, then crash to a halt. The engineers would restart it and the same thing would happen. Try again. Crash. Google hadn't built a new index in four months—approximately the time I had been at the company, though I never pointed out that coincidence to anyone.

Creating an index required weeks of collecting information about what websites existed and what they contained, data that then had to be compiled into a usable list of URLs that could be ranked and presented as search results when someone submitted a query. Most users assumed that when they typed in a search term the results reflected the exact state of the web at the time they hit Enter, so they were puzzled and sometimes angry when they didn't find the very latest news and information. When an index hadn't been updated for more than a month, it became noticeably stale and user dissatisfaction increased. Google's index wasn't just stale, it was covered in mold. Without a working crawler, Google would violate its contract with Yahoo, and more important, Google.com would become increasingly useless.
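The mechanics behind that are straightforward to sketch: a crawl produces pages, the indexer inverts them into a term-to-URL map, and a query intersects the lists for its terms. The toy Python sketch below shows the idea; the names and data are illustrative, not Google's code, and a real web-scale index also stores positions and ranking signals and is sharded across many machines.

```python
from collections import defaultdict

# Toy inverted index: for each term, record which crawled pages contain it.
def build_index(crawled_pages):
    """crawled_pages: dict mapping URL -> page text."""
    index = defaultdict(set)
    for url, text in crawled_pages.items():
        for term in text.lower().split():
            index[term].add(url)
    return index

def search(index, query):
    """Return URLs that contain every query term (unranked)."""
    terms = query.lower().split()
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results

pages = {
    "http://example.com/a": "google search engine results",
    "http://example.com/b": "a fresh index of the web",
}
idx = build_index(pages)
print(search(idx, "index web"))   # {'http://example.com/b'}
```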

"It wasn't really very tolerant of machine failures," Jeff Dean recalls about the crawler—"a half-working mess of Python scripts"—written before he joined the company.* "So when a machine failed in the middle of a crawl, it just ended. Well, that crawl was useless. Let's just start it over again. So you'd crawl for ten days and then the machine would die. Oh well. Throw it away. Start over. That was kind of painful." As Jeff put it, the backup plan was "Oh, shit. Let's try that again."

Unfortunately, the more the index grew, the more machines it needed to run, and the more machines that ran, the more likely it was that one or more would fail,† especially since Google hadn't ponied up for the kind of hardware that gave warnings when it had problems.
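The arithmetic is unforgiving: even a modest per-machine failure rate compounds across a fleet. Assuming, purely for illustration, a two percent chance that any given machine dies during a ten-day crawl, the chance that at least one of n machines fails is 1 - (1 - 0.02)^n:

```python
# Illustrative assumption: each machine independently fails with probability
# p = 0.02 during a crawl; the chance at least one of n fails is 1 - (1 - p)**n.
p = 0.02
for n in (1, 10, 100, 1000):
    print(n, round(1 - (1 - p) ** n, 3))
# 1 0.02
# 10 0.183
# 100 0.867
# 1000 1.0
```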

"Our machines didn't have parity," noted Jeff. Parity could tell you if a computer's memory was randomly flipping bits in ways it shouldn't. "parity memory was for high-end servers, and non-parity memory was what the cheap unwashed masses bought. The problem was, if you had a lot of data and you'd say, 'Sort it,' it ended up mostly sorted. And if you sorted it again, it ended up mostly sorted a different way." As a result, Google's engineers had to build data structures that were resilient in the face of what Ben Gomes called "adversarial memory."

"For a while I had all these bugs because the machines were crappy," Gomes told me. "I was writing this code—one of my first big projects out of school—and it was crashing all the time. I was sitting in this room with people I really respected, Jeff and Sanjay and Urs, and Jeff said, 'The pieces are mostly working, but the pageranker keeps crashing,' and I wanted to sink into the ground. I stayed up for nights on end trying to figure out why was it crashing? Finally I checked this one thing I had set to zero and I looked at it and it was at four. And I thought, 'Not my bug.' After that I felt a lot better."

Finally, in March 2000, a new crawl hobbled to the finish line—with, according to Gomes, "a lot of pain." "The March index was the last gasp of the old crawler, but it had so many bugs in it, it was impossible to push it out." That index came to be known internally
