Online Book Reader

Home Category

Complexity_ A Guided Tour - Melanie Mitchell [116]

By Root 410 0
270 cm). The average adult height is about 5 feet 5 inches (165 cm), and the vast majority of all adults have height somewhere between 5 and 7 feet.

FIGURE 15.7. A bell-curve (normal) distribution of human heights.

The distribution of human heights looks something like figure 15.7. The plot’s bell-like shape is why it is often called a bell curve. Lots of things have approximate bell-curve distributions—height, weight, test scores on some exams, winning scores in basketball games, abundance of different species, and so on. In fact, because so many quantities in the natural world have this distribution, the bell curve is also called the normal distribution.

Normal distributions are characterized by having a particular scale—e.g., 70–270 cm for height, 0–100 for test scores. In a bell-curve distribution, the value with highest frequency is the average—e.g., 165 cm for height. Most other values don’t vary much from the average—the distribution is fairly homogeneous. If in-degrees in the Web were normally distributed, then PageRank wouldn’t work at all, since nearly all Web pages would have close to the average number of in-links. The Web page www.beatles.com would have more or less the same number of in-links as all other pages containing the phrase “apple records”; thus “number of in-links” could not be used as a way to rank such pages in order of probable relevance.

Fortunately for us (and even more so for Google stockholders), the Web degree distribution has a scale-free rather than bell-curve structure. Scale-free networks have four notable properties: (1) a relatively small number of very high-degree nodes (hubs); (2) nodes with degrees over a very large range of different values (i.e., heterogeneity of degree values); (3) self-similarity; (4) small-world structure. All scale-free networks have the small-world property, though not all networks with the small-world property are scale-free.

In more scientific terms, a scale-free network always has a power law degree distribution. Recall that the approximate in-degree distribution for the Web is

Number of Web pages with in-degree k is proportional to .

Perhaps you will remember from high school math that also can be written as k−2. This is a “power law with exponent −2.” Similarly, (or, equivalently, k−1) is a power law with exponent −1.” In general, a power-law distribution has the form of xd, where x is a quantity such as in-degree. The key number describing the distribution is the exponent d; different exponents cause very different-looking distributions.

I will have more to say about power laws in chapter 17. The important thing to remember for now is scale-free network = power-law degree distribution.

Network Resilience

A very important property of scale-free networks is their resilience to the deletion of nodes. This means that if a set of random nodes (along with their links) are deleted from a large scale-free network, the network’s basic properties do not change: it still will have a heterogeneous degree distribution, short average path length, and strong clustering. This is true even if the number of deleted nodes is fairly large. The reason for this is simple: if nodes are deleted at random, they are overwhelmingly likely to be low-degree nodes, since these constitute nearly all nodes in the network. Deleting such nodes will have little effect on the overall degree distribution and path lengths. We can see many examples of this in the Internet and the Web. Many individual computers on the Internet fail or are removed all the time, but this doesn’t have any obvious effect on the operation of the Internet or on its average path length. Similarly, although individual Web pages and their links are deleted all the time, Web surfing is largely unaffected.

However, this resilience comes at a price: if one or more of the hubs is deleted, the network will be likely to lose all its scale-free properties and cease to function properly. For example, a blizzard in Chicago (a big airline hub) will probably cause flight delays or cancellations all over

Return Main Page Previous Page Next Page

®Online Book Reader