Online Book Reader

Home Category

Choose a category
All
Classic-Fiction

Complexity_ A Guided Tour - Melanie Mitchell [62]

By Root 620 0

being used.

Evolving Robby, the Soda-Can-Collecting Robot

To introduce you in more detail to the main ideas of GAs, I take you through a simple extended example. I have a robot named “Robby” who lives in a (computer simulated, but messy) two-dimensional world that is strewn with empty soda cans. I am going to use a genetic algorithm to evolve a “brain” (that is, a control strategy) for Robby.

Robby’s job is to clean up his world by collecting the empty soda cans. Robby’s world, illustrated in figure 9.1, consists of 100 squares (sites) laid out in a 10 × 10 grid. You can see Robby in site 0,0. Let’s imagine that there is a wall around the boundary of the entire grid. Various sites have been littered with soda cans (but with no more than one can per site).

Robby isn’t very intelligent, and his eyesight isn’t that great. From wherever he is, he can see the contents of one adjacent site in the north, south, east, and west directions, as well as the contents of the site he occupies. A site can be empty, contain a can, or be a wall. For example, in figure 9.1, Robby, at site 0,0, sees that his current site is empty (i.e., contains no soda cans), the “sites” to the north and west are walls, the site to the south is empty, and the site to the east contains a can.

FIGURE 9.1. Robby’s world. A 10 x 10 array, strewn with soda cans.

For each cleaning session, Robby can perform exactly 200 actions. Each action consists of one of the following seven choices: move to the north, move to the south, move to the east, move to the west, choose a random direction to move in, stay put, or bend down to pick up a can. Each action may generate a reward or a punishment. If Robby is in the same site as a can and picks it up, he gets a reward of ten points. However, if he bends down to pick up a can in a site where there is no can, he is fined one point. If he crashes into a wall, he is fined five points and bounces back into the current site.

Clearly, Robby’s reward is maximized when he picks up as many cans as possible, without crashing into any walls or bending down to pick up a can if no can is there.

Since this is a simple problem, it would probably be pretty easy for a human to figure out a good strategy for Robby to follow. However, the point of genetic algorithms is that humans, being intrinsically lazy, don’t have to figure out anything; we just let computer evolution figure it out for us. Let’s use a genetic algorithm to evolve a good strategy for Robby.

The first step is to figure out exactly what we are evolving; that is, what exactly constitutes a strategy? In general, a strategy is a set of rules that gives, for any situation, the action you should take in that situation. For Robby, a “situation” is simply what he can see: the contents of his current site plus the contents of the north, south, east, and west sites. For the question “what to do in each situation,” Robby has seven possible things he can do: move north, south, east, or west; move in a random direction; stay put; or pick up a can.

Therefore, a strategy for Robby can be written simply as a list of all the possible situations he could encounter, and for each possible situation, which of the seven possible actions he should perform.

How many possible situations are there? Robby looks at five different sites (current, north, south, east, west), and each of those sites can be labeled as empty, contains can, or wall. This means that there are 243 different possible situations (see the notes for an explanation of how I calculated this). Actually, there aren’t really that many, since Robby will never face a situation in which his current site is a wall, or one in which north, south, east, and west are all walls. There are other “impossible” situations as well. Again, being lazy, we don’t want to figure out what all the impossible situations are, so we’ll just list all 243 situations, and know that some of them will never be encountered.

Table 9.1 is an example of a strategy—actually, only part of a strategy, since an entire strategy would be too long to list

Online Book Reader

Complexity_ A Guided Tour - Melanie Mitchell [62]

®Online Book Reader