Online Book Reader

Home Category

Beautiful Code [107]

By Root 5229 0
programmer, using any modern computer language, have access to excellent software implementations of associative memory. Different languages call these implementations different things. Often they are implemented as hash tables; in Java, Perl, and Ruby, which use this technique, they are called Hashes, HashMaps, or something similar. In Python, they are called dictionaries, and in the computer algebra language Maple, simply tables.

Now if you're an eager search-algorithm fan just itching to write your own super-efficient search, this may sound like bad news, not good news. But think about those flavors of time; if you use the built-in associative store, the amount of programmer time and management invested in writing search algorithms goes to nearly zero.

By writing your own search, you might be able to save a little computer (and thus end-user) time, compared to the built-in version, but on the other hand, you might not; the people who write these things tend to be pretty clever. Andrew Kuchling has written Chapter 18 of this book on one such effort.

Associative stores are so important that dynamically typed languages such as Ruby and Python have not only built-in support, but special syntax for defining and using them. Let's use Ruby's hashes to count article popularity in Example 4-3.

Example 4-3. Counting article fetches

1 counts = {}

2 counts.default = 0

3

4 ARGF.each_line do |line|

5 if line =~ %r{GET /ongoing/When/\d\d\dx/(\d\d\d\d/\d\d/\d\d/[^ .]+) }

6 counts[$1] += 1

7 end

8 end

This program isn't that much different from the version in Example 4-2. Line 1 creates an empty Hash called counts. Line 2 gives the array a "default value" of zero; hold on for an explanation of that.

Then, in line 6, instead of printing out the article name, the name serves as the key to look up the number of fetches of this article seen so far in counts, add one to it, and store the value.

Now, consider what happens when the program sees some article name stored in $1 for the first time. I could write code along the lines of "if there is a counts[$1], then add one to it; otherwise, set counts[$1] to one." The designers of Ruby hate that kind of awkwardness; this is why they provided the notion of a "default value" for a Hash. If you look up a key the Hash doesn't know about, it says "OK, zero," allowing you to write counts[$1]+=1 and have it always just work.

I originally stated the problem as "Which of my articles have been read the most?" That's kind of fuzzy; let's interpret it to mean "Print out the top 10 most popular articles." The resulting program is shown in Example 4-4.

Example 4-4. Reporting the most popular articles

1 counts = {}

2 counts.default = 0

3

4 ARGF.each_line do |line|

5 if line =~ %r{GET /ongoing/When/\d\d\dx/(\d\d\d\d/\d\d/\d\d/[^ .]+) }

6 counts[$1] += 1

7 end

8 end

9

10 keys_by_count = counts.keys.sort { |a, b| counts[b] <=> counts[a] }

11 keys_by_count[0 .. 9].each do |key|

12 puts "#{counts[key]}: #{key}"

13 end

Line 10 looks a little less beautiful to me than most Ruby code, but it's easy enough to understand. The keys method of counts returns an array containing all of the Hash's keys. Because of the hash implementation, the keys are stored in no predictable order, and are also returned by the keys method in random order. So, I have to sort them and store them back in a new array.

In Ruby, sort is accompanied by a code block, here enclosed in curly braces. (In Ruby, you can delimit a block either with do and end or with { and }.) The sort works its way back and forth through the array being sorted, passing pairs of elements to the block, which has to return a negative number, 0, or a positive number depending on whether the first element is less than, equal to, or greater than the second.

In this case, we want to get the data out of the hash in an order defined by the values (the counts themselves) rather than by the filenames (the keys), so we have to sort the keys by their values. Have a close look at the code, and you'll see how it works. Because this is

Return Main Page Previous Page Next Page

®Online Book Reader