Online Book Reader

Home Category

Professional C__ - Marc Gregoire [295]

By Root 1379 0
the buckets, the insertion, deletion, and lookup operations all run in constant time.

This section assumes that you are familiar with hashed data structures. If you are not, consult Chapter 12, which includes a discussion on hash tables, or one of the standard data structure texts listed in Appendix B.

This section provides an implementation of a simple, but fully functional, hashmap that you can take with you between platforms, in case your compiler does not yet support the standard C++11 hash tables. Like a map, a hashmap stores key/value pairs. In fact, the operations it provides are almost identical to those provided by the map, but with different performance characteristics.

This hashmap implementation uses chained hashing (also called open hashing) and does not attempt to provide advanced features like rehashing. Chapter 12 explains the concept of chained hashing in the section on C++11 unordered associative containers.

It is recommended to use the C++11 unordered associative containers, also called hash tables, if your compiler supports them, instead of implementing your own. These C++11 hash tables, explained in Chapter 12, are called unordered_map, unordered_multimap, unordered_set and unordered_multiset. The hashmap in this chapter is used to demonstrate writing STL containers and as a solution in case your compiler does not support the C++11 hash tables.

The Hash Function

The first choice when writing a hashmap is how to handle hash functions. Recalling the adage that a good abstraction makes the easy case easy and the hard case possible, a good hashmap interface allows clients to specify their own hash function and number of buckets in order to customize the hashing behavior for their particular workload. On the other hand, clients that do not have the desire, or ability, to write a good hash function and choose a number of buckets should still be able to use the container without doing so. One solution is to allow clients to provide a hash function and number of buckets in the hashmap constructor, but also to provide defaults values. It also makes sense to package the hash function and the number of buckets into a hashing class. Our default hash class definition looks like this:

// Any Hash Class must provide two methods: hash() and numBuckets().

template

class DefaultHash

{

public:

// Throws invalid_argument if numBuckets is illegal

DefaultHash(size_t numBuckets = 101) throw (invalid_argument);

size_t hash(const T& key) const;

size_t numBuckets() const { return mNumBuckets; }

protected:

size_t mNumBuckets;

};

Code snippet from Hashmap\BasicHashmap\hashmap.h

Note that the DefaultHash class is templatized on the key type that it hashes, in order to support a templatized hashmap container. The implementation of the constructor is trivial:

// Throws invalid_argument if numBuckets is illegal

template

DefaultHash::DefaultHash(size_t numBuckets) throw (invalid_argument)

{

if (numBuckets <= 0) {

throw invalid_argument("numBuckets must be > 0");

}

mNumBuckets = numBuckets;

}

Code snippet from Hashmap\BasicHashmap\hashmap.cpp

The implementation of hash() is trickier, partially because it must apply to keys of any type. It is supposed to map the key to one of the mNumBuckets buckets. The following hash() function first computes an integer-sized hash value, then limits the calculated hash value to the number of hash buckets by taking the modulo of the calculated value:

// Uses the division method for hashing.

// Treats the key as a sequence of bytes, sums the ASCII

// values of the bytes, and mods the total by the number

// of buckets.

template

size_t DefaultHash::hash(const T& key) const

{

size_t bytes = sizeof(key);

size_t res = 0;

for (size_t i = 0; i < bytes; ++i) {

unsigned char b = *((unsigned char*)&key + i);

res += b;

}

return (res % mNumBuckets);

}

Code snippet from Hashmap\BasicHashmap\hashmap.cpp

Unfortunately, when using the preceding method on strings, the function will calculate the hash of the entire

Return Main Page Previous Page Next Page

®Online Book Reader