Online Book Reader

Home Category

Professional C__ - Marc Gregoire [215]

By Root 1575 0
’s so similar to set and multimap.

UNORDERED ASSOCIATIVE CONTAINERS/ HASH TABLES


C++11 adds new kind of containers to the STL called unordered associative containers or hash tables. There are four of them: unordered_map, unordered_multimap, unordered_set, and unordered_multiset. The map, multimap, set, and multiset containers discussed earlier sort their elements while these new unordered variants do not sort their elements. If you don’t need ordering, use the new unordered associative containers because on average they have faster insertion, deletion, and lookup.

As mentioned in the container overview in Chapter 11, better names would have been hash_map, hash_set, and so on. Unfortunately, hash tables were not part of the C++ standard library before C++11, which means a lot of third-party libraries implemented hash tables themselves using names with a prefix hash like hash_map. Because of this, the C++ standard committee decided to use the prefix unordered instead of hash to avoid name clashes.

Hash Functions

The new containers are also called hash tables. That is because the implementation of these new containers makes use of so called hash functions. The implementation will usually consist of some kind of array where each element in the array is called a bucket. Each bucket has a specific numerical index like 0, 1, 2 up until the last bucket. A hash function transforms a key into a bucket index. The value associated with that key is then stored in that bucket. The result of a hash function is not always unique. The situation in which two or more keys hash to the same bucket index is called a collision. There are many approaches to handling collisions, for example quadratic re-hashing, linear chaining, etc. Those who are interested may consult one of the references in the Algorithms and Data Structures section in Appendix B. The STL standard does not specify which collision handling algorithm is required, but most current implementations have chosen to resolve collisions by linear chaining. With linear chaining, buckets do not directly contain the data values associated with the keys, but contain a pointer to a linked list. This linked list contains all the data values for that specific bucket. Figure 12-1 shows how this works.

FIGURE 12-1

In Figure 12-1, applying the hash function to the keys “Marc G.” and “Zulija N.” resulted in the same bucket index 128. This bucket then points to a linked list containing the keys “Marc G.” and “Zulija N.” together with their associated data values. From Figure 12-1 it is also clear how lookups based on keys work and what the complexity is. A lookup involves a single hash function call to calculate the bucket index and after that one or more equality operations to find the right key in the linked list. This shows that lookups can be much faster compared to lookups with normal maps, but it all depends on how many collisions there are.

The choice of the hash function is very important. A hash function that creates no collisions is known as a “perfect hash.” A perfect hash has a lookup time which is constant; a regular hash has a lookup time which is, on average, close to 1, independent of the number of elements. As the number of collisions increases, the lookup time increases, reducing performance. Collisions can be reduced by increasing the basic hash table size, but you need to take cache sizes into account.

Generally, the default hash function is suitable for most purposes. Creating a perfect hash is a nontrivial exercise, even when the set of keys is fixed and known. Even modifying a regular hash is not an exercise for the unwary!

unordered_map

The unordered_map is defined in the header file as a class template as follows:

template class T,

class Hash = hash,

class Pred = std::equal_to,

class Alloc = std::allocator > >

class unordered_map;

There are five template parameters: the key type, the value type, the hash type, the equal comparison type, and the allocator type. With the last three parameters

Return Main Page Previous Page Next Page

®Online Book Reader