Beautiful Code [227]
Python's Dictionary Implementation: Being All Things to All People > Inside the Dictionary
18. Python's Dictionary Implementation: Being All Things to All People
Andrew Kuchling
Dictionaries are a fundamental data type in the python programming language. Like awk's associative arrays and Perl's hashes, dictionaries store a mapping of unique keys to values. Basic operations on a dictionary include:
Adding a new key/value pair
Retrieving the value corresponding to a particular key
Removing existing pairs
Looping over the keys, values, or key/value pairs
Here's a brief example of using a dictionary at the Python interpreter prompt. (To try out this example, you can just run the python command on Mac OS and most Linux distributions. If Python isn't already installed, you can download it from http://www.python.org.)
In the following interactive session, the >>> signs represent the Python interpreter's prompts, and d is the name of the dictionary I'm playing with:
>>> d = {1: 'January', 2: 'February',
... 'jan': 1, 'feb': 2, 'mar': 3}
{'jan': 1, 1: 'January', 2: 'February', 'mar': 3, 'feb': 2}
>>> d['jan'], d[1]
(1, 'January')
>>> d[12]
Traceback (most recent call last):
File " KeyError: 12 >>> del d[2] >>> for k, v in d.items(): print k,v # Looping over all pairs. jan 1 1 January mar 3 >feb 2 ... Two things to note about Python's dictionary type are: A single dictionary can contain keys and values of several different data types. It's legal to store the keys 1, 3+4j (a complex number), and "abc" (a string) in the same dictionary. Values retain their type; they aren't all converted to strings. Keys are not ordered. Methods such as .values() that return the entire contents of a dictionary will return the data in some arbitrary arrangement, not ordered by value or by insertion time. It's important that retrieval of keys be a very fast operation, so dictionary-like types are usually implemented as hash tables. For the C implementation of Python (henceforth referred to as CPython), dictionaries are even more pivotal because they underpin several other language features. For example, classes and class instances use a dictionary to store their attributes: >>> obj = MyClass() # Create a class instance >>> obj.name = 'object' # Add a .name attribute >>> obj.id = 14 # Add a .id attribute >>> obj._ _dict_ _ # Retrieve the underlying dictionary {'name': 'object', 'id': 14} >>> obj._ _dict_ _['id'] = 12 # Store a new value in the dictionary >>> obj.id # Attribute is changed accordingly 12 Module contents are also represented as a dictionary, most notably the _ _builtin_ _ module that contains built-in identifiers such as int and open. Any expression that uses such built-ins will therefore result in a few dictionary lookups. Another use of dictionaries is to pass keyword arguments to a function, so a dictionary could potentially be created and destroyed on every function call. This internal use of the dictionary type means that any running Python program has many dictionaries active at the same time, even if the user's program code doesn't explicitly use a dictionary. It's therefore important that dictionaries can be created and destroyed quickly and not use an overly large amount of memory. The implementation of dictionaries in Python teaches several lessons about performance-critical code. First, one has to trade off the advantages of an optimization