Online Book Reader

Home Category

Beautiful Code [231]

By Root 5022 0
or medium-size dictionaries with 50,000 keys or fewer, the new size is ma_used*4. Most Python programs that work with large dictionaries build up the dictionary in an initial phase of processing, and then look up individual keys or loop over the entire contents. Quadrupling the dictionary size like this keeps the dictionary sparse (the fill ratio starts out at 1/4) and reduces the number of resize operations performed during the build phase. Large dictionaries with more than 50,000 keys use ma_used*2 to avoid consuming too much memory for empty slots.

On deleting a key from a dictionary, the slot occupied by the key is changed to point to a dummy key, and the ma_used count is updated, but the number of full slots in the table isn't checked. This means dictionaries are never resized on deletion. If you build a large dictionary and then delete many keys from it, the dictionary's hash table may be larger than if you'd constructed the smaller dictionary directly. This usage pattern is quite infrequent, though. Keys are almost never deleted from the many small dictionaries used for objects and for passing function arguments. Many Python programs will build a dictionary, work with it for a while, and then discard the whole dictionary. Therefore, very few Python programs will encounter high memory usage because of the no-resize-on-deletion policy.

18.4.2. A Memory Trade-Off That's Worth It: The Free List

Many dictionary instances are used by Python itself to hold the keyword arguments in function calls. These are therefore created very frequently and have a very short lifetime, being destroyed when the function returns. An effective optimization when facing a high creation rate and short lifetime is to recycle unused data structures, reducing the number of malloc() and free() calls.

Python therefore maintains a free_dicts array of dictionary structures no longer in use. In Python 2.5, this array is 80 elements long. When a new PyDictObject is required, a pointer is taken from free_dicts and the structure is reused. Dictionaries are added to the array when deletion is requested; if free_dicts is full, the structure is simply freed.

Python's Dictionary Implementation: Being All Things to All People > Iterations and Dynamic Changes

18.5. Iterations and Dynamic Changes

A common use case is looping through the contents of a dictionary. The keys(), values(), and items() methods return lists containing all of the keys, values, or key/value pairs in the dictionary. To conserve memory, the user can call the iterkeys(), itervalues(), and iteritems() methods instead; they return an iterator object that returns elements one by one. But when these iterators are used, Python has to forbid any statement that adds or deletes an entry in the dictionary during the loop.

This restriction turns out to be fairly easy to enforce. The iterator records the number of items in the dictionary when an iter*() method is first called. If the size changes, the iterator raises a RuntimeError exception with the message dictionary changed size during iteration.

One special case that modifies a dictionary while looping over it is code that assigns a new value for the same key:

for k, v in d.iteritems():

d[k] = d[k] + 1

It's convenient to avoid raising a RuntimeError exception during such operations. Therefore, the C function that handles dictionary insertion, PyDict_SetItem(), guarantees not to resize the dictionary if it inserts a key that's already present. The lookdict() and lookdict_ string search functions support this feature by the way they report failure (not finding the searched-for key): on failure, they return a pointer to the empty slot where the searched-for key would have been stored. This makes it easy for PyDict_SetItem to store the new value in the returned slot, which is either an empty slot or a slot known to be occupied by the same key. When the new value is recorded in a slot already occupied by the same key, as in d[k]=d[k]+1, the dictionary's size isn't checked for a possible resize operation, and the RuntimeError is

Return Main Page Previous Page Next Page

®Online Book Reader