Learning Python - Mark Lutz [282]
[0, 4, 16, 36, 64]
>>> {x * x for x in range(10) if x % 2 == 0} # But sets are not
{0, 16, 4, 64, 36}
>>> {x: x * x for x in range(10) if x % 2 == 0} # Neither are dict keys
{0: 0, 8: 64, 2: 4, 4: 16, 6: 36}
Nested for loops work as well, though the unordered and no-duplicates nature of both types of objects can make the results a bit less straightforward to decipher:
>>> [x + y for x in [1, 2, 3] for y in [4, 5, 6]] # Lists keep duplicates
[5, 6, 7, 6, 7, 8, 7, 8, 9]
>>> {x + y for x in [1, 2, 3] for y in [4, 5, 6]} # But sets do not
{8, 9, 5, 6, 7}
>>> {x: y for x in [1, 2, 3] for y in [4, 5, 6]} # Neither do dict keys
{1: 6, 2: 6, 3: 6}
Like list comprehensions, the set and dictionary varieties can also iterate over any type of iterator—lists, strings, files, ranges, and anything else that supports the iteration protocol:
>>> {x + y for x in 'ab' for y in 'cd'}
{'bd', 'ac', 'ad', 'bc'}
>>> {x + y: (ord(x), ord(y)) for x in 'ab' for y in 'cd'}
{'bd': (98, 100), 'ac': (97, 99), 'ad': (97, 100), 'bc': (98, 99)}
>>> {k * 2 for k in ['spam', 'ham', 'sausage'] if k[0] == 's'}
{'sausagesausage', 'spamspam'}
>>> {k.upper(): k * 2 for k in ['spam', 'ham', 'sausage'] if k[0] == 's'}
{'SAUSAGE': 'sausagesausage', 'SPAM': 'spamspam'}
For more details, experiment with these tools on your own. They may or may not have a performance advantage over the generator or for loop alternatives, but we would have to time their performance explicitly to be sure—which seems a natural segue to the next section.
Timing Iteration Alternatives
We’ve met quite a few iteration alternatives in this book. To summarize, let’s work through a larger case study that pulls together some of the things we’ve learned about iteration and functions.
I’ve mentioned a few times that list comprehensions have a speed performance advantage over for loop statements, and that map performance can be better or worse depending on call patterns. The generator expressions of the prior sections tend to be slightly slower than list comprehensions, though they minimize memory space requirements.
All that’s true today, but relative performance can vary over time because Python’s internals are constantly being changed and optimized. If you want to verify their performance for yourself, you need to time these alternatives on your own computer and your own version of Python.
Timing Module
Luckily, Python makes it easy to time code. To see how the iteration options stack up, let’s start with a simple but general timer utility function coded in a module file, so it can be used in a variety of programs:
# File mytimer.py
import time
reps = 1000
repslist = range(reps)
def timer(func, *pargs, **kargs):
start = time.clock()
for i in repslist:
ret = func(*pargs, **kargs)
elapsed = time.clock() - start
return (elapsed, ret)
Operationally, this module times calls to any function with any positional and keyword arguments by fetching the start time, calling the function a fixed number of times, and subtracting the start time from the stop time. Points to notice:
Python’s time module gives access to the current time, with precision that varies per platform. On Windows, this call is claimed to give microsecond granularity and so is very accurate.
The range call is hoisted out of the timing loop, so its construction cost is not charged to the timed function in Python 2.6. In 3.0 range is an iterator, so this step isn’t required (but doesn’t hurt).
The reps count is a global that importers can change if needed: mytimer.reps = N.
When complete, the total elapsed time for all calls is returned in a tuple, along with the timed function’s final return value so callers can verify its operation.
From a larger perspective, because this function is coded in a module file, it becomes a generally useful tool anywhere we wish to import it. You’ll learn more about modules and imports in the next part of this book—simply import the module and call the function to use this file’s timer (and see Chapter 3’s coverage of module