Learning Python - Mark Lutz [273]
>>> [line.rstrip() for line in open('myfile')]
['aaa', 'bbb', 'ccc']
>>> list(map((lambda line: line.rstrip()), open('myfile')))
['aaa', 'bbb', 'ccc']
The last two of these make use of file iterators (which essentially means that you don’t need a method call to grab all the lines in iteration contexts such as these). The map call is slightly longer than the list comprehension, but neither has to manage result list construction explicitly.
A list comprehension can also be used as a sort of column projection operation. Python’s standard SQL database API returns query results as a list of tuples much like the following—the list is the table, tuples are rows, and items in tuples are column values:
listoftuple = [('bob', 35, 'mgr'), ('mel', 40, 'dev')]
A for loop could pick up all the values from a selected column manually, but map and list comprehensions can do it in a single step, and faster:
>>> [age for (name, age, job) in listoftuple]
[35, 40]
>>> list(map((lambda row: row[1]), listoftuple))
[35, 40]
The first of these makes use of tuple assignment to unpack row tuples in the list, and the second uses indexing. In Python 2.6 (but not in 3.0—see the note on 2.6 argument unpacking in Chapter 18), map can use tuple unpacking on its argument, too:
# 2.6 only
>>> list(map((lambda (name, age, job): age), listoftuple))
[35, 40]
See other books and resources for more on Python’s database API.
Beside the distinction between running functions versus expressions, the biggest difference between map and list comprehensions in Python 3.0 is that map is an iterator, generating results on demand; to achieve the same memory economy, list comprehensions must be coded as generator expressions (one of the topics of this chapter).
* * *
* * *
[43] These performance generalizations can depend on call patterns, as well as changes and optimizations in Python itself. Recent Python releases have sped up the simple for loop statement, for example. Usually, though, list comprehensions are still substantially faster than for loops and even faster than map (though map can still win for built-in functions). To time these alternatives yourself, see the standard library’s time module’s time.clock and time.time calls, the newer timeit module added in Release 2.4, or this chapter’s upcoming section Timing Iteration Alternatives.
Iterators Revisited: Generators
Python today supports procrastination much more than it did in the past—it provides tools that produce results only when needed, instead of all at once. In particular, two language constructs delay result creation whenever possible:
Generator functions are coded as normal def statements but use yield statements to return results one at a time, suspending and resuming their state between each.
Generator expressions are similar to the list comprehensions of the prior section, but they return an object that produces results on demand instead of building a result list.
Because neither constructs a result list all at once, they save memory space and allow computation time to be split across result requests. As we’ll see, both of these ultimately perform their delayed-results magic by implementing the iteration protocol we studied in Chapter 14.
Generator Functions: yield Versus return
In this part of the book, we’ve learned about coding normal functions that receive input parameters and send back a single result immediately. It is also possible, however, to write functions that may send back a value and later be resumed, picking up where they left off. Such functions are known as generator functions because they generate a sequence of values over time.
Generator functions are like normal functions in most respects, and in fact are coded with normal def statements. However, when created, they are automatically made to implement the iteration protocol so that they can appear in iteration contexts. We studied iterators in Chapter 14; here, we’ll revisit them to see how they relate to generators.
State suspension
Unlike normal