Learning Python - Mark Lutz [593]
>>>
>>> def dictcomp(I):
... return {i: i for i in range(I)}
...
>>> def dictloop(I):
... new = {}
... for i in range(I): new[i] = i
... return new
...
>>> dictcomp(10)
{0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8, 9: 9}
>>> dictloop(10)
{0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8, 9: 9}
>>>
>>> from mytimer import best, timer
>>> best(dictcomp, 10000)[0] # 10,000-item dict
0.0013519874732672577
>>> best(dictloop, 10000)[0]
0.001132965223233029
>>>
>>> best(dictcomp, 100000)[0] # 100,000 items: 10 times slower
0.01816089754424155
>>> best(dictloop, 100000)[0]
0.01643484018219965
>>>
>>> best(dictcomp, 1000000)[0] # 1,000,000 items: 10X time
0.18685105229855026
>>> best(dictloop, 1000000)[0] # Time for making one dict
0.1769041177020938
>>>
>>> timer(dictcomp, 1000000, _reps=50)[0] # 1,000,000-item dict
10.692516087938543
>>> timer(dictloop, 1000000, _reps=50)[0] # Time for making 50
10.197276050447755
Part V, Modules
See Test Your Knowledge: Part V Exercises in Chapter 24 for the exercises.
Import basics. When you’re done, your file (mymod.py) and interaction should look similar to the following; remember that Python can read a whole file into a list of line strings, and the len built-in returns the lengths of strings and lists:def countLines(name):
file = open(name)
return len(file.readlines())
def countChars(name):
return len(open(name).read())
def test(name): # Or pass file object
return countLines(name), countChars(name) # Or return a dictionary
% python
>>> import mymod
>>> mymod.test('mymod.py')
(10, 291)
Note that these functions load the entire file in memory all at once, so they won’t work for pathologically large files too big for your machine’s memory. To be more robust, you could read line by line with iterators instead and count as you go:def countLines(name):
tot = 0
for line in open(name): tot += 1
return tot
def countChars(name):
tot = 0
for line in open(name): tot += len(line)
return tot
A generator expression can have the same effect: sum(len(line) for line in open(name)). On Unix, you can verify your output with a wc command; on Windows, right-click on your file to view its properties. Note that your script may report fewer characters than Windows does—for portability, Python converts Windows \r\n line-end markers to \n, thereby dropping one byte (character) per line. To match byte counts with Windows exactly, you must open in binary mode ('rb'), or add the number of bytes corresponding to the number of lines.
The “ambitious” part of this exercise (passing in a file object so you only open the file once), will require you to use the seek method of the built-in file object. It works like C’s fseek call (and calls it behind the scenes): seek resets the current position in the file to a passed-in offset. After a seek, future input/output operations are relative to the new position. To rewind to the start of a file without closing and reopening it, call file.seek(0); the file read methods all pick up at the current position in the file, so you need to rewind to reread. Here’s what this tweak would look like:def countLines(file):
file.seek(0) # Rewind to start of file
return len(file.readlines())
def countChars(file):
file.seek(0) # Ditto (rewind if needed)
return len(file.read())
def test(name):
file = open(name) # Pass file object
return countLines(file), countChars(file) # Open file only once
>>> import mymod2
>>> mymod2.test("mymod2.py")
(11, 392)
from/from *. Here’s the from * part; replace * with countChars to do the