
Learning Python - Mark Lutz [496]

the underlying platform—text files return a str for reads and expect one for writes, but binary files return a bytes for reads and expect one (or a bytearray) for writes.

Text File Basics

To demonstrate, let’s begin with basic file I/O. As long as you’re processing basic text files (e.g., ASCII) and don’t care about circumventing the platform-default encoding of strings, files in 3.0 look and feel much as they do in 2.X (for that matter, so do strings in general). The following, for instance, writes one line of text to a file and reads it back in 3.0, exactly as it would in 2.6 (note that file is no longer a built-in name in 3.0, so it’s perfectly OK to use it as a variable here):

C:\misc> c:\python30\python
# Basic text files (and strings) work the same as in 2.X
>>> file = open('temp', 'w')
>>> size = file.write('abc\n') # Returns number of characters written
>>> file.close() # Manual close to flush output buffer
>>> file = open('temp') # Default mode is "r" (== "rt"): text input
>>> text = file.read()
>>> text
'abc\n'
>>> print(text)
abc
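As a hedged aside, the same round trip can be written with a with statement, which closes the file (and so flushes its output buffer) automatically when the block exits. This is only a minimal sketch for Python 3.X; the scratch filename temp_basics.txt and the use of the system temp directory are assumptions made for illustration, not part of the session above.

```python
import os
import tempfile

# Hypothetical scratch file in the system temp directory
path = os.path.join(tempfile.gettempdir(), 'temp_basics.txt')

with open(path, 'w') as out:      # Text-mode output, closed on block exit
    size = out.write('abc\n')     # Count of characters written

with open(path) as inp:           # Default mode is 'r' (same as 'rt')
    text = inp.read()

print(size)                       # 4
print(repr(text))                 # 'abc\n'
os.remove(path)                   # Clean up the scratch file
```

In text mode the value returned by write counts characters in the str, which matches the byte count here only because the text is plain ASCII with a single-character line end.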

Text and Binary Modes in 3.0

In Python 2.6, there is no major distinction between text and binary files—both accept and return content as str strings. The only major difference is that text files automatically map \n end-of-line characters to and from \r\n on Windows, while binary files do not (I’m stringing operations together into one-liners here just for brevity):

C:\misc> c:\python26\python
>>> open('temp', 'w').write('abd\n') # Write in text mode: adds \r
>>> open('temp', 'r').read() # Read in text mode: drops \r
'abd\n'
>>> open('temp', 'rb').read() # Read in binary mode: verbatim
'abd\r\n'
>>> open('temp', 'wb').write('abc\n') # Write in binary mode
>>> open('temp', 'r').read() # \n not expanded to \r\n
'abc\n'
>>> open('temp', 'rb').read()
'abc\n'

In Python 3.0, things are a bit more complex because of the distinction between str for text data and bytes for binary data. To demonstrate, let’s write a text file and read it back in both modes in 3.0. Notice that we are required to provide a str for writing, but reading gives us a str or a bytes, depending on the open mode:

C:\misc> c:\python30\python
# Write and read a text file
>>> open('temp', 'w').write('abc\n') # Text mode output, provide a str
4
>>> open('temp', 'r').read() # Text mode input, returns a str
'abc\n'
>>> open('temp', 'rb').read() # Binary mode input, returns a bytes
b'abc\r\n'

Notice how on Windows text-mode files translate the \n end-of-line character to \r\n on output; on input, text mode translates the \r\n back to \n, but binary mode does not. This is the same in 2.6, and it’s what we want for binary data (no translations should occur), although you can control this behavior with extra open arguments in 3.0 if desired.
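One of the extra open arguments that affects this translation is newline. The following is a minimal sketch rather than a complete treatment: it forces Windows-style line ends on output so that the effect is visible on any platform, then reads the file back three ways. The scratch filename temp_newline.txt is an assumption made for illustration.

```python
import os
import tempfile

# Hypothetical scratch file in the system temp directory
path = os.path.join(tempfile.gettempdir(), 'temp_newline.txt')

# newline='\r\n' maps each \n written to \r\n, regardless of platform
with open(path, 'w', newline='\r\n') as out:
    out.write('abc\n')

with open(path, 'rb') as inp:
    raw = inp.read()              # Binary mode: bytes exactly as stored

# newline='' turns off translation on input, so the \r survives
with open(path, 'r', newline='') as inp:
    untranslated = inp.read()

# Default text-mode input (newline=None) maps \r\n back to \n
with open(path, 'r') as inp:
    translated = inp.read()

print(raw)                        # b'abc\r\n'
print(repr(untranslated))         # 'abc\r\n'
print(repr(translated))           # 'abc\n'
os.remove(path)                   # Clean up the scratch file
```

Passing newline='' on output as well would leave \n untouched, which gives text-mode files the same line-end behavior as binary mode while still handling str and encodings.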

Now let’s do the same again, but with a binary file. We provide a bytes to write in this case, and we still get back a str or a bytes, depending on the input mode:

# Write and read a binary file
>>> open('temp', 'wb').write(b'abc\n') # Binary mode output, provide a bytes
4
>>> open('temp', 'r').read() # Text mode input, returns a str
'abc\n'
>>> open('temp', 'rb').read() # Binary mode input, returns a bytes
b'abc\n'

Note that the \n end-of-line character is not expanded to \r\n in binary-mode output—again, a desired result for binary data. Type requirements and file behavior are the same even if the data we’re writing to the binary file is truly binary in nature. In the following, for example, the "\x00" is a binary zero byte and not a printable character:

# Write and read truly binary data
>>> open('temp', 'wb').write(b'a\x00c') # Provide a bytes
3
>>> open('temp', 'r').read() # Receive a str
'a\x00c'
>>> open('temp', 'rb').read() # Receive a bytes
b'a\x00c'
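The type requirements described so far are enforced at write time. As a hedged sketch in Python 3.X, the following shows that passing the wrong string type to write raises a TypeError in either mode, and that a binary-mode file accepts a bytearray as well as a bytes. The scratch filename temp_types.bin and the variable names are assumptions made for illustration.

```python
import os
import tempfile

# Hypothetical scratch file in the system temp directory
path = os.path.join(tempfile.gettempdir(), 'temp_types.bin')

with open(path, 'w') as out:
    try:
        out.write(b'abc\n')       # bytes to a text-mode file: rejected
    except TypeError as exc:
        text_error = type(exc).__name__

with open(path, 'wb') as out:
    try:
        out.write('abc\n')        # str to a binary-mode file: rejected
    except TypeError as exc:
        binary_error = type(exc).__name__
    count = out.write(bytearray(b'a\x00c'))   # bytearray works like bytes

with open(path, 'rb') as inp:
    data = inp.read()             # Binary-mode input is always a bytes

print(text_error, binary_error)   # TypeError TypeError
print(count, data)                # 3 b'a\x00c'
os.remove(path)                   # Clean up the scratch file
```

Neither failed write adds anything to the file, so only the three bytes from the bytearray are read back.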

Binary-mode files always return contents as a bytes object, but accept either a bytes or bytearray object for writing; this naturally follows, given that bytearray is basically just a mutable variant of bytes. In fact, most APIs in Python 3.0
