Code_ The Hidden Language of Computer Hardware and Software - Charles Petzold [135]
Each 32-byte directory entry contains the following information:
Bytes    Meaning
0        Usually set to 0
1–8      Filename
9–11     File type
12       File extent
13–14    Reserved (set to 0)
15       Sectors in last block
16–31    Disk map
The first byte in the directory entry is used only when the file system can be shared by two or more people at the same time. Under CP/M, this byte is normally set to 0, as are bytes 13 and 14.
Under CP/M, each file is identified with a two-part name. The first part is known as the filename and can have up to eight characters stored in bytes 1 through 8 of the directory entry; the second part is known as the file type and can have up to three characters stored in bytes 9 through 11. There are several standard file types. For example, TXT indicates a text file (that is, a file containing only ASCII codes), and COM (which is short for command) indicates a file containing 8080 machine-code instructions—a program. When specifying a file, the two parts are separated by a period, like this:
MYLETTER.TXT
CALC.COM
This file-naming convention has come to be known as 8.3 (pronounced eight dot three), indicating the maximum eight letters before the period and the three letters after.
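The 8.3 convention can be sketched in Python as a pair of helpers that convert between a name such as CALC.COM and the 11 bytes held in bytes 1 through 11 of a directory entry. The function names are hypothetical, and two details are assumptions not stated above: that unused character positions are padded with spaces, and that names are stored in uppercase.

```python
def to_directory_name(name: str) -> bytes:
    """Return the 11 bytes for bytes 1-11 of a directory entry (a sketch)."""
    # Assumption: names are stored in uppercase.
    filename, _, file_type = name.upper().partition(".")
    if not 1 <= len(filename) <= 8 or len(file_type) > 3:
        raise ValueError(f"{name!r} does not fit the 8.3 convention")
    # Assumption: unused character positions are padded with spaces.
    return filename.ljust(8).encode("ascii") + file_type.ljust(3).encode("ascii")


def from_directory_name(field: bytes) -> str:
    """Rebuild the two-part name from the 11-byte field."""
    filename = field[:8].decode("ascii").rstrip()
    file_type = field[8:11].decode("ascii").rstrip()
    return f"{filename}.{file_type}" if file_type else filename


if __name__ == "__main__":
    print(to_directory_name("MYLETTER.TXT"))
    print(from_directory_name(to_directory_name("calc.com")))
```

The period itself is never stored under this scheme; it only separates the two parts when a person types or reads the name.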
The disk map of the directory entry indicates the allocation blocks in which the file is stored. Suppose the first four entries in the disk map are 14h, 15h, 07h, and 23h, and the rest are zeros. This means that the file occupies four allocation blocks, or 4 KB of space. The file might actually be a bit shorter. Byte 15 in the directory entry indicates how many 128-byte sectors are actually used in the last allocation block.
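The arithmetic in that example can be worked through in a short Python sketch. It assumes 1,024-byte allocation blocks (implied by "four allocation blocks, or 4 KB"), treats a zero entry in the disk map as an unused slot, and reads byte 15 exactly as described above. The function names are hypothetical.

```python
BLOCK_SIZE = 1024   # assumption: 1 KB allocation blocks
SECTOR_SIZE = 128   # 128-byte sectors, as stated in the text


def blocks_in_use(disk_map: bytes) -> list:
    """List the allocation blocks named in a 16-byte disk map."""
    # Assumption: a zero entry means the slot is unused.
    return [block for block in disk_map if block != 0]


def bytes_used(disk_map: bytes, sectors_in_last_block: int) -> int:
    """Estimate the file length: full blocks plus the sectors used in the last one."""
    blocks = blocks_in_use(disk_map)
    if not blocks:
        return 0
    return (len(blocks) - 1) * BLOCK_SIZE + sectors_in_last_block * SECTOR_SIZE


if __name__ == "__main__":
    disk_map = bytes([0x14, 0x15, 0x07, 0x23]) + bytes(12)
    print([f"{b:02X}h" for b in blocks_in_use(disk_map)])
    print(len(blocks_in_use(disk_map)) * BLOCK_SIZE)   # space occupied on disk
    print(bytes_used(disk_map, 3))                     # length if byte 15 were 3
```

With byte 15 set to 3 (a made-up value for illustration), the file occupies 4,096 bytes of disk space but holds only 3,456 bytes of data.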
The disk map is 16 bytes long, and each byte identifies one 1,024-byte allocation block, so a single directory entry accommodates a file of up to 16,384 bytes. A file longer than 16 KB must use multiple directory entries, which are called extents. In that case, byte 12 is set to 0 in the first directory entry, 1 in the second directory entry, and so forth.
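The 32-byte layout in the table can be unpacked with Python's struct module. This is a sketch, not CP/M's own code: the class and function names are hypothetical, the space padding of names is an assumption, and so is the choice to count one directory entry for an empty file.

```python
import struct
from dataclasses import dataclass

# Byte 0, bytes 1-8, bytes 9-11, byte 12, bytes 13-14, byte 15, bytes 16-31.
ENTRY_FORMAT = "<B8s3sB2sB16s"
ENTRY_SIZE = struct.calcsize(ENTRY_FORMAT)   # 32
BYTES_PER_EXTENT = 16 * 1024                 # 16 disk-map bytes, 1 KB per block


@dataclass
class DirectoryEntry:
    first_byte: int
    filename: str
    file_type: str
    extent: int
    sectors_in_last_block: int
    disk_map: bytes


def parse_entry(raw: bytes) -> DirectoryEntry:
    """Split one 32-byte directory entry into its fields."""
    if len(raw) != ENTRY_SIZE:
        raise ValueError("a directory entry is exactly 32 bytes")
    first, name, ftype, extent, _reserved, sectors, disk_map = struct.unpack(
        ENTRY_FORMAT, raw
    )
    # Assumption: names are padded with spaces, which are stripped here.
    return DirectoryEntry(
        first,
        name.decode("ascii").rstrip(),
        ftype.decode("ascii").rstrip(),
        extent,
        sectors,
        disk_map,
    )


def extents_needed(file_size: int) -> int:
    """Directory entries required for a file of file_size bytes."""
    # Assumption: even an empty file takes one directory entry.
    return max(1, -(-file_size // BYTES_PER_EXTENT))   # ceiling division
```

For example, extents_needed(40000) gives 3: two full extents of 16,384 bytes and a third for the remainder, with byte 12 set to 0, 1, and 2 in the three entries.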
I mentioned text files. Text files are also called ASCII files, or text-only files, or pure-ASCII files, or something along those lines. A text file contains ASCII codes (including carriage return and linefeed codes) that correspond to text readable by human beings. A file that isn't a text file is called a binary file. A CP/M COM file is a binary file because it contains 8080 machine code.
Suppose a file (a very small file) must contain three 16-bit numbers—for example, 5A48h, 78BFh, and F510h. A binary file with these three numbers is just 6 bytes long:
48 5A BF 78 10 F5
Of course, that's the Intel format for storing multibyte numbers. The least-significant byte comes first. A program written for Motorola processors might be more inclined to create the file this way:
5A 48 78 BF F5 10
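Both byte orders can be reproduced with Python's struct module, where "<" selects least-significant byte first and ">" selects most-significant byte first. The function names here are hypothetical; this is only a sketch of how a program might write the two variants.

```python
import struct

VALUES = [0x5A48, 0x78BF, 0xF510]


def pack_little_endian(values) -> bytes:
    """Least-significant byte first, the Intel convention."""
    return struct.pack(f"<{len(values)}H", *values)


def pack_big_endian(values) -> bytes:
    """Most-significant byte first, as a Motorola program might write it."""
    return struct.pack(f">{len(values)}H", *values)


if __name__ == "__main__":
    print(pack_little_endian(VALUES).hex(" ").upper())   # 48 5A BF 78 10 F5
    print(pack_big_endian(VALUES).hex(" ").upper())      # 5A 48 78 BF F5 10
```

A program reading the file back has to know which order was used, because nothing in the 6 bytes themselves records it.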
An ASCII text file storing these same three 16-bit values contains the bytes
35 41 34 38 68 0D 0A 37 38 42 46 68 0D 0A 46 35 31 30 68 0D 0A
These bytes are ASCII codes for numbers and letters, where each number is terminated by a carriage return (0Dh) and a linefeed (0Ah) character. The text file is more conveniently displayed not as a string of bytes that happen to be ASCII codes, but as the characters themselves:
5A48h
78BFh
F510h
An ASCII text file that stores these three numbers could also contain these bytes:
32 33 31 31 32 0D 0A 33 30 39 31 31 0D 0A 36 32 37 33 36 0D 0A
These bytes are the ASCII codes for the decimal equivalents of the three numbers:
23112
30911
62736
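Both text representations can be generated from the same three numbers with a few lines of Python. The function names are hypothetical; the sketch simply formats each value, appends a carriage return and linefeed, and encodes the result as ASCII.

```python
VALUES = [0x5A48, 0x78BF, 0xF510]


def as_hex_text(values) -> bytes:
    """Each value as four hex digits plus 'h', ended by carriage return and linefeed."""
    return "".join(f"{v:04X}h\r\n" for v in values).encode("ascii")


def as_decimal_text(values) -> bytes:
    """Each value in decimal, ended by carriage return and linefeed."""
    return "".join(f"{v}\r\n" for v in values).encode("ascii")


if __name__ == "__main__":
    print(as_hex_text(VALUES).hex(" ").upper())
    print(as_decimal_text(VALUES).hex(" ").upper())
    print(len(as_hex_text(VALUES)), len(as_decimal_text(VALUES)))
```

Either text file comes to 21 bytes, against 6 bytes for the binary file holding the same three numbers.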
Since the intent of using text files is to make the files easier for humans to read, there's really no reason not to use decimal rather than hexadecimal numbers.
As I mentioned, CP/M itself is stored on the first two tracks of a disk. To run, CP/M must be loaded from the