Online Book Reader


Classic Shell Scripting - Arnold Robbins [230]

wear out and fail. However, at the time of this writing, they remain considerably more expensive than alternatives, have lower capacity, and can be rewritten only a limited number of times.

* * *

[1] Some systems offer special fast filesystems that reside in central random-access memory (RAM), allowing temporary files to be shared between processes. With common RAM technologies, such filesystems require a constant electrical supply, and thus are generally created anew on system restart. However, some embedded computer systems use nonvolatile RAM to provide a long-term filesystem.

How Are Files Named?

Early computer operating systems did not name files: files were submitted by their owners for processing, and were handled one at a time by human computer operators. It soon became evident that something better was needed if file processing was to be automated: files need names that humans can use to classify and manage them, and that computers can use to identify them.

Once we can assign names to files, we soon discover the need to handle name collisions that arise when the same name is assigned to two or more different files. Modern filesystems solve this problem by grouping sets of uniquely named files into logical collections called directories, or folders. We look at these in Section B.4 later in this Appendix.

We name files using characters from the host operating system's character set. In the early days of computing, there was considerable variation in character sets, but the need to exchange data between unlike systems made it evident that standardization was desirable.

In 1963, the American Standards Association [2] proposed a 7-bit character set with the ponderous name American Standard Code for Information Interchange, thankfully known ever since by its initial letters, ASCII (pronounced ask-ee). Seven bits permit the representation of 2^7 = 128 different characters, which is sufficient to handle uppercase and lowercase letters of the Latin alphabet, decimal digits, and a couple of dozen special symbols and punctuation characters, including space, with 33 left over for use as control characters. The latter have no assigned printable graphic representation. Some of them serve for marking line and page breaks, but most have only specialized uses. ASCII is supported on virtually all computer systems today. For a view of the ASCII character set, issue the command man ascii.
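The arithmetic in the preceding paragraph can be checked with a few lines of shell. This is only a sketch: the variable names are made up for illustration, and it assumes the usual split of the control characters into codes 0 through 31 plus DEL (127).

```sh
#!/bin/sh
# Number of distinct values that 7 bits can represent: 2^7.
total=$((1 << 7))

# Control characters: codes 0-31 (32 of them) plus DEL at code 127.
control=$((32 + 1))

# Everything else has a printable representation (space is counted here).
printable=$((total - control))

echo "total=$total control=$control printable=$printable"
# prints: total=128 control=33 printable=95
```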

ASCII, however, is inadequate for representing text in most of the world's languages: its character repertoire is much too small. Since most computer systems now use 8-bit bytes as the smallest addressable unit of storage, and since that byte size permits 2^8 = 256 different characters, systems designers acted quickly to populate the upper half of that 256-element set, leaving ASCII in the lower half. Unfortunately, they weren't guided by international standards, so hundreds of different assignments of various characters have been put into use; they are sometimes known as code pages. Even a single set of 128 additional character slots does not suffice for all the languages of Europe, so the International Organization for Standardization (ISO) has developed a family of code pages known as ISO 8859-1,[3] ISO 8859-2, ISO 8859-3, and so on.

In the 1990s, collaborative efforts were begun to develop the ultimate single universal character set, known as Unicode.[4] This will eventually require about 21 bits per character, but current implementations in several operating systems use only 16 bits. Unix systems use a variable-byte-width encoding called UTF-8 [5] that permits existing ASCII files to be valid Unicode files.
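To make the variable-width idea concrete, the following sketch counts the bytes that UTF-8 uses for three sample characters. The helper name byte_len and the sample characters are chosen here for illustration and are not from the text; the bytes are written as octal escapes so that the script itself stays pure ASCII.

```sh
#!/bin/sh
# byte_len: print the number of bytes in its argument (hypothetical helper).
# wc -c counts bytes, not characters, regardless of the locale.
byte_len() {
    printf '%s' "$1" | wc -c | tr -d '[:space:]'
}

ascii_a=$(printf 'A')            # U+0041, an ASCII letter: one byte, 0x41
e_acute=$(printf '\303\251')     # U+00E9 in UTF-8: two bytes, 0xC3 0xA9
euro=$(printf '\342\202\254')    # U+20AC in UTF-8: three bytes, 0xE2 0x82 0xAC

echo "A:       $(byte_len "$ascii_a") byte(s)"
echo "e-acute: $(byte_len "$e_acute") byte(s)"
echo "euro:    $(byte_len "$euro") byte(s)"
```

The ASCII letter occupies a single byte with the same value it has in plain ASCII, which is why an existing ASCII file needs no conversion to be read as UTF-8.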

The point of this digression into character sets is this: with the sole exception of the IBM mainframe EBCDIC [6] character set, all current ones include the ASCII characters in the lower 128 slots. Thus, by voluntarily restricting filenames to the ASCII subset, we can make it much more likely that the names are usable everywhere. The existence of the Internet and the World Wide Web gives ample evidence that files are
