Classic Shell Scripting - Arnold Robbins [246]
List verbose file information for first four matching files
2220 -r-xr-xr-t 1 sys 2270300 Nov 4 1999 /lib/libc.so.1
60 -r--r--r-- 1 sys 59348 Nov 4 1999 /lib/libcpr.so
108 -r--r--r-- 1 sys 107676 Nov 4 1999 /lib/libdisk.so
28 -r--r--r-- 1 sys 27832 Nov 4 1999 /lib/libmalloc.so
Block sizes are operating-system- and filesystem-dependent: to find the block size, divide the file size in bytes by the size in blocks, and then round up to a power of two. On the system from that last example, we find 2270300/2220 ≈ 1022.7, so the block size is 2^10 = 1024 bytes. Storage devices are getting increasingly intelligent, so the block size that you figure out in this way may differ from what is present on the device. Vendor and GNU versions of ls on some systems also disagree, so block sizes obtained in this way are probably not reliable, except for comparisons on the same system with the same ls command.
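The arithmetic just described can be sketched as a small shell function. This is only an illustration: block_size is a hypothetical helper name, not a standard command, and it assumes that both arguments are positive integers taken from ls -ls style output.

```shell
# block_size BYTES BLOCKS
# Estimate the block size: divide the size in bytes by the size in blocks,
# then round the quotient up to the next power of two.
# (block_size is a hypothetical helper, not a standard utility.)
block_size() {
    bytes=$1
    blocks=$2
    if [ "$blocks" -le 0 ]; then
        echo "block_size: block count must be positive" >&2
        return 1
    fi
    # Integer ceiling of bytes / blocks
    quotient=$(( (bytes + blocks - 1) / blocks ))
    size=1
    while [ "$size" -lt "$quotient" ]; do
        size=$(( size * 2 ))
    done
    echo "$size"
}

block_size 2270300 2220    # prints 1024
```

The other three files in the sample listing give the same answer, which is some reassurance that the estimate is consistent on that system.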
* * *
Tip
Occasionally, you may encounter files for which the block count seems too small: such a file probably contains holes, caused by using direct access to write bytes at specified positions. Database programs often do this, since they store sparse tables in the filesystem. The inode structure in the filesystem handles files with holes properly, but programs that simply read such a file sequentially see zero bytes from the (imaginary) disk blocks corresponding to the holes.
* * *
* * *
Note
Copying such a file fills the holes with physical zeroed disk blocks, possibly increasing the size substantially. While this is transparent to the software that created the original file, it is a filesystem feature that well-written backup utilities need to deal with. GNU tar offers the --sparse option to request checking for such files, but most other tar implementations do not. GNU cp has the --sparse option to control the handling of files with holes.
Use of the administrative dump/restore tools may be the only way on some systems to avoid filling in the holes while copying a file tree: these utilities tend to be highly system-dependent, so we ignore them in this book.
* * *
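One way to see a file with a hole is to create one deliberately. The following is a minimal sketch, assuming a filesystem that supports holes; make_sparse_file and the temporary filename are hypothetical names chosen for illustration. On a filesystem without such support, the block count simply reflects the full size.

```shell
# make_sparse_file FILE
# Seek past 1048575 bytes that are never written, then write a single
# zero byte, leaving a hole in front of it.
# (make_sparse_file is a hypothetical helper, not a standard utility.)
make_sparse_file() {
    dd if=/dev/zero of="$1" bs=1 count=1 seek=1048575 2>/dev/null
}

demo=${TMPDIR:-/tmp}/sparse-demo.$$
make_sparse_file "$demo"

ls -ls "$demo"     # compare the leading block count with the size in bytes
wc -c < "$demo"    # a sequential read still returns all 1048576 bytes

# With GNU cp only, a copy that tries to keep the holes could be requested with:
#   cp --sparse=always "$demo" "$demo.copy"

rm -f "$demo"
```

If the block count reported by ls -ls is far smaller than the byte size would suggest, the filesystem has stored the file with a hole.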
You might have spotted another difference between the last two sample outputs: the timestamp is displayed differently. To reduce line width, ls normally displays that value as Mmm dd hh:mm for a timestamp within the last six months, and otherwise, as Mmm dd yyyy for an older one. Some people find this a nuisance, and now that windowing systems have removed the 80-character line-width limit of old-style ASCII terminals,[25] there is little need for that economization. Most humans, however, find long lines hard to read, and recent GNU ls versions try harder to keep the output lines short.
Depending on the locale, GNU ls may produce something close to the yyyy-mm-dd hh:mm:ss format defined in ISO 8601:2000: Data elements and interchange formats—Information interchange—Representation of dates and times, but without the seconds field, as shown in earlier sample outputs.
The GNU ls option --full-time can be used to expose the complete timestamp recorded in the filesystem, as shown in Chapter 10.
Other File Metadata
There are a few remaining file properties recorded in inode entries that we have not yet mentioned. However, the only one visible in the output of ls -l is the file type, recorded as the first character of the line, immediately before the permissions. This is - (hyphen) for an ordinary file, d for a directory, and l for a symbolic link.
Those three characters are about the only ones that you'll see in ordinary directories. However, in /dev, you'll encounter at least two more: b for block device, and c for character device. Neither of them is relevant for anything in this book.
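A script can pick out the file type by taking the first character of the ls -l output. The sketch below is one possible way to do that; file_type is a hypothetical helper name, and any type character other than the five described so far is simply reported as "other".

```shell
# file_type PATH
# Report the file type from the first character of the ls -ld output.
# (file_type is a hypothetical helper, not a standard utility.)
file_type() {
    # -d describes a directory itself rather than listing its contents;
    # -- protects against names that begin with a hyphen.
    case $(ls -ld -- "$1" | cut -c1) in
        -) echo "ordinary file" ;;
        d) echo "directory" ;;
        l) echo "symbolic link" ;;
        b) echo "block device" ;;
        c) echo "character device" ;;
        *) echo "other" ;;
    esac
}

file_type /
file_type /dev/null
```

Because ls -l does not follow a symbolic link named on the command line, the function reports the link itself rather than the file it points to.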
Two other rarely seen file types are p for a named pipe, and s for socket (a special network connection). Sockets are an advanced topic that this book does not cover. Named pipes, however, are occasionally useful in programs and shell scripts: they allow for client-server communication via the filesystem namespace, and they provide