Classic Shell Scripting - Arnold Robbins [159]
The output from the substituted command can sometimes be lengthy, with the result that a nasty kernel limit on the combined length of a command line and its environment variables is exceeded. When that happens, you'll see this instead:
$ grep POSIX_OPEN_MAX /dev/null $(find /usr/include -type f | sort)
/usr/local/bin/grep: Argument list too long.
That limit can be found with getconf:
$ getconf ARG_MAX                # Get system configuration value of ARG_MAX
131072
On the systems that we tested, the reported values ranged from a low of 24,576 (IBM AIX) to a high of 1,048,320 (Sun Solaris).
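As a rough sketch of how a script might check ahead of time whether a file list risks exceeding the limit, the following compares the byte count of find's output against ARG_MAX. The variable names and the choice of /usr/include are illustrative assumptions, and the estimate ignores the space taken by the environment and any per-argument overhead, so treat the result as a hint rather than a guarantee:

```shell
# Directory to scan; /usr/include simply mirrors the example above.
dir=/usr/include

# Limit on the combined size of arguments and environment, in bytes.
limit=$(getconf ARG_MAX)

# Bytes in the newline-separated file list (0 if the directory is missing).
needed=$(find "$dir" -type f 2>/dev/null | wc -c | tr -d ' ')

if [ "$needed" -ge "$limit" ]; then
    echo "file list ($needed bytes) exceeds ARG_MAX ($limit bytes)"
else
    echo "file list ($needed bytes) fits within ARG_MAX ($limit bytes)"
fi
```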
The solution to the ARG_MAX problem is provided by xargs: it takes a list of arguments on standard input (normally one per line, although by default any unquoted whitespace separates arguments) and feeds them in suitably sized groups (determined by the host's value of ARG_MAX) to another command given as arguments to xargs. Here is an example that eliminates the obnoxious Argument list too long error:
$ find /usr/include -type f | xargs grep POSIX_OPEN_MAX /dev/null
/usr/include/bits/posix1_lim.h:#define _POSIX_OPEN_MAX 16
/usr/include/bits/posix1_lim.h:#define _POSIX_FD_SETSIZE _POSIX_OPEN_MAX
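Here, the /dev/null argument ensures that grep always sees at least two file arguments, causing it to print the filename at the start of each reported match. If xargs gets no input filenames, many implementations terminate silently without even invoking the argument program; GNU xargs, however, runs the program once with no added arguments unless it is given the -r (--no-run-if-empty) option.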
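The effect of the extra /dev/null argument can be seen in a small experiment. This is only a sketch: the scratch directory and the file name limits.h are made up for illustration.

```shell
# Create a scratch file containing one matching line (hypothetical name).
dir=$(mktemp -d)
printf '#define _POSIX_OPEN_MAX 16\n' > "$dir/limits.h"

# With a single file argument, grep prints only the matching line.
one=$(grep POSIX_OPEN_MAX "$dir/limits.h")

# With /dev/null as a second file argument, grep prefixes the filename.
two=$(grep POSIX_OPEN_MAX /dev/null "$dir/limits.h")

printf 'one file:  %s\n' "$one"
printf 'two files: %s\n' "$two"

rm -rf "$dir"
```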
GNU xargs has the --null option to handle the NUL-terminated filename lists produced by GNU find's -print0 option. xargs passes each such filename as a complete argument to the command that it runs, without danger of shell (mis)interpretation or newline confusion; it is then up to that command to handle its arguments sensibly.
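A minimal sketch of such a NUL-terminated pipeline follows, assuming a find that supports -print0 and an xargs that accepts -0 (the short form of --null). The scratch directory and file names are invented for the demonstration.

```shell
# Build a scratch directory with an awkward filename (hypothetical names).
dir=$(mktemp -d)
printf '#define _POSIX_OPEN_MAX 16\n' > "$dir/plain.h"
printf '#define _POSIX_OPEN_MAX 16\n' > "$dir/name with spaces.h"

# -print0 ends each filename with NUL; -0 makes xargs split only on NUL,
# so embedded blanks or newlines cannot break a filename apart.
matches=$(find "$dir" -type f -print0 |
          xargs -0 grep -l POSIX_OPEN_MAX /dev/null |
          wc -l | tr -d ' ')
echo "files matched: $matches"

rm -rf "$dir"
```

Without -print0 and -0, the second filename would be split at its spaces into three nonexistent names.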
xargs has options to control where the arguments are substituted, and to limit the number of arguments passed to one invocation of the argument command. The GNU version can even run multiple argument processes in parallel. However, the simple form shown here suffices most of the time. Consult the xargs(1) manual pages for further details, and for examples of some of the wizardry possible with its fancier features.
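To make two of those options concrete, here is a brief sketch using -n (limit the number of arguments per invocation) and -I (choose where each argument is substituted). The input words and the variable names are purely illustrative.

```shell
# -n 2: pass at most two arguments to each invocation of echo.
pairs=$(printf '%s\n' a b c d e | xargs -n 2 echo)
printf '%s\n' "$pairs"

# -I {}: run the command once per input line, replacing {} with that line.
labeled=$(printf '%s\n' one two | xargs -I {} echo "file: {}")
printf '%s\n' "$labeled"
```

The first command runs echo three times (a b, c d, and e); the second runs it once for each input line.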
Filesystem Space Information
With suitable options, the find and ls commands report file sizes, so with the help of a short awk program, you can report how many bytes your files occupy:
$ find -ls | awk '{Sum += $7} END {printf("Total: %.0f bytes\n", Sum)}'
Total: 23079017 bytes
However, that report underestimates the space used, because files are allocated in fixed-size blocks, and it tells us nothing about the used and available space in the entire filesystem. Two other useful tools provide better solutions: df and du.
The df Command
df (disk free) gives a one-line summary of used and available space on each mounted filesystem. The units are system-dependent blocks on some systems, and kilobytes on others. Most modern implementations support the -k option to force kilobyte units, and the -l (lowercase L) option to include only local filesystems, excluding network-mounted ones. Here is a typical example from one of our web servers:
$ df -k
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda5              5036284   2135488   2644964  45% /
/dev/sda2                38890      8088     28794  22% /boot
/dev/sda3             10080520   6457072   3111380  68% /export
none                    513964         0    513964   0% /dev/shm
/dev/sda8               101089      4421     91449   5% /tmp
/dev/sda9             13432904    269600  12480948   3% /var
/dev/sda6              4032092   1683824   2143444  44% /ww
GNU df provides the -h (human-readable) option to produce a more compact, but possibly more confusing, report:
$ df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda5             4.9G  2.1G  2.6G  45% /
/dev/sda2              38M  7.9M   29M  22% /boot
/dev/sda3             9.7G  6.2G  3.0G  68% /export
none                  502M     0  502M   0% /dev/shm
/dev/sda8              99M  4.4M   90M   5% /tmp
/dev/sda9              13G  264M   12G   3% /var
/dev/sda6             3.9G  1.7G  2.1G  44% /ww
The output line order may be arbitrary, but the presence of the one-line header makes it harder to apply sort while preserving that header. Fortunately, on most systems, the output is only a few lines long.
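When sorting is wanted anyway, one possible approach is to peel off the header line before handing the rest to sort. The sketch below does that with a small function, sort_df, which is a hypothetical name and not a standard tool. It assumes the usage percentage is in the fifth column, as in the reports above, and that no entry is wrapped across two lines.

```shell
# Print the header line unchanged, then the remaining lines sorted by
# the fifth field (Use%) in descending numeric order.
sort_df() {
    IFS= read -r header || return 0   # consume only the first line
    printf '%s\n' "$header"           # emit the header untouched
    sort -k 5,5nr                     # fullest filesystems first
}

df -k | sort_df
```

Because read consumes exactly one line from the pipe, sort sees only the data lines, and the header stays at the top.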