Classic Shell Scripting - Arnold Robbins [137]
Here is how you can use a command pipeline in a loop:
command = "head -n 15 /etc/hosts"
while ((command | getline s) > 0)
    print s
close(command)
We used a variable to hold the pipeline to avoid repetition of a possibly complicated string, and to ensure that all uses of the command match exactly. In command strings, every character is significant, and even an inadvertent difference of a single space would refer to a different command.
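The point about exact matching can be checked with a small experiment. The following is only a sketch: the function name close_must_match and the sample command are made up for illustration, and it assumes an awk whose close( ) quietly returns an error for a string that names no open pipe.

```sh
# close_must_match CMD: show that close() only affects the pipe whose
# command string matches exactly. (Hypothetical helper for illustration.)
close_must_match() {
    awk -v command="$1" 'BEGIN {
        command | getline first     # start the command, read line 1
        close(command " ")          # trailing space: a different string, nothing is closed
        command | getline second    # same pipe is still open, read line 2
        close(command)              # exact match: the pipe is closed
        command | getline third     # the command starts afresh, read line 1 again
        close(command)
        print first, second, third
    }'
}

close_must_match "echo one; echo two"
```

With the sample command, the second read continues the original pipe and the third read restarts it, so the line printed is "one two one".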
Output Redirection
The print and printf statements (see Section 9.9.8) normally send their output to standard output. However, the output can be sent to a file instead:
print "Hello, world" > file
printf("The tenth power of %d is %d\n", 2, 2^10) > "/dev/tty"
To append to an existing file (or create a new one if it does not yet exist), use >> output redirection:
print "Hello, world" >> file
You can use output redirection to the same file on any number of output statements. When you are finished writing output, use close( file ) to close the file and free its resources.
Avoid mixing > and >> for the same file without an intervening close( ). In awk, these operators tell how the output file should be opened. Once open, the file remains open until it is explicitly closed, or until the program terminates. Contrast that behavior with the shell, where redirection requires the file to be opened and closed at each command.
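A short experiment makes the contrast with the shell concrete. This is a sketch under stated assumptions: the function name truncate_once is invented here, and mktemp is used for the scratch file although it is not required by POSIX.

```sh
# truncate_once: show that > in awk truncates the file only when it is
# first opened, not on every print. (Hypothetical helper for illustration.)
truncate_once() {
    out=$(mktemp) || return 1
    echo "old contents" > "$out"
    awk -v file="$out" 'BEGIN {
        print "first"  > file    # first use of > opens the file and truncates it
        print "second" > file    # the file is already open, so this line follows "first"
        close(file)
    }'
    cat "$out"
    rm -f "$out"
}

truncate_once
```

The file ends up holding both lines, "first" then "second", and the old contents are gone. Two shell commands that each used > would have left only "second".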
Alternatively, you can send output to a pipeline:
for (name in telephone)
    print name "\t" telephone[name] | "sort"
close("sort")
As with input from a pipeline, close an output pipeline as soon as you are through with it. This is particularly important if you need to read the output in the same program. For example, you can direct the output to a temporary file, and then read it after it is complete:
tmpfile = "/tmp/telephone.tmp"
command = "sort > " tmpfile
for (name in telephone)
    print name "\t" telephone[name] | command
close(command)
while ((getline < tmpfile) > 0)
    print
close(tmpfile)
Pipelines in awk put the entire Unix toolbox at our disposal, eliminating the need for much of the library support offered in other programming languages, and helping to keep the language small. For example, awk does not provide a built-in function for sorting because it would just duplicate functionality already available in the powerful sort command described in Section 4.1.
Recent awk implementations, but not POSIX, provide a function to flush buffered data to the output stream: fflush( file ). Notice the doubled initial ff (for file flush). It returns 0 on success and -1 on failure. The behavior of calls to fflush( ) (omitted argument) and fflush("") (empty string argument) is implementation-dependent: avoid such uses in portable programs.
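As a rough illustration of fflush( file ) with an explicit argument, the sketch below writes a line, flushes it, and then lets a separate process read the file while awk still has it open. The function name flush_then_read is invented for this example, and it assumes an awk that provides fflush( ), a mktemp command, and a temporary-file path without spaces.

```sh
# flush_then_read: write to a file, flush it, and read it back through
# an external command before closing it. (Hypothetical helper.)
flush_then_read() {
    out=$(mktemp) || return 1
    awk -v file="$out" 'BEGIN {
        reader = "cat " file         # one variable, so every use matches exactly
        print "buffered line" > file
        fflush(file)                 # push the buffered data out without closing
        reader | getline seen        # another process can now see the data
        close(reader)
        close(file)
        print seen
    }'
    rm -f "$out"
}

flush_then_read
```

Without the fflush( ) call, the line could still be sitting in awk's output buffer when cat runs, and seen might come back empty.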
Running External Programs
We showed earlier how the getline statement and output redirection in awk pipelines can communicate with external programs. The system( command ) function provides a third way: its return value is the exit status code of the command. It first flushes any buffered output, then starts an instance of /bin/sh, and sends it the command. The shell's standard error and standard output are the same as those of the awk program, so unless the command's I/O is redirected, output from both the awk program and the shell command appears in the expected order.
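Both properties, the returned exit status and the preserved output order, can be seen in a small sketch. The function name ordered_output and the sample command are assumptions made for this illustration.

```sh
# ordered_output: show that system() flushes pending output first and
# returns the exit status of the command. (Hypothetical helper.)
ordered_output() {
    awk 'BEGIN {
        print "before"                            # may still be buffered here
        status = system("echo during; exit 3")    # flushes, then runs the shell command
        print "after, status " status
    }'
}

ordered_output
```

Even when standard output is a pipe or a file, so that awk buffers it, the three lines appear as "before", "during", and "after, status 3".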
Here is a shorter solution to the telephone-directory sorting problem, using a temporary file and system( ) instead of an awk pipeline:
tmpfile = "/tmp/telephone.tmp"
for (name in telephone)
    print name "\t" telephone[name] > tmpfile
close(tmpfile)
system("sort < " tmpfile)
The temporary file must be closed before the call to system( ) to ensure that any buffered output is properly recorded in the file.
There is no need to call close( ) for commands run by system( ), because close( ) is only for files or pipes opened with the I/O redirection operators and getline, print, or printf.
The system( ) function provides an easy way to remove the script's temporary file:
system("rm -f " tmpfile)
The command passed to system(