Classic Shell Scripting - Arnold Robbins [44]
$ awk -F: -v 'OFS=**' '{ print $1, $5 }' /etc/passwd    Process /etc/passwd
root**root                                              Administrative accounts
...
tolstoy**Leo Tolstoy                                    Real users
austen**Jane Austen
camus**Albert Camus
...
We will see shortly that there are other ways to set these variables. They may be more legible, depending on your taste.
Printing lines
As we've shown so far, most of the time you just want to print selected fields, or arrange them in a different order. Simple printing is done with the print statement. You supply it a list of fields, variables, or strings to print:
$ awk -F: '{ print "User", $1, "is really", $5 }' /etc/passwd
User root is really root
...
User tolstoy is really Leo Tolstoy
User austen is really Jane Austen
User camus is really Albert Camus
...
A plain print statement, without any arguments, is equivalent to print $0, which prints the whole record.
For cases like the example just shown, when you want to mix text and values, it is usually clearer to use awk's version of the printf statement. It is similar enough to the shell (and C) version of printf described in Section 2.5.4 that we won't go into the details again. Here is the previous example, using printf:
$ awk -F: '{ printf "User %s is really %s\n", $1, $5 }' /etc/passwd
User root is really root
...
User tolstoy is really Leo Tolstoy
User austen is really Jane Austen
User camus is really Albert Camus
...
As with the shell-level echo and printf, awk's print statement automatically supplies a final newline, whereas with the printf statement you must supply it yourself, using the \n escape sequence.
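The newline difference is easy to see on a one-line input. The following is a minimal sketch, not taken from the original text; the input string and the variable names with_print and with_printf are made up for illustration:

```shell
# print appends a newline after each call, so the two fields
# land on separate lines.
with_print=$(echo 'a b' | awk '{ print $1; print $2 }')

# printf emits only what the format string asks for; with no \n
# in the format, the two fields run together on one line.
with_printf=$(echo 'a b' | awk '{ printf "%s", $1; printf "%s", $2 }')

printf 'print:  %s\n' "$with_print"
printf 'printf: %s\n' "$with_printf"
```

Adding \n to the printf format string restores the line break.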
* * *
Tip
Be sure to separate arguments to print with a comma! Without the comma, awk concatenates adjacent values:
$ awk -F: '{ print "User" $1 "is really" $5 }' /etc/passwd
Userrootis reallyroot
...
Usertolstoyis reallyLeo Tolstoy
Useraustenis reallyJane Austen
Usercamusis reallyAlbert Camus
...
String concatenation of this form is unlikely to be what you want. Omitting the comma is a common, and hard-to-find, mistake.
* * *
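The effect of the comma can be checked side by side by setting OFS to something visible. This is a small sketch under stated assumptions: the sample record is a made-up line in /etc/passwd format (the numeric IDs are invented), and the variable names are illustrative only:

```shell
# Hypothetical record in /etc/passwd format; not read from a real system.
line='tolstoy:x:2076:10:Leo Tolstoy:/home/tolstoy:/bin/sh'

# With a comma, print inserts the output field separator between values.
with_comma=$(echo "$line" | awk -F: -v 'OFS=**' '{ print $1, $5 }')

# Without the comma, the two values are concatenated and OFS is not used.
without_comma=$(echo "$line" | awk -F: -v 'OFS=**' '{ print $1 $5 }')

echo "$with_comma"
echo "$without_comma"
```

Only the first command shows the ** separator; the second joins the login name directly to the full name.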
Startup and cleanup actions
Two special "patterns," BEGIN and END, let you provide startup and cleanup actions for your awk programs. It is more common to use them in larger awk programs, usually written in separate files instead of on the command line:
BEGIN { startup code }
pattern1 { action1 }
pattern2 { action2 }
END { cleanup code }
BEGIN and END blocks are optional. If you have them, it is conventional, but not required, to place them at the beginning and end, respectively, of the awk program. You can also have multiple BEGIN and END blocks; awk executes them in the order they're encountered in the program: all the BEGIN blocks once at the beginning, and all the END blocks once at the end. For simple programs, BEGIN is used for setting variables:
$ awk 'BEGIN { FS = ":" ; OFS = "**" }      Use BEGIN to set variables
> { print $1, $5 }' /etc/passwd             Quoted program continues on second line
root**root
...
tolstoy**Leo Tolstoy                        Output, as before
austen**Jane Austen
camus**Albert Camus
...
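The execution order of multiple BEGIN and END blocks can be observed with a small sketch like the following. It assumes a POSIX-compliant awk (see the warning below about older versions); the two-line input and the variable name order are invented for illustration:

```shell
# Two BEGIN blocks and two END blocks around one pattern-less action.
# The BEGIN blocks run in program order before any input is read;
# the END blocks run in program order after the last record.
order=$(printf 'one\ntwo\n' | awk '
BEGIN { print "begin 1" }
BEGIN { print "begin 2" }
      { print "record", $0 }
END   { print "end 1" }
END   { print "end 2" }')

echo "$order"
```

Both BEGIN lines appear first, then one "record" line per input line, then both END lines.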
* * *
Warning
The POSIX standard describes the awk language and the options for the awk program. POSIX awk is based on so-called "new awk," first released to the world with System V Release 3.1 in 1987, and modified somewhat for System V Release 4 in 1989.
Alas, as late as 2005, the Solaris /bin/awk is still the original V7 version of awk, from 1979! On Solaris systems, you should use /usr/xpg4/bin/awk, or install one of the free versions of awk mentioned in Chapter 9.
* * *
* * *
[8] This can be worked around with expand and unexpand: see the manual pages for expand(1).
Summary
The grep program is the primary tool for extracting interesting lines of text from input