Classic Shell Scripting - Arnold Robbins [127]
The ** and **= operators are not part of POSIX awk and are not recognized by mawk. They should therefore be avoided in new code: use ^ and ^= instead.
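As a minimal sketch of the portable spelling, the following one-liner uses only ^ and ^=. The variable names x and power are ours, chosen for illustration; the command should behave the same way in any POSIX-conforming awk.

```shell
# Portable exponentiation: ^ and ^= are the POSIX spellings of ** and **=.
power=$(awk 'BEGIN { x = 2; x ^= 10; print x, 3 ^ 2 }')
echo "$power"    # prints: 1024 9
```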
* * *
Tip
Be sure to note the difference between assignment with =, and equality test with ==. Because assignments are valid expressions, the expression (r = s) ? t : u is syntactically correct, but is probably not what you intended. It assigns s to r, and then if that value is nonzero, it returns t, and otherwise returns u. This warning also applies to C, C++, Java, and other languages with = and == operators.
* * *
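The difference is easy to see in a small experiment. In this sketch, the values 1 and 2 and the strings "t" and "u" are arbitrary choices of ours: the first command assigns inside the conditional and so changes r, while the second only compares.

```shell
# Assignment inside the test: r becomes 2, which is nonzero, so "t" is chosen.
assign=$(awk 'BEGIN { r = 1; s = 2; result = (r = s) ? "t" : "u"; print result, r }')

# Equality test: 1 == 2 is false, so "u" is chosen and r is left unchanged.
compare=$(awk 'BEGIN { r = 1; s = 2; result = (r == s) ? "t" : "u"; print result, r }')

echo "$assign"     # prints: t 2
echo "$compare"    # prints: u 1
```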
The built-in function int( ) returns the integer part of its argument: int(-3.14159) evaluates to -3.
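Note that int( ) truncates toward zero rather than rounding. The short sketch below shows this for a negative and a positive argument; the last expression is a common rounding idiom for nonnegative values, included here as our own illustration.

```shell
# int() discards the fractional part; adding 0.5 first rounds nonnegative values.
truncated=$(awk 'BEGIN { print int(-3.14159), int(3.99), int(2.5 + 0.5) }')
echo "$truncated"    # prints: -3 3 3
```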
awk provides some of the common elementary mathematical functions that may be familiar to you from calculators and from other programming languages: sqrt( ), sin( ), cos( ), log( ), exp( ), and so on. They are summarized in Section 9.10.
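A quick way to try these functions is from a BEGIN action, which runs without reading any input. The arguments below are ours, picked so that the results are exact.

```shell
# Each function is called with an argument whose result is an exact integer.
values=$(awk 'BEGIN { print sqrt(16), exp(0), log(1), cos(0), sin(0) }')
echo "$values"    # prints: 4 1 0 1 0
```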
Scalar Variables
Variables that hold a single value are called scalar variables. In awk, as in most scripting languages, variables are not explicitly declared. Instead, they are created automatically at their first use in the program, usually by assignment of a value, which can be either a number or a string. When a variable is used, the context makes it clear whether a number or a string is expected, and the value is automatically converted from one to the other as needed.
All awk variables are created with an initial empty string value that is treated as zero when a numeric value is required.
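The following sketch illustrates both points: a variable that has never been assigned acts as an empty string or as zero, depending on context, and values are converted between strings and numbers as the operators require. The names unset_str, unset_num, n, and s are hypothetical, chosen only for this example.

```shell
conv=$(awk 'BEGIN {
    # Never assigned: empty in string context, zero in numeric context.
    print "[" unset_str "]", unset_num + 0
    n = "3" + 4     # addition converts the string "3" to the number 3
    s = 3 4         # concatenation converts both numbers to strings
    print n, s
}')
echo "$conv"
# prints:
# [] 0
# 7 34
```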
awk variable names begin with an ASCII letter or underscore, and optionally continue with letters, underscores, and digits. Thus, variable names match the regular expression [A-Za-z_][A-Za-z_0-9]*. There is no practical limit on the length of a variable name.
awk variable names are case-sensitive: foo, Foo, and FOO are distinct names. A common, and recommended, convention is to name local variables in lowercase, global variables with an initial uppercase letter, and built-in variables in uppercase.
awk provides several built-in variables, all spelled in uppercase. The important ones that we often need for simple programs are shown in Table 9-2.
Table 9-2. Commonly used built-in scalar variables in awk
Variable    Description
FILENAME    Name of the current input file
FNR         Record number in the current input file
FS          Field separator (regular expression) (default: " ")
NF          Number of fields in current record
NR          Record number in the job
OFS         Output field separator (default: " ")
ORS         Output record separator (default: "\n")
RS          Input record separator (regular expression in gawk and mawk only) (default: "\n")
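To see several of these variables at work, here is a small sketch that reads two colon-separated records from standard input. The sample data and the " -> " output separator are assumptions of ours, made up for the demonstration.

```shell
# FS splits each record at colons; OFS is placed between the items that print separates with commas.
report=$(printf 'alice:555-0134\nbob:555-0135\n' |
    awk 'BEGIN { FS = ":"; OFS = " -> " }
         { print NR, NF, $1, $2 }')
echo "$report"
# prints:
# 1 -> 2 -> alice -> 555-0134
# 2 -> 2 -> bob -> 555-0135
```

Setting FS and OFS in a BEGIN action ensures that they take effect before the first record is read.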
Array Variables
Array variables in awk follow the same naming conventions as scalar variables, but contain zero or more data items, selected by an array index following the name.
Most programming languages require arrays to be indexed by simple integer expressions, but awk allows array indices to be arbitrary numeric or string expressions, enclosed in square brackets after the array name. If you have not encountered such arrays before, they may seem rather curious, but awk code like this fragment of an office-directory program makes their utility obvious:
telephone["Alice"] = "555-0134"
telephone["Bob"] = "555-0135"
telephone["Carol"] = "555-0136"
telephone["Don"] = "555-0141"
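A directory like this one can be queried with any string expression as the index. The sketch below is ours: it repeats two of the entries, looks one up through a variable, and uses the standard in operator to test for a name that was never stored ("Zed" is a made-up name).

```shell
lookup=$(awk 'BEGIN {
    telephone["Alice"] = "555-0134"
    telephone["Bob"]   = "555-0135"
    name = "Bob"
    print name, telephone[name]        # index by the value of a variable
    if (!("Zed" in telephone))         # membership test; creates no element
        print "Zed is not listed"
}')
echo "$lookup"
# prints:
# Bob 555-0135
# Zed is not listed
```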
Arrays with arbitrary indices are called associative arrays because they