Online Book Reader

Home Category

Classic Shell Scripting - Arnold Robbins [139]

By Root 1014 0
for this.

As in most programming languages, awk functions can call themselves: this is known as recursion. Obviously, the programmer must make some provision for eventual termination: this is usually done by making the job smaller for each successive invocation so that at some point, no further recursion is needed. Example 9-3 shows a famous example from elementary number theory that uses a method credited to the Greek mathematician Euclid (ca. 300 BCE), but probably known at least 200 years earlier, to find the greatest common denominator of two integers.

Example 9-3. Euclid's greatest common denominator algorithm

function gcd(x, y, r)

{

# return the greatest common denominator of integer x, y

x = int(x)

y = int(y)

# print x, y

r = x % y

return (r = = 0) ? y : gcd(y, r)

}

If we add this action

{ g = gcd($1, $2); print "gcd(" $1 ", " $2 ") =", g }

to the code in Example 9-3 and then we uncomment the print statement and run it from a file, we can see how the recursion works:

$ echo 25770 30972 | awk -f gcd.awk

25770 30972

30972 25770

25770 5202

5202 4962

4962 240

240 162

162 78

78 6

gcd(25770, 30972) = 6

Euclid's algorithm always takes relatively few steps, so there is no danger of overflowing the call stack inside awk that keeps track of the nested function-call history. However, that is not always the case. There is a particularly nasty function discovered by the German mathematician Wilhelm Ackermann[3] in 1926 whose value, and recursion depth, grow much faster than exponentially. It can be defined in awk with the code in Example 9-4.

Example 9-4. Ackermann's worse-than-exponential function

function ack(a, b)

{

N++ # count recursion depth

if (a = = 0)

return (b + 1)

else if (b = = 0)

return (ack(a - 1, 1))

else

return (ack(a - 1, ack(a, b - 1)))

}

If we augment it with a test action:

{ N = 0; print "ack(" $1 ", " $2 ") = ", ack($1, $2), "[" N " calls]" }

and run it from a test file, we find:

$ echo 2 2 | awk -f ackermann.awk

ack(2, 2) = 7 [27 calls]

$ echo 3 3 | awk -f ackermann.awk

ack(3, 3) = 61 [2432 calls]

$ echo 3 4 | awk -f ackermann.awk

ack(3, 4) = 125 [10307 calls]

$ echo 3 8 | awk -f ackermann.awk

ack(3, 8) = 2045 [2785999 calls]

ack(4, 4) is completely uncomputable.

* * *

[3] See http://mathworld.wolfram.com/AckermannFunction.html for background and history of the Ackermann function.

String Functions

In Section 9.3.2 we introduced the length( string ) function, which returns the length of a string string. Other common string operations include concatenation, data formatting, lettercase conversion, matching, searching, splitting, string substitution, and substring extraction.

Substring Extraction

The substring function, substr( string, start, len ), returns a copy of the substring of len characters from string starting from character start. Character positions are numbered starting from one: substr("abcde", 2, 3) returns "bcd". The len argument can be omitted, in which case, it defaults to length( string ) - start + 1, selecting the remainder of the string.

It is not an error for the arguments of substr( ) to be out of bounds, but the result may be implementation-dependent. For example, nawk and gawk evaluate substr("ABC", -3, 2) as "AB", whereas mawk produces the empty string "". All of them produce an empty string for substr("ABC", 4, 2) and for substr("ABC", 1, 0). gawk's —lint option diagnoses out-of-bounds arguments in substr( ) calls.

Lettercase Conversion

Some alphabets have uppercase and lowercase forms of each letter, and in string searching and matching, it is often desirable to ignore case differences. awk provides two functions for this purpose: tolower( string ) returns a copy of string with all characters replaced by their lowercase equivalents, and toupper( string ) returns a copy with uppercase equivalents. Thus, tolower("aBcDeF123") returns "abcdef123", and toupper("aBcDeF123") returns "ABCDEF123". These functions are fine for ASCII letters, but they do not correctly case-convert accented

Return Main Page Previous Page Next Page

®Online Book Reader