Online Book Reader

Classic Shell Scripting - Arnold Robbins [39]

$ (as in ed and ex) means "the last line." For example, this script is a quick way to print the last line of a file:

sed -n '$p' "$1"          # Quoting as shown required!

For sed, the "last line" means the last line of the input. Even when processing multiple files, sed views them as one long input stream, and $ applies only to the last line of the last file. (GNU sed has an option to cause addresses to apply separately to each file; see its documentation.)
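The single-stream behavior can be checked with a small sketch. The two files and their contents are made up for illustration; only the sed command comes from the text above.

```shell
# Two small hypothetical input files, created in a temporary directory.
tmpdir=$(mktemp -d)
printf 'a1\na2\n' > "$tmpdir/first.txt"
printf 'b1\nb2\n' > "$tmpdir/second.txt"

# sed reads both files as one stream, so $ selects only the final line overall.
last=$(sed -n '$p' "$tmpdir/first.txt" "$tmpdir/second.txt")
echo "$last"

rm -r "$tmpdir"
```

Only b2, the last line of the second file, should be printed; a2 is not treated as a "last line."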

Line numbers

You can use an absolute line number as an address. An example is provided shortly.

Ranges

You can specify a range of lines by separating addresses with a comma:

sed -n '10,42p' foo.xml                 # Print only lines 10-42

sed '/foo/,/bar/ s/baz/quux/g'          # Make substitution only on range of lines

The second command says "starting with lines matching foo, and continuing through lines matching bar, replace all occurrences of baz with quux." (Readers familiar with ed, ex, or the colon command prompt in vi will recognize this usage.)

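The use of two regular expressions separated by a comma is termed a range expression. In sed, such a range always includes at least two lines, because the search for the second pattern begins on the line after the one that matched the first. (The range can be shorter only if the input ends first.)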
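As a rough illustration of a range expression, the sketch below feeds a few invented lines to the second command shown earlier. The sample text is an assumption chosen so that baz appears inside and outside the range.

```shell
# Lines before /foo/ and after /bar/ are left alone; lines in between are edited.
range_out=$(printf '%s\n' 'baz 1' 'foo baz' 'baz 2' 'bar baz' 'baz 3' |
    sed '/foo/,/bar/ s/baz/quux/g')
echo "$range_out"

# A line matching both patterns only starts the range; the end pattern is
# looked for beginning with the following line.
two_line_out=$(printf '%s\n' 'foo bar baz' 'baz next' 'bar baz' 'baz after' |
    sed '/foo/,/bar/ s/baz/quux/g')
echo "$two_line_out"
```

In the first run, only the three lines from "foo baz" through "bar baz" change. In the second, the range runs from "foo bar baz" through "bar baz", so "baz next" is changed as well and "baz after" is not.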

Negated regular expressions

Occasionally it's useful to apply a command to all lines that don't match a particular pattern. You specify this by adding an ! character after a regular expression to look for:

/used/!s/new/used/g          Change new to used on lines not matching used

The POSIX standard indicates that the behavior when whitespace follows the ! is "unspecified," and recommends that completely portable applications not place any space after it. This is apparently due to some historical versions of sed not allowing it.
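A minimal sketch of the negated address in action, using two invented input lines. Note that no space follows the !, in keeping with the portability advice above.

```shell
# Only the first line lacks "used", so only it has new replaced.
neg_out=$(printf '%s\n' 'a new car' 'a used car, like new' |
    sed '/used/!s/new/used/g')
echo "$neg_out"
```

The first line becomes "a used car"; the second already contains used, so its "new" is left untouched.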

Example 3-1 demonstrates the use of absolute line numbers as addresses by presenting a simple version of the head program using sed.

Example 3-1. A version of the head command using sed

# head --- print first n lines
#
# usage: head N file

count=$1
sed ${count}q "$2"

When invoked as head 10 foo.xml, sed ends up being invoked as sed 10q foo.xml. The q command causes sed to quit immediately; no further input is read and no further commands are executed. Later, in Section 7.6.1, we show how to make this script look more like the real head command.
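To try the idea without creating a script file, the same sed command can be wrapped in a shell function. The name head_sed and the five-line test file are assumptions made for this sketch; they are not part of Example 3-1.

```shell
# head_sed N FILE: hypothetical function form of the sed command in Example 3-1.
head_sed() {
    sed "${1}q" "$2"
}

# Build a five-line sample file and print its first three lines.
tmpfile=$(mktemp)
printf '%s\n' one two three four five > "$tmpfile"
first3=$(head_sed 3 "$tmpfile")
echo "$first3"

rm "$tmpfile"
```

Here sed runs as sed 3q on the sample file, prints lines one through three, and quits before reading the rest.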

As we've seen so far, sed uses / characters to delimit patterns to search for. However, there is provision for using a different delimiter in patterns. This is done by preceding the character with a backslash:

$ grep tolstoy /etc/passwd                          # Show original line
tolstoy:x:2076:10:Leo Tolstoy:/home/tolstoy:/bin/bash
$ sed -n '\:tolstoy: s;;Tolstoy;p' /etc/passwd      # Make a change
Tolstoy:x:2076:10:Leo Tolstoy:/home/tolstoy:/bin/bash

In this example, the colon delimits the pattern to search for, and semicolons act as delimiters for the s command. (The editing operation itself is trivial; our point here is to demonstrate the use of different delimiters, not to make the change for its own sake.)
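Alternate delimiters are most helpful when the pattern itself contains slashes, as pathnames do. The sketch below is one possible variation, not taken from the original example: it runs on a copy of the sample line held in a variable rather than on /etc/passwd, and the change of login shell is purely illustrative.

```shell
# Sample passwd-style line from the example above, held in a variable.
line='tolstoy:x:2076:10:Leo Tolstoy:/home/tolstoy:/bin/bash'

# \, makes the comma the address delimiter, and commas also delimit the
# s command, so the slashes in the pathnames need no escaping.
changed=$(printf '%s\n' "$line" |
    sed -n '\,/home/tolstoy, s,/bin/bash,/bin/sh,p')
echo "$changed"
```

With / as the delimiter, every slash in /home/tolstoy and /bin/bash would have to be written as \/.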

How Much Text Gets Changed?

One issue we haven't discussed yet is the question "how much text matches?" Really, there are two questions. The second question is "where does the match start?" Indeed, when doing simple text searches, such as with grep or egrep, both questions are irrelevant. All you want to know is whether a line matched, and if so, to see the line. Where in the line the match starts, or to where in the line it extends, doesn't matter.

However, knowing the answer to these questions becomes vitally important when doing text substitution with sed or programs written in awk. (Understanding this is also important for day-to-day use when working inside a text editor, although we don't cover text editing in this book.)
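A short sketch of why both questions matter once text is being replaced. The input strings are invented; in each replacement, & stands for the matched text, so the brackets in the output show exactly where the match started and how far it extended. The rule behind these results is stated in the next paragraph.

```shell
# The match begins at the first a and takes in every b that follows it.
m1=$(echo 'xxabbbcabc' | sed 's/ab*c/[&]/')

# An earlier, shorter match is chosen over a later, longer one.
m2=$(echo 'ac abbbc' | sed 's/ab*c/[&]/')

# .* extends as far as it can while still letting the final X match.
m3=$(echo 'aXbXc' | sed 's/.*X/[&]/')

printf '%s\n' "$m1" "$m2" "$m3"
```

The three lines printed are xx[abbbc]abc, [ac] abbbc, and [aXbX]c.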

The answer to both questions is that a regular expression matches the longest, leftmost substring of the input text that can match the entire expression. In addition, a match of the null string is considered to be longer than no match at all. (Thus, as we explained earlier, given the regular expression ab*c, matching the text ac, the b* successfully matches the null string between a and c.) Furthermore, the POSIX standard states:
