Classic Shell Scripting - Arnold Robbins [36]
Many shell scripting tasks start by extracting interesting text with grep or egrep. The initial results of a regular expression search then become the "raw data" for further processing. Often, at least one step consists of text substitution—that is, replacing one bit of text with something else, or removing some part of the matched line.
Most of the time, the right program to use for text substitutions is sed, the Stream Editor. sed is designed to edit files in a batch fashion, rather than interactively. When you know that you have multiple changes to make, whether to one file or to many files, it is much easier to write down the changes in an editing script and apply the script to all the files that need to be changed. sed serves this purpose. (While it is possible to write editing scripts for use with the ed or ex line editors, doing so is more cumbersome, and it is much harder to [remember to] save the original file.)
We have found that for shell scripting, sed's primary use is making simple text substitutions, so we cover that first. We then provide some additional background and explanation of sed's capabilities, but we purposely don't go into a lot of detail. sed in all its glory is described in the book sed & awk (O'Reilly).
GNU sed is available at the location ftp://ftp.gnu.org/gnu/sed/. It has a number of interesting extensions that are documented in the manual that comes with it. The GNU sed manual also contains some interesting examples, and the distribution includes a test suite with some unusual programs. Perhaps the most amazing is an implementation of the Unix dc arbitrary-precision calculator, written as a sed script!
An excellent source for all things sed is http://sed.sourceforge.net/. It includes links to two FAQ documents on sed on the Internet. The first is available from http://www.dreamwvr.com/sed-info/sed-faq.html. The second, and older, FAQ is available from ftp://rtfm.mit.edu/pub/faqs/editor-faq/sed.
* * *
sed
Usage
sed [ -n ] 'editing command' [ file ... ]
sed [ -n ] -e 'editing command' ... [ file ... ]
sed [ -n ] -f script-file ... [ file ... ]
Purpose
To edit its input stream, producing results on standard output, instead of modifying files in place the way an interactive editor does. Although sed has many commands and can do complicated things, it is most often used for performing text substitutions on an input stream, usually as part of a pipeline.
Major options
-e 'editing command'
Use editing command on the input data. -e must be used when there are multiple commands.
-f script-file
Read editing commands from script-file. This is useful when there are many commands to execute.
-n
Suppress the normal printing of each final modified line. Instead, lines must be printed explicitly with the p command.
Behavior
sed reads each line of each input file, or standard input if no files are named. For each line, sed executes every editing command that applies to that line. The result is written to standard output (by default, or explicitly with the p command when the -n option is given). With no -e or -f options, sed treats the first argument as the editing command to use.
* * *
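The options described above can be tried out on a few lines of throwaway input. The following is only a sketch: the sample text, the variable names, and the use of mktemp for a temporary script file are all assumptions made for illustration, not part of the original discussion.

```shell
# Hypothetical three-line sample input; any text would do.
input='alpha
beta
gamma'

# Default behavior: every line is printed after editing.
# s/a/A/ changes only the first "a" on each line.
default_out=$(printf '%s\n' "$input" | sed 's/a/A/')

# -n suppresses automatic printing; the p command prints
# only the lines selected by the /^b/ address.
only_b=$(printf '%s\n' "$input" | sed -n '/^b/p')

# Two editing commands, each introduced with -e.
two_cmds=$(printf '%s\n' "$input" |
    sed -e 's/alpha/ALPHA/' -e 's/gamma/GAMMA/')

# The same two commands read from a script file with -f.
# mktemp is assumed to be available; it is not required by POSIX.
script_file=$(mktemp)
printf '%s\n' 's/alpha/ALPHA/' 's/gamma/GAMMA/' > "$script_file"
from_file=$(printf '%s\n' "$input" | sed -f "$script_file")
rm -f "$script_file"

printf '%s\n' "$default_out" "---" "$only_b" "---" "$two_cmds" "---" "$from_file"
```

With this input, the -e and -f forms should produce identical output, since the script file holds the same commands in the same order.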
Basic Usage
Most of the time, you'll use sed in the middle of a pipeline to perform a substitution. This is done with the s command, which takes a regular expression to look for, replacement text with which to replace matched text, and optional flags:
sed 's/:.*//' /etc/passwd |     # Remove the first colon and everything after it
    sort -u                     # Sort list and remove duplicates
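Because the contents of /etc/passwd differ from system to system, it can help to try the same substitution on fixed input first. This sketch uses made-up passwd-style records (the user names and fields are hypothetical). It also shows the g flag, one of the optional flags of the s command, which replaces every match on a line rather than only the first.

```shell
# Hypothetical passwd-style records; "tolstoy" appears twice on purpose.
records='tolstoy:x:1001:100:Leo Tolstoy:/home/tolstoy:/bin/sh
camus:x:1002:100:Albert Camus:/home/camus:/bin/sh
tolstoy:x:1003:100:Duplicate entry:/home/lt:/bin/sh'

# Delete the first colon and everything after it, leaving the
# first field, then sort and drop duplicate names.
users=$(printf '%s\n' "$records" | sed 's/:.*//' | sort -u)

# Without a flag, s replaces only the first match on each line.
first_only=$(printf '%s\n' 'a:b:c' | sed 's/:/ /')

# With the g flag, s replaces every match on each line.
all_colons=$(printf '%s\n' 'a:b:c' | sed 's/:/ /g')

printf '%s\n' "$users" "$first_only" "$all_colons"
```

Keeping the sample data in a variable makes the result repeatable, which is convenient when checking a substitution before pointing it at a real file.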
Here, the / character acts as a delimiter, separating the regular expression from the replacement text. In this instance, the replacement text is empty (the infamous null string), which effectively deletes the matched text. Although the / is the most commonly used delimiter, almost any other character may be used instead (POSIX rules out only the backslash and the newline). When working with filenames, it is common to use a punctuation character for the delimiter (such as a semicolon, colon, or comma), so that the slashes in the pathnames need no escaping:
find /home/tolstoy -type d -print |     # Find all directories
sed 's;/home/tolstoy/;/home/lt/;'