Classic Shell Scripting - Arnold Robbins [37]
sed 's/^/mkdir /' | Insert mkdir command
sh -x Execute, with shell tracing
This script creates a copy of the directory structure in /home/tolstoy in /home/lt (perhaps in preparation for doing backups). (The find command is described in Chapter 10. Its output in this case is a list of directory names, one per line, of every directory underneath /home/tolstoy.) The script uses the interesting trick of generating commands and then feeding the stream of commands as input to the shell. This is a powerful and general technique that is not used as often as it should be.[7]
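The generate-then-execute technique can be tried safely on a small scale. The following is a minimal sketch, not the book's script: the scratch directory from mktemp, the three directory names, and the use of mkdir -p are all assumptions made for illustration, and printf stands in for the find output.

```shell
# Hypothetical scratch directory so the sketch touches nothing real.
workdir=$(mktemp -d)

# Stand-in for the find output: one relative directory name per line.
printf '%s\n' docs docs/drafts letters |
  sed "s;^;mkdir -p $workdir/;" |   # Turn each name into a mkdir command
  sh -x                             # Execute, with shell tracing

# Show the directory tree that the generated commands created.
ls -R "$workdir"
```

Dropping the final sh -x stage prints the generated commands instead of running them, which is a convenient way to inspect them first. Remove the scratch directory with rm -r "$workdir" when done.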
Substitution details
We've already mentioned that any delimiter may be used besides slash. It is also possible to escape the delimiter within the regular expression or the replacement text, but doing so can be much harder to read:
sed 's/\/home\/tolstoy\//\/home\/lt\//'
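As a quick check that the choice of delimiter changes only readability, the sketch below runs the escaped-slash form and a semicolon-delimited form on the same input. The sample path /home/tolstoy/notes is an assumption chosen for illustration.

```shell
# Same substitution written two ways; only the delimiter differs.
escaped=$(echo /home/tolstoy/notes | sed 's/\/home\/tolstoy\//\/home\/lt\//')
readable=$(echo /home/tolstoy/notes | sed 's;/home/tolstoy/;/home/lt/;')

echo "$escaped"
echo "$readable"
```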
Earlier, in Section 3.2.2.2, when describing POSIX BREs, we mentioned the use of backreferences in regular expressions. sed understands backreferences. Furthermore, they may be used in the replacement text to mean "substitute at this point the text matched by the nth parenthesized subexpression." This sounds worse than it is:
$ echo /home/tolstoy/ | sed 's;\(/home\)/tolstoy/;\1/lt/;'
/home/lt/
sed replaces the \1 with the text that matched the /home part of the regular expression. In this case, all of the characters are literal ones, but any regular expression can be enclosed between the \( and the \). Up to nine backreferences are allowed.
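Backreferences become more useful when the enclosed expressions are not literal text. Here is a small sketch, with an invented input line, that uses two parenthesized subexpressions to reorder a name:

```shell
# \1 holds the text before ", " and \2 the text after it;
# the replacement emits them in the opposite order.
swapped=$(echo 'Tolstoy, Leo' | sed 's/\(.*\), \(.*\)/\2 \1/')
echo "$swapped"
```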
A few other characters are special in the replacement text as well. We've already mentioned the need to backslash-escape the delimiter character. This is also, not surprisingly, necessary for the backslash character itself. Finally, the & in the replacement text means "substitute at this point the entire text matched by the regular expression." For example, suppose that we work for the Atlanta Chamber of Commerce, and we need to change our description of the city everywhere in our brochure:
mv atlga.xml atlga.xml.old
sed 's/Atlanta/&, the capital of the South/' < atlga.xml.old > atlga.xml
(Being a modern shop, we use XML for all the possibilities it gives us, instead of an expensive proprietary word processor.) This script saves the original brochure file, as a backup. Doing something like this is always a good idea, especially when you're still learning to work with regular expressions and substitutions. It then applies the change with sed.
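The & is most helpful when the regular expression can match different text on different lines, since it always stands for whatever was actually matched. A minimal sketch, using three made-up city names as input:

```shell
# Bracket any leading word that starts with "At"; & is the matched word.
marked=$(printf '%s\n' Atlanta Athens Boston | sed 's/^At[a-z]*/[&]/')
echo "$marked"
```

Lines that do not match, such as Boston here, pass through unchanged.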
To get a literal & character in the replacement text, backslash-escape it. For instance, the following small script can be used to turn literal backslashes in DocBook/XML files into the corresponding DocBook &bsol; entity:
sed 's/\\/\&bsol;/g'
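To see the difference that escaping the & makes, compare the two substitutions in this sketch. The input phrase is an assumption chosen for illustration.

```shell
# Escaped: \& is a literal ampersand in the replacement text.
literal=$(echo 'salt and pepper' | sed 's/and/\&/')

# Unescaped: & stands for the matched text, so "and" replaces itself.
matched=$(echo 'salt and pepper' | sed 's/and/&/')

echo "$literal"
echo "$matched"
```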
The g suffix on the previous s command stands for global. It means "replace every occurrence of the regular expression with the replacement text." Without it, sed replaces only the first occurrence. Compare the results from these two invocations, with and without the g:
$ echo Tolstoy reads well. Tolstoy writes well. > example.txt    Sample input
$ sed 's/Tolstoy/Camus/' < example.txt                           No "g"
Camus reads well. Tolstoy writes well.
$ sed 's/Tolstoy/Camus/g' < example.txt                          With "g"
Camus reads well. Camus writes well.
A little-known fact (amaze your friends!) is that you can specify a trailing number to indicate that the nth occurrence should be replaced:
$ sed 's/Tolstoy/Camus/2' < example.txt                          Second occurrence only
Tolstoy reads well. Camus writes well.
So far, we've done only one substitution at a time. While you can string multiple instances of sed together in a pipeline, it's easier to give sed multiple commands. On the command line, this is done with the -e option. Each command is provided by using one -e option per editing command:
sed -e 's/foo/bar/g' -e 's/chicken/cow/g' myfile.xml > myfile2.xml
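The equivalence between a pipeline of sed instances and a single sed with several -e options can be checked directly. In this sketch the input string is invented for illustration; sed applies the commands to each line in the order given.

```shell
input='foo likes chicken; foo eats chicken'

# One sed process, two editing commands applied in order.
one_sed=$(echo "$input" | sed -e 's/foo/bar/g' -e 's/chicken/cow/g')

# Two sed processes in a pipeline, one command each.
pipeline=$(echo "$input" | sed 's/foo/bar/g' | sed 's/chicken/cow/g')

echo "$one_sed"
echo "$pipeline"
```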
When you have more than a few edits, though, this form gets tedious. At some point, it's better to put all your edits into a script file, and then run sed using the -f option:
$ cat fixup.sed