Classic Shell Scripting - Arnold Robbins [15]
Self-Contained Scripts: The #! First Line
When the shell runs a program, it asks the Unix kernel to start a new process and run the given program in that process. The kernel knows how to do this for compiled programs. Our nusers shell script isn't a compiled program; when the shell asks the kernel to run it, the kernel will fail to do so, returning a "not executable format file" error (ENOEXEC). The shell, upon receiving this error, says "Aha, it's not a compiled program, it must be a shell script," and then proceeds to start a new copy of /bin/sh (the standard shell) to run the program.
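The fallback can be observed with a small experiment. This is only a sketch: it assumes a POSIX shell, a mktemp command, and a temporary directory that permits execution, and the file name noshebang is made up for illustration. Which shell ends up interpreting the file depends on the invoking shell, so the test file does nothing but echo a fixed string.

```shell
# Hypothetical demo: an executable text file with no #! line.
tmp_dir=$(mktemp -d)
plain="$tmp_dir/noshebang"

printf '%s\n' 'echo "ran without a #! line"' > "$plain"
chmod +x "$plain"

# The kernel refuses to run the file directly; the calling shell then
# falls back to interpreting it as a shell script.
fallback_out=$("$plain")
printf '%s\n' "$fallback_out"

rm -rf "$tmp_dir"
```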
The "fall back to /bin/sh" mechanism is great when there's only one shell. However, because current Unix systems have multiple shells, there needs to be a way to tell the Unix kernel which shell to use when running a particular shell script. In fact, it helps to have a general mechanism that makes it possible to directly invoke any programming language interpreter, not just a command shell. This is done via a special first line in the script file—one that begins with the two characters #!.
When the first two characters of a file are #!, the kernel scans the rest of the line for the full pathname of an interpreter to use to run the program. (Any intervening whitespace is skipped.) The kernel also scans for a single option to be passed to that interpreter. The kernel invokes the interpreter with the given option, along with the rest of the command line. For example, assume a csh script[3] named /usr/ucb/whizprog, with this first line:
#! /bin/csh -f
Furthermore, assume that /usr/ucb is included in the shell's search path (described later). A user might type the command whizprog -q /dev/tty01. The kernel interprets the #! line and invokes csh as follows:
/bin/csh -f /usr/ucb/whizprog -q /dev/tty01
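One way to see the argument list the kernel builds is to name a program that merely prints its arguments as the interpreter. The sketch below assumes that /bin/echo exists and does not treat -f as an option of its own, and that the temporary directory permits execution; the script body is never interpreted.

```shell
# Hypothetical demo: use /bin/echo as the "interpreter" so that the
# arguments built from the #! line are simply printed.
demo_dir=$(mktemp -d)
script="$demo_dir/whizprog"

printf '%s\n' '#! /bin/echo -f' 'this body is never interpreted' > "$script"
chmod +x "$script"

# Prints the option, then the script's pathname, then our arguments.
shown=$("$script" -q /dev/tty01)
printf '%s\n' "$shown"

rm -rf "$demo_dir"
```

The printed line has the same shape as the csh invocation above: the single option from the #! line, the pathname of the script, and then the arguments the user typed.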
This mechanism makes it easy to invoke any interpreted language. For example, it is a good way to invoke a standalone awk program:
#! /bin/awk -f
awk program here
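As a rough illustration, the following sketch builds and runs such a standalone awk program. It assumes that awk is on the search path; because its full pathname varies between systems, the script looks it up rather than hard-coding /bin/awk. The name countlines is made up for the example.

```shell
# Assumption: awk is installed somewhere on PATH.
awk_path=$(command -v awk) || { echo "awk not found" >&2; exit 1; }

work_dir=$(mktemp -d)
prog="$work_dir/countlines"     # hypothetical script name

# Line 1 is the #! line; the rest is an ordinary awk program.
{
    printf '#! %s -f\n' "$awk_path"
    printf '%s\n' 'END { print NR }'
} > "$prog"
chmod +x "$prog"

# The kernel runs "<awk_path> -f <prog>", and awk reads standard input.
line_count=$(printf 'a\nb\nc\n' | "$prog")
printf '%s\n' "$line_count"

rm -rf "$work_dir"
```

Given the three input lines, the program should report a count of 3.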
Shell scripts typically start with #! /bin/sh. Use the path to a POSIX-compliant shell if your /bin/sh isn't POSIX compliant. There are also some low-level "gotchas" to watch out for:
On modern systems, the maximum length of the #! line varies from 63 to 1024 characters. Try to keep it shorter than 64 characters. (See Table 2-1 for a representative list of different limits.)
On some systems, the "rest of the command line" that is passed to the interpreter includes the full pathname of the command. On others, it does not; the command line as entered is passed to the program. Thus, scripts that look at the command-line arguments cannot portably depend on the full pathname being present.
Don't put any trailing whitespace after an option, if present. It will get passed along to the invoked program along with the option.
You have to know the full pathname to the interpreter to be run. This can prevent cross-vendor portability, since different vendors put things in different places (e.g., /bin/awk versus /usr/bin/awk).
On antique systems that don't have #! interpretation in the kernel, some shells will do it themselves, and they may be picky about the presence or absence of whitespace characters between the #! and the name of the interpreter.
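Two of these gotchas, the line length and trailing whitespace, are easy to check mechanically. The function below is a minimal sketch rather than a standard tool: the name check_shebang and its messages are invented here, and it only inspects the first line of the file it is given.

```shell
# Hypothetical helper: warn about an overlong #! line or trailing
# whitespace on it. Returns 0 if the line looks safe, nonzero otherwise.
check_shebang() {
    file=$1
    first_line=
    tab=$(printf '\t')

    # Read only the first line; IFS= and -r preserve whitespace and backslashes.
    IFS= read -r first_line < "$file" || [ -n "$first_line" ] || {
        echo "$file: cannot read a first line"
        return 2
    }

    case $first_line in
        '#!'*) ;;
        *) echo "$file: no #! line"; return 1 ;;
    esac

    status=0
    if [ "${#first_line}" -ge 64 ]; then
        echo "$file: #! line is ${#first_line} characters (keep it under 64)"
        status=1
    fi
    case $first_line in
        *" "|*"$tab")
            echo "$file: trailing whitespace on #! line"
            status=1 ;;
    esac
    return $status
}

# Usage: a throwaway script whose #! line ends with a stray space.
sample=$(mktemp)
printf '#! /bin/sh \necho hi\n' > "$sample"
report=$(check_shebang "$sample") || true
printf '%s\n' "$report"
rm -f "$sample"
```

The 64-character threshold follows the advice above; it is a conservative choice, not a limit enforced by any particular kernel.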
Table 2-1 lists the #! line length limits on different Unix systems. (These were discovered via experimentation.) The results are surprising in that they are often not powers of two.
Table 2-1. #! line length limits on different systems
Vendor platform        O/S version                       Maximum length
Apple Power Mac        Mac Darwin 7.2 (Mac OS 10.3.2)    512
Compaq/DEC Alpha       OSF/1 4.0                         1024
Compaq/DEC/HP Alpha    OSF/1 5.1                         1000
GNU/Linux[4]           Red Hat 6, 7, 8, 9; Fedora 1      127
HP PA-RISC and