Classic Shell Scripting - Arnold Robbins [24]
Apart from the names C and POSIX, locale names are not standardized. However, most vendors have adopted similar, but not identical, naming conventions. The locale name encodes a language, a territory, and optionally, a codeset and a modifier. It is normally represented by a lowercase two-letter ISO 639 language code,[8] an underscore, and an uppercase two-letter ISO 3166-1 country code,[9] optionally followed by a dot and the character-set encoding, and an at-sign and a modifier word. Language names are sometimes used as well. You can list all of the recognized locale names on your system like this:
$ locale -a                          # List all locales
...
français
fr_BE
fr_BE@euro
fr_BE.iso88591
fr_BE.iso885915@euro
fr_BE.utf8
fr_BE.utf8@euro
fr_CA
fr_CA.iso88591
fr_CA.utf8
...
french
...
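The naming convention described above can be taken apart with ordinary shell parameter expansion. The following is only a sketch: the function name split_locale and the |-separated output format are our own choices for illustration, and because locale names are not standardized, a name on your system may not follow the language_TERRITORY.codeset@modifier pattern at all.

```shell
# split_locale NAME: print language, territory, codeset, and modifier
# on one line, separated by '|'. Missing parts print as empty fields.
# (Hypothetical helper; assumes the common lang_TERR.codeset@modifier form.)
split_locale() {
    name=$1
    modifier= codeset= territory=
    case $name in
        *@*) modifier=${name#*@}; name=${name%%@*} ;;   # strip @modifier
    esac
    case $name in
        *.*) codeset=${name#*.}; name=${name%%.*} ;;    # strip .codeset
    esac
    case $name in
        *_*) territory=${name#*_}; name=${name%%_*} ;;  # strip _TERRITORY
    esac
    printf '%s|%s|%s|%s\n' "$name" "$territory" "$codeset" "$modifier"
}

split_locale fr_BE.iso885915@euro   # prints fr|BE|iso885915|euro
split_locale fr_CA.utf8             # prints fr|CA|utf8|
split_locale french                 # prints french|||
```

A plain language name such as french has no underscore, dot, or at-sign, so it comes back whole in the first field.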
You can query the details of a particular locale variable by defining a locale in the environment (here, as a prefix to the command) and running the locale command with the -ck option and an LC_xxx variable. Here is an example from a Sun Solaris system that reports information about the Danish time locale:
$ LC_ALL=da locale -ck LC_TIME       # Get locale information for Danish time
LC_TIME
d_t_fmt="%a %d %b %Y %T %Z"
d_fmt="%d-%m-%y"
t_fmt="%T"
t_fmt_ampm="%I:%M:%S %p"
am_pm="AM";"PM"
day="søndag";"mandag";"tirsdag";"onsdag";"torsdag";"fredag";"lørdag"
abday="søn";"man";"tir";"ons";"tor";"fre";"lør"
mon="januar";"februar";"marts";"april";"maj";"juni";"juli";"august"; \
"september";"oktober";"november";"december"
abmon="jan";"feb";"mar";"apr";"maj";"jun";"jul";"aug";"sep";"okt"; \
"nov";"dec"
era=""
era_d_fmt=""
era_d_t_fmt=""
era_t_fmt=""
alt_digits=""
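Output in this keyword="value" form is easy to pick apart in a script. As a minimal sketch, the function below (keyword_value is a name we made up) reads such output on standard input and prints the value of a single keyword. It assumes each keyword occupies one line, and it strips only the outermost double quotes, so a list value such as am_pm keeps its inner quotes and semicolons.

```shell
# keyword_value KEYWORD: read "locale -ck"-style output on standard input
# and print the value of KEYWORD with the outermost double quotes removed.
# (Hypothetical helper; assumes one keyword=value pair per line.)
keyword_value() {
    awk -v key="$1" '
        index($0, key "=") == 1 {                 # line starts with KEYWORD=
            value = substr($0, length(key) + 2)   # text after the = sign
            gsub(/^"|"$/, "", value)              # drop outermost quotes
            print value
        }'
}

# Typical use, on a system that provides the locale command:
#   LC_ALL=da locale -ck LC_TIME | keyword_value d_fmt
# Here we feed it two lines copied from the Danish output instead:
printf '%s\n' 'd_fmt="%d-%m-%y"' 't_fmt_ampm="%I:%M:%S %p"' | keyword_value d_fmt
```

Matching on the keyword followed by an equals sign keeps t_fmt from also selecting t_fmt_ampm.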
The number of available locales varies widely. A survey of about 20 flavors of Unix found none at all on BSD systems (they lack the locale command), as few as five on some systems, and almost 500 on recent GNU/Linux releases. Locale support may be an installation option at the discretion of the system manager, so even the same operating system release on two similar machines may have differing locale support. We found filesystem requirements for locale support approaching 300MB[10] on some systems.
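Because locale support varies so much, a script that wants a particular locale may first want to check that the locale is installed. Here is one way that might be done; the function names have_locale and pick_locale are ours, and the check simply looks for an exact match in the output of locale -a, treating a missing locale command the same as a missing locale.

```shell
# have_locale NAME: succeed if NAME appears verbatim in the locale -a output.
# Fails when the locale command itself is missing (as on the BSD systems
# mentioned above). Hypothetical helper, for illustration only.
have_locale() {
    command -v locale >/dev/null 2>&1 || return 1
    locale -a 2>/dev/null | grep -Fx -- "$1" >/dev/null
}

# pick_locale NAME...: print the first available NAME, or C if none is found.
pick_locale() {
    for name in "$@"; do
        if have_locale "$name"; then
            printf '%s\n' "$name"
            return 0
        fi
    done
    printf 'C\n'
}

# Try several spellings of a Danish locale, most specific first.
pick_locale da_DK.utf8 da_DK da
```

Falling back to C is a conservative choice, since C and POSIX are the only names that are standardized.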
Several GNU packages have been internationalized, and localization support has been added for many locales. For example, in an Italian locale, GNU ls offers help like this:
$ LC_ALL=it_IT ls --help             # Get help for GNU ls in Italian
Uso: ls [OPZIONE]... [FILE]...
Elenca informazioni sui FILE (predefinito: la directory corrente).
Ordina alfabeticamente le voci se non è usato uno di -cftuSUX oppure --sort.
""
Mandatory arguments to long options are mandatory for short options too.
  -a, --all                  non nasconde le voci che iniziano con .
  -A, --almost-all           non elenca le voci implicite . e ..
      --author               stampa l'autore di ogni file
  -b, --escape               stampa escape ottali per i caratteri non grafici
      --block-size=DIMENS    usa blocchi lunghi DIMENS byte
...
Notice that when a translation is unavailable (fifth output line), the fallback is to the original language, English. Program names and option names are not translated, because that would destroy software portability.
Most systems currently offer the shell programmer little support for addressing internationalization and localization. However, locales often affect shell scripts, notably in collation order and in bracket-expression character ranges in regular expressions. Although we describe character classes, collating symbols, and equivalence classes in Section 3.2.1, on most Unix systems it appears to be quite difficult to determine, from locale documentation or tools, exactly which characters belong to the character and equivalence classes, and which collating symbols are available. This reflects the immaturity of locale support on current systems.
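One common defensive measure, sketched here on the assumption that plain byte-value ordering is acceptable for the data at hand, is to force the C locale for just the commands whose results depend on collation. The wrapper names c_sort and c_grep are our own.

```shell
# c_sort and c_grep: run sort and grep in the C locale, so that ordering
# and bracket ranges go by byte value whatever the user's environment says.
# (Hypothetical wrappers, for illustration only.)
c_sort() { LC_ALL=C sort "$@"; }
c_grep() { LC_ALL=C grep "$@"; }

# In the C locale, all uppercase ASCII letters sort before lowercase ones.
printf '%s\n' banana Apple cherry apple Banana | c_sort
# prints Apple, Banana, apple, banana, cherry (one per line)

# In the C locale, [a-z] matches only the 26 lowercase ASCII letters.
printf '%s\n' apple Apple | c_grep '^[a-z]'
# prints apple
```

Setting LC_ALL as a prefix to the command limits the change to that one command and leaves the rest of the script, and the user's messages, in the original locale.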
When the GNU gettext package[11] is installed, it is possible to use it to support the internationalization and localization of shell scripts. This is an advanced topic that we do not cover in this book, but you can find the