Classic Shell Scripting - Arnold Robbins [29]
different classes of characters such as alphabetic characters, control characters, and so on. See Table 3-3.

Collating symbols

A collating symbol is a multicharacter sequence that should be treated as a unit. It consists of the characters bracketed by [. and .]. Collating symbols are specific to the locale in which they are used.

Equivalence classes

An equivalence class lists a set of characters that should be considered equivalent, such as e and è. It consists of a named element from the locale, bracketed by [= and =].

All three of these constructs must appear inside the square brackets of a bracket expression. For example, [[:alpha:]!] matches any single alphabetic character or the exclamation mark, and [[.ch.]] matches the collating element ch, but does not match just the letter c or the letter h. In a French locale, [[=e=]] might match any of e, è, ë, ê, or é. We provide more information on character classes, collating symbols, and equivalence classes shortly.
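As a minimal sketch of a character class inside a bracket expression, the following uses grep to test the [[:alpha:]!] example. The function name match_alpha_or_bang is made up for this illustration, and the POSIX locale is forced so that [:alpha:] covers only the ASCII letters. Collating symbols and equivalence classes depend on the locale installed on your system, so they are left out here.

```shell
# Force the POSIX locale so that [:alpha:] means only the ASCII letters.
LC_ALL=C
export LC_ALL

# Hypothetical helper: print the input lines that consist of exactly one
# alphabetic character or one exclamation mark.
# The -x option makes grep require that the whole line match.
match_alpha_or_bang() {
    grep -x '[[:alpha:]!]'
}

# Prints a, Z, and ! on separate lines; 7, ?, and ab are rejected.
printf '%s\n' a Z '!' 7 '?' ab | match_alpha_or_bang
```

Note that the class name keeps its own [: and :] delimiters and sits inside the outer square brackets, next to the literal exclamation mark.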

Table 3-3 describes the POSIX character classes.

Table 3-3. POSIX character classes

Class       Matching characters        Class       Matching characters
[:alnum:]   Alphanumeric characters    [:lower:]   Lowercase characters
[:alpha:]   Alphabetic characters      [:print:]   Printable characters
[:blank:]   Space and tab characters   [:punct:]   Punctuation characters
[:cntrl:]   Control characters         [:space:]   Whitespace characters
[:digit:]   Numeric characters         [:upper:]   Uppercase characters
[:graph:]   Nonspace characters        [:xdigit:]  Hexadecimal digits
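One way to explore the classes in Table 3-3 is to test a few sample characters against each class name. The sketch below assumes the POSIX locale, and in_class is a hypothetical helper written for this example, not a standard utility.

```shell
# Force the POSIX locale so that class membership is predictable.
LC_ALL=C
export LC_ALL

# in_class CLASS CHAR: succeed if CHAR is a single character in [:CLASS:].
# (Hypothetical helper; -q suppresses output, -x matches the whole line.)
in_class() {
    printf '%s\n' "$2" | grep -qx "[[:$1:]]"
}

# Report every class from a short list that each sample character belongs to.
for class in alpha digit xdigit upper lower punct; do
    for ch in a F 7 '!'; do
        if in_class "$class" "$ch"; then
            printf '%s is in [:%s:]\n' "$ch" "$class"
        fi
    done
done
```

In this locale, a and F are reported for both [:alpha:] and [:xdigit:], 7 for [:digit:] and [:xdigit:], and ! only for [:punct:].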

BREs and EREs share some common characteristics, but also have some important differences. We'll start by explaining BREs, and then we'll explain the additional metacharacters in EREs, as well as the cases where the same (or similar) metacharacters are used but have different semantics (meaning).

Basic Regular Expressions

BREs are built up of multiple components, starting with several ways to match single characters, and then combining those with additional metacharacters for matching multiple characters.

Matching single characters

The first operation is to match a single character. This can be done in several ways: with ordinary characters; with an escaped metacharacter; with the . (dot) metacharacter; or with a bracket expression:

Ordinary characters are those not listed in Table 3-1. These include all alphanumeric characters, most whitespace characters, and most punctuation characters. Thus, the regular expression a matches the character a. We say that ordinary characters stand for themselves, and this usage should be pretty straightforward and obvious. Thus, shell matches shell, WoRd matches WoRd but not word, and so on.
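The examples of ordinary characters can be checked with grep, which uses BREs by default. This is only a sketch: matches is a hypothetical helper defined for this example.

```shell
# matches REGEX STRING: succeed if STRING contains a match for the BRE.
# (Hypothetical helper for this example, not a standard utility.)
matches() {
    printf '%s\n' "$2" | grep -q -e "$1"
}

matches 'shell' 'shell' && echo 'shell matches shell'
matches 'WoRd'  'WoRd'  && echo 'WoRd matches WoRd'
matches 'WoRd'  'word'  || echo 'WoRd does not match word'
```

All three messages are printed, because matching is case-sensitive unless you ask otherwise.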

If metacharacters don't stand for themselves, how do you match one when you need to? The answer is by escaping it. This is done by preceding it with a backslash. Thus, \* matches a literal *, \\ matches a single literal backslash, and \[ matches a left bracket. (If you put a backslash in front of an ordinary character, the POSIX standard leaves the behavior as explicitly undefined. Typically, the backslash is ignored, but it's poor practice to do something like that.)
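Here is a small sketch of escaped metacharacters in use. The sample function is made up for this example. The regular expressions are written in single quotes so that the shell passes the backslashes through to grep unchanged.

```shell
# Hypothetical sample data: three lines that each contain one
# metacharacter, plus one line that contains none.
sample() {
    printf '%s\n' 'a*b' 'a\b' 'a[b' 'ab'
}

sample | grep -e '\*'    # prints a*b
sample | grep -e '\\'    # prints a\b
sample | grep -e '\['    # prints a[b
```

Each command selects exactly one line, because the escaped metacharacter matches only itself.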

The . (dot) character means "any single character." Thus, a.c matches all of abc, aac, aqc, and so on. The single dot by itself is only occasionally useful. It is much more often used together with other metacharacters that allow the combination to match multiple characters, as described shortly.
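To see the dot at work, the following sketch feeds grep a few candidate lines. The candidates function is invented for this example, and the -x option tells grep to require that the whole line match.

```shell
# Hypothetical sample data: three lines that fit a.c and two that do not.
candidates() {
    printf '%s\n' abc aac aqc ac abbc
}

# Prints abc, aac, and aqc. The line ac has no character for the dot
# to match, and abbc has one character too many.
candidates | grep -x 'a.c'
```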

The last way to match a single character is with a bracket expression. The simplest form of a bracket expression is to enclose a list of characters between square brackets, such as [aeiouy], which matches any lowercase English vowel. For example, c[aeiouy]t matches cat, cot, and cut (as well as cet, cit, and cyt), but won't match cbt.
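The c[aeiouy]t example can be tried directly. In this sketch, the words function is a hypothetical source of test data, and -x again makes grep match whole lines only.

```shell
# Hypothetical sample data: the words from the text, one per line.
words() {
    printf '%s\n' cat cot cut cet cit cyt cbt
}

# Prints every word except cbt, since b is not in the bracketed list.
words | grep -x 'c[aeiouy]t'
```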

Supplying a caret (^) as the first character in the bracket expression complements the set of characters that are matched; such a complemented set matches any character not in the bracketed list. Thus, [^aeiouy] matches