The pattern matching notation described below is used to specify patterns for matching strings in the shell. Historically, pattern matching notation is related to, but slightly different from, the regular expression notation. For this reason, the description of the rules for this pattern matching notation is based on the description of regular expression notation on the regex(5) manual page.
Patterns Matching a Single Character
The following patterns match a single character: ordinary characters, special pattern characters and pattern bracket expressions. The pattern bracket expression will also match a single collating element.
An ordinary character is a pattern that matches itself. It can be any character in the supported character set except for NUL, those special shell characters that require quoting, and the following three special pattern characters. Matching is based on the bit pattern used for encoding the character, not on the graphic representation of the character. If any character (ordinary, shell special, or pattern special) is quoted, that pattern will match the character itself. The shell special characters always require quoting.
When unquoted and outside a bracket expression, the following three characters will have special meaning in the specification of patterns:
?
A question-mark is a pattern that will match any character.
*
An asterisk is a pattern that will match multiple characters, as described in Patterns Matching Multiple Characters, below.
[
The open bracket will introduce a pattern bracket expression.
The description of basic regular expression bracket expressions on the regex(5) manual page also applies to the pattern bracket expression, except that the exclamation-mark character (!) replaces the circumflex character (^) in its role in a non-matching list in the regular expression notation. A bracket expression starting with an unquoted circumflex character produces unspecified results.
The restriction on a circumflex in a bracket expression exists to allow implementations that support the circumflex as a negation character in addition to the exclamation-mark. A portable application must use something like [\^!] to match either character.
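The single-character patterns described above can be tried out with a case statement. The following is a minimal sketch, not a definitive test: is_match and report are hypothetical helper names introduced here for illustration, and the sketch assumes a POSIX-style sh.

```sh
#!/bin/sh
# is_match PATTERN STRING: succeed when STRING matches the shell PATTERN.
# (Hypothetical helper; the unquoted $1 is treated as a pattern by case.)
is_match() {
    case $2 in
        $1) return 0 ;;
        *)  return 1 ;;
    esac
}

# report PATTERN STRING: print whether STRING matches PATTERN.
report() {
    if is_match "$1" "$2"; then
        printf '%s matches %s\n' "$1" "$2"
    else
        printf '%s does not match %s\n' "$1" "$2"
    fi
}

report 'a?c'    abc    # ? stands for exactly one character
report 'a?c'    ac     # so a missing character fails
report 'a[bx]c' abc    # bracket expression: one of b or x
report 'a[!b]c' axc    # ! introduces a non-matching list
report 'a[!b]c' abc

# Portable way to match a literal ^ or !. The pattern is written directly
# in the case statement so that the shell itself processes the backslash.
for ch in '^' '!' 'x'; do
    case $ch in
        [\^!]) printf '%s is ^ or !\n' "$ch" ;;
        *)     printf '%s is something else\n' "$ch" ;;
    esac
done
```

The escaped circumflex is kept out of first position in its unquoted form, which is the case the text above leaves unspecified.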
When pattern matching is used where shell quote removal is not performed (such as in the argument to the find -name primary when find is being called using one of the exec functions, or in the pattern argument to the fnmatch(3C) function), special characters can be escaped to remove their special meaning by preceding them with a backslash character. This escaping backslash will be discarded. The sequence \\ represents one literal backslash. All of the requirements and effects of quoting on ordinary, shell special and special pattern characters will apply to escaping in this context.
Both quoting and escaping are described here because pattern matching must work in three separate circumstances:
o  Calling directly upon the shell, such as in pathname expansion or in a case statement. All of the following will match the string or file abc:
abc        "abc"      a"b"c      a\bc       a[b]c
a["b"]c    a[\b]c     a["\b"]c   a?c        a*c
The following will not:
"a?c"      a\*c       a\[b]c
o  Calling a utility or function without going through a shell, as described for find(1) and the function fnmatch(3C).
o  Calling utilities such as find, cpio, tar or pax through the shell command line. In this case, shell quote removal is performed before the utility sees the argument. For example, in:
find /bin -name "e\c[\h]o" -print
after quote removal, the backslashes are presented to find and it treats them as escape characters. Both precede ordinary characters, so the c and h represent themselves and echo would be found on many historical systems (that have it in /bin). To find a file name that contained shell special characters or pattern characters, both quoting and escaping are required, such as:
pax -r ... "*a\(\?"
to extract a filename ending with a(?.
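The first and third circumstances can be sketched in a short script. This is an illustration under stated assumptions, not a conformance test: it assumes a POSIX-style sh, a find that honors backslash escapes in -name, and the widely available (but not POSIX) mktemp -d. The variable names and the file names xa(? and xa(b are made up for the example.

```sh
#!/bin/sh
# Circumstance 1: patterns written directly in a case statement.
# Each of these spellings is expected to match the string abc.
s=abc
hits=0
case $s in abc)      hits=$((hits + 1)) ;; esac
case $s in "abc")    hits=$((hits + 1)) ;; esac
case $s in a"b"c)    hits=$((hits + 1)) ;; esac
case $s in a\bc)     hits=$((hits + 1)) ;; esac
case $s in a[b]c)    hits=$((hits + 1)) ;; esac
case $s in a["b"]c)  hits=$((hits + 1)) ;; esac
case $s in a[\b]c)   hits=$((hits + 1)) ;; esac
case $s in a["\b"]c) hits=$((hits + 1)) ;; esac
case $s in a?c)      hits=$((hits + 1)) ;; esac
case $s in a*c)      hits=$((hits + 1)) ;; esac
printf '%s of 10 patterns matched %s\n' "$hits" "$s"

# Quoting or escaping a special pattern character makes it literal,
# so none of these is expected to match abc.
misses=0
case $s in "a?c")  ;; *) misses=$((misses + 1)) ;; esac
case $s in a\*c)   ;; *) misses=$((misses + 1)) ;; esac
case $s in a\[b]c) ;; *) misses=$((misses + 1)) ;; esac
printf '%s of 3 protected patterns did not match %s\n' "$misses" "$s"

# Circumstance 3: a utility called through the shell. The double quotes
# are removed by the shell, but the backslashes inside them survive, so
# find receives them and treats them as escape characters.
dir=$(mktemp -d) || exit 1
: > "$dir/echo"
: > "$dir/xa(?"
: > "$dir/xa(b"
found_echo=$(find "$dir" -name "e\c[\h]o" -print)
found_special=$(find "$dir" -name "*a\(\?" -print)
printf 'found: %s\n' "$found_echo" "$found_special"
rm -rf "$dir"
```

Only the file whose name really ends in a(? should be reported by the second find; xa(b is there to show that the escaped question-mark no longer matches an arbitrary character.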
Conforming applications are required to quote or escape the shell special characters (sometimes called metacharacters). If used without this protection, syntax errors can result or implementation extensions can be triggered. For example, the KornShell supports a series of extensions based on parentheses in patterns; see ksh(1).
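As a small illustration of protecting shell special characters in a pattern, the sketch below escapes or quotes parentheses, a vertical bar, and an ampersand so that each is matched literally. The function name classify and the sample strings are hypothetical.

```sh
#!/bin/sh
# classify STRING: report which protected pattern STRING matches.
classify() {
    case $1 in
        a\(b\))  echo 'escaped parentheses' ;;
        "x|y")   echo 'quoted vertical bar' ;;
        'p & q') echo 'quoted ampersand and spaces' ;;
        *)       echo 'no match' ;;
    esac
}

classify 'a(b)'
classify 'x|y'
classify 'p & q'
classify 'x'
```

Left unquoted, the vertical bar would separate two alternative patterns (x and y), which is why classify x falls through to the default here.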
Patterns Matching Multiple Characters
The following rules are used to construct patterns matching multiple characters from patterns matching a single character:
o  The asterisk (*) is a pattern that will match any string, including the null string.
o  The concatenation of patterns matching a single character is a valid pattern that will match the concatenation of the single characters or collating elements matched by each of the concatenated patterns.
o  The concatenation of one or more patterns matching a single character with one or more asterisks is a valid pattern. In such patterns, each asterisk will match a string of zero or more characters, matching the greatest possible number of characters that still allows the remainder of the pattern to match the string.
Since each asterisk matches zero or more occurrences, the patterns a*b and a**b have identical functionality.
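The equivalence of a*b and a**b can be spot-checked over a handful of strings. This is a sketch over a sample set chosen for illustration, not a proof; is_match is a hypothetical helper.

```sh
#!/bin/sh
# is_match PATTERN STRING: succeed when STRING matches the shell PATTERN.
is_match() {
    case $2 in
        $1) return 0 ;;
        *)  return 1 ;;
    esac
}

same=yes
for s in ab axb axxb a b ba abc 'a*b' 'a**b'; do
    if is_match 'a*b'  "$s"; then one=y; else one=n; fi
    if is_match 'a**b' "$s"; then two=y; else two=n; fi
    printf '%-5s a*b=%s a**b=%s\n' "$s" "$one" "$two"
    [ "$one" = "$two" ] || same=no
done
echo "identical results: $same"
```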
Examples:
a[bc]
matches the strings ab and ac.
a*d
matches the strings ad, abd and abcd, but not the string abc.
a*d*
matches the strings ad, abcd, abcdef, aaaad and adddd.
*a*d
matches the strings ad, abcd, efabcd, aaaad and adddd.
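The examples above can be replayed in a shell. The following sketch assumes a POSIX-style sh; is_match and expect are hypothetical helpers, and failures counts any result that differs from what the examples state.

```sh
#!/bin/sh
# is_match PATTERN STRING: succeed when STRING matches the shell PATTERN.
is_match() {
    case $2 in
        $1) return 0 ;;
        *)  return 1 ;;
    esac
}

failures=0

# expect yes|no PATTERN STRING...: compare each STRING against PATTERN
# and count results that differ from the expected answer.
expect() {
    want=$1
    pat=$2
    shift 2
    for s in "$@"; do
        if is_match "$pat" "$s"; then got=yes; else got=no; fi
        printf '%-6s %-7s match=%s\n' "$pat" "$s" "$got"
        [ "$got" = "$want" ] || failures=$((failures + 1))
    done
}

expect yes 'a[bc]' ab ac
expect yes 'a*d'   ad abd abcd
expect no  'a*d'   abc
expect yes 'a*d*'  ad abcd abcdef aaaad adddd
expect yes '*a*d'  ad abcd efabcd aaaad adddd
echo "unexpected results: $failures"
```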