9083 replace regex implementation with tre
@@ -55,10 +55,12 @@
with but not specified by IEEE Std 1003.2 ("POSIX.2"), and
should be used with caution in software intended to be
portable to other systems. REG_EXTENDED and REG_NOSPEC may
not be used in the same call to regcomp().
+ REG_LITERAL An alias of REG_NOSPEC.
+
REG_ICASE Compile for matching that ignores upper/lower case
distinctions. See regex(5).
REG_NOSUB Compile for matching that need only report success or
failure, not what was matched.
@@ -183,78 +185,26 @@
NUL is supplied. In any case, the returned value is the size of buffer
needed to hold the whole message (including terminating NUL). If
errbuf_size is 0, errbuf is ignored but the return value is still
correct.
- If the errcode given to regerror() is first ORed with REG_ITOA, the
- "message" that results is the printable name of the error code, e.g.
- "REG_NOMATCH", rather than an explanation thereof. If errcode is
- REG_ATOI, then preg shall be non-NULL and the re_endp member of the
- structure it points to must point to the printable name of an error code;
- in this case, the result in errbuf is the decimal digits of the numeric
- value of the error code (0 if the name is not recognized). REG_ITOA and
- REG_ATOI are intended primarily as debugging facilities; they are
- extensions, compatible with but not specified by IEEE Std 1003.2
- ("POSIX.2"), and should be used with caution in software intended to be
- portable to other systems.
-
regfree()
The regfree() function frees any dynamically-allocated storage associated
with the compiled RE pointed to by preg. The remaining regex_t is no
longer a valid compiled RE and the effect of supplying it to regexec() or
regerror() is undefined.
-IMPLEMENTATION NOTES
- There are a number of decisions that IEEE Std 1003.2 ("POSIX.2") leaves
- up to the implementor, either by explicitly saying "undefined" or by
- virtue of them being forbidden by the RE grammar. This implementation
- treats them as follows.
-
- There is no particular limit on the length of REs, except insofar as
- memory is limited. Memory usage is approximately linear in RE size, and
- largely insensitive to RE complexity, except for bounded repetitions.
-
- A backslashed character other than one specifically given a magic meaning
- by IEEE Std 1003.2 ("POSIX.2") (such magic meanings occur only in BREs)
- is taken as an ordinary character.
-
- Any unmatched "[" is a REG_EBRACK error.
-
- Equivalence classes cannot begin or end bracket-expression ranges. The
- endpoint of one range cannot begin another.
-
- RE_DUP_MAX, the limit on repetition counts in bounded repetitions, is
- 255.
-
- A repetition operator ("?", "*", "+", or bounds) cannot follow another
- repetition operator. A repetition operator cannot begin an expression or
- subexpression or follow "^" or "|".
-
- "|" cannot appear first or last in a (sub)expression or after another
- "|", i.e., an operand of "|" cannot be an empty subexpression. An empty
- parenthesized subexpression, "()", is legal and matches an empty
- (sub)string. An empty string is not a legal RE.
-
- A "{" followed by a digit is considered the beginning of bounds for a
- bounded repetition, which must then follow the syntax for bounds. A "{"
- not followed by a digit is considered an ordinary character.
-
- "^" and "$" beginning and ending subexpressions in BREs are anchors, not
- ordinary characters.
-
RETURN VALUES
On successful completion, the regcomp() function returns 0. Otherwise,
it returns an integer value indicating an error as described in
<regex.h>, and the content of preg is undefined.
On successful completion, the regexec() function returns 0. Otherwise it
- returns REG_NOMATCH to indicate no match, or REG_ENOSYS to indicate that
- the function is not supported.
+ returns REG_NOMATCH to indicate no match.
Upon successful completion, the regerror() function returns the number of
- bytes needed to hold the entire generated string. Otherwise, it returns
- 0 to indicate that the function is not implemented.
+ bytes needed to hold the entire generated string.
The regfree() function returns no value.
The following constants are defined as error return values:
@@ -271,10 +221,12 @@
REG_BADBR Content of "\{\}" invalid: not a number, number too large,
more than two numbers, first larger than second.
REG_ERANGE Invalid endpoint in range expression.
REG_ESPACE Out of memory.
REG_BADRPT "?", "*" or "+" not preceded by valid regular expression.
+ REG_EMPTY Empty (sub)expression.
+ REG_INVARG Invalid argument, e.g. negative-length string.
USAGE
An application could use:
regerror(code, preg, (char *)NULL, (size_t)0)
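Note on the USAGE context above: the size-query call is normally paired with a second call that fills an allocated buffer. The following is a minimal sketch of that two-call pattern, assuming only the POSIX <regex.h> interfaces. The helper name regex_error_message is hypothetical; it is not part of this change or of libc.

```c
#include <regex.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of <regex.h>): return a malloc()ed copy
 * of the message for `code`, or NULL if allocation fails.  The caller
 * is responsible for free()ing the result.
 */
char *
regex_error_message(int code, const regex_t *preg)
{
	/* First call: no buffer, size 0; only the required size comes back. */
	size_t need = regerror(code, preg, (char *)NULL, (size_t)0);
	char *buf = malloc(need > 0 ? need : 1);

	if (buf == NULL)
		return (NULL);
	buf[0] = '\0';

	/* Second call: the buffer is now known to be large enough. */
	(void) regerror(code, preg, buf, need);
	return (buf);
}
```

A caller would typically pass the nonzero value returned by regcomp() or regexec() together with the same regex_t, print the message, and then free() it.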
@@ -346,6 +298,6 @@
attributes(5), regex(5), standards(5)
IEEE Std 1003.2 ("POSIX.2"), sections 2.8 (Regular Expression Notation)
and B.5 (C Binding for Regular Expression Matching).
-illumos June 14, 2017 illumos
+illumos February 3, 2018 illumos