9083 replace regex implementation with tre

*** 55,64 ****
--- 55,66 ----
  with but not specified by IEEE Std 1003.2 ("POSIX.2"), and should be used
  with caution in software intended to be portable to other systems.
  REG_EXTENDED and REG_NOSPEC may not be used in the same call to regcomp().

+ REG_LITERAL An alias of REG_NOSPEC.
+
  REG_ICASE Compile for matching that ignores upper/lower case distinctions.
  See regex(5).

  REG_NOSUB Compile for matching that need only report success or failure,
  not what was matched.
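The compile flags touched by this hunk can be exercised with a small sketch. It assumes a POSIX <regex.h>; re_matches and literal_matches are hypothetical helper names, not part of the library. REG_LITERAL and REG_NOSPEC are extensions, so the sketch guards them with the preprocessor and reports -1 where neither is defined.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns 1 if text matches pattern, 0 on
 * REG_NOMATCH, and -1 if the pattern fails to compile or regexec()
 * reports any other error.
 */
int
re_matches(const char *pattern, const char *text, int cflags)
{
	regex_t re;
	int rc;

	/* REG_NOSUB: only success or failure is needed, not offsets. */
	if (regcomp(&re, pattern, cflags | REG_NOSUB) != 0)
		return (-1);

	rc = regexec(&re, text, 0, NULL, 0);
	regfree(&re);

	if (rc == 0)
		return (1);
	return (rc == REG_NOMATCH ? 0 : -1);
}

/*
 * Hypothetical helper: treat pattern as a literal string.  Prefers
 * REG_LITERAL, falls back to REG_NOSPEC, and returns -1 on systems
 * that define neither extension.
 */
int
literal_matches(const char *pattern, const char *text)
{
#if defined(REG_LITERAL)
	return (re_matches(pattern, text, REG_LITERAL));
#elif defined(REG_NOSPEC)
	return (re_matches(pattern, text, REG_NOSPEC));
#else
	(void) pattern;
	(void) text;
	return (-1);
#endif
}
```

For example, re_matches("hello", "Say HELLO", REG_ICASE) is expected to report a match, while the same call with cflags of 0 should not. Because REG_EXTENDED and REG_NOSPEC may not be combined, literal_matches passes the literal flag on its own.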
*** 183,260 ****
  NUL is supplied. In any case, the returned value is the size of buffer
  needed to hold the whole message (including terminating NUL). If
  errbuf_size is 0, errbuf is ignored but the return value is still correct.

- If the errcode given to regerror() is first ORed with REG_ITOA, the
- "message" that results is the printable name of the error code, e.g.
- "REG_NOMATCH", rather than an explanation thereof. If errcode is
- REG_ATOI, then preg shall be non-NULL and the re_endp member of the
- structure it points to must point to the printable name of an error code;
- in this case, the result in errbuf is the decimal digits of the numeric
- value of the error code (0 if the name is not recognized). REG_ITOA and
- REG_ATOI are intended primarily as debugging facilities; they are
- extensions, compatible with but not specified by IEEE Std 1003.2
- ("POSIX.2"), and should be used with caution in software intended to be
- portable to other systems.
-
  regfree()
  The regfree() function frees any dynamically-allocated storage
  associated with the compiled RE pointed to by preg. The remaining regex_t
  is no longer a valid compiled RE and the effect of supplying it to
  regexec() or regerror() is undefined.

- IMPLEMENTATION NOTES
- There are a number of decisions that IEEE Std 1003.2 ("POSIX.2") leaves
- up to the implementor, either by explicitly saying "undefined" or by
- virtue of them being forbidden by the RE grammar. This implementation
- treats them as follows.
-
- There is no particular limit on the length of REs, except insofar as
- memory is limited. Memory usage is approximately linear in RE size, and
- largely insensitive to RE complexity, except for bounded repetitions.
-
- A backslashed character other than one specifically given a magic meaning
- by IEEE Std 1003.2 ("POSIX.2") (such magic meanings occur only in BREs)
- is taken as an ordinary character.
-
- Any unmatched "[" is a REG_EBRACK error.
-
- Equivalence classes cannot begin or end bracket-expression ranges. The
- endpoint of one range cannot begin another.
-
- RE_DUP_MAX, the limit on repetition counts in bounded repetitions, is
- 255.
-
- A repetition operator ("?", "*", "+", or bounds) cannot follow another
- repetition operator. A repetition operator cannot begin an expression or
- subexpression or follow "^" or "|".
-
- "|" cannot appear first or last in a (sub)expression or after another
- "|", i.e., an operand of "|" cannot be an empty subexpression. An empty
- parenthesized subexpression, "()", is legal and matches an empty
- (sub)string. An empty string is not a legal RE.
-
- A "{" followed by a digit is considered the beginning of bounds for a
- bounded repetition, which must then follow the syntax for bounds. A "{"
- not followed by a digit is considered an ordinary character.
-
- "^" and "$" beginning and ending subexpressions in BREs are anchors, not
- ordinary characters.
-
  RETURN VALUES
  On successful completion, the regcomp() function returns 0. Otherwise, it
  returns an integer value indicating an error as described in <regex.h>,
  and the content of preg is undefined.

  On successful completion, the regexec() function returns 0. Otherwise it
! returns REG_NOMATCH to indicate no match, or REG_ENOSYS to indicate that
! the function is not supported.

  Upon successful completion, the regerror() function returns the number of
! bytes needed to hold the entire generated string. Otherwise, it returns
! 0 to indicate that the function is not implemented.

  The regfree() function returns no value.

  The following constants are defined as error return values:
--- 185,210 ----
  NUL is supplied. In any case, the returned value is the size of buffer
  needed to hold the whole message (including terminating NUL). If
  errbuf_size is 0, errbuf is ignored but the return value is still correct.

  regfree()
  The regfree() function frees any dynamically-allocated storage
  associated with the compiled RE pointed to by preg. The remaining regex_t
  is no longer a valid compiled RE and the effect of supplying it to
  regexec() or regerror() is undefined.

  RETURN VALUES
  On successful completion, the regcomp() function returns 0. Otherwise, it
  returns an integer value indicating an error as described in <regex.h>,
  and the content of preg is undefined.

  On successful completion, the regexec() function returns 0. Otherwise it
! returns REG_NOMATCH to indicate no match.

  Upon successful completion, the regerror() function returns the number of
! bytes needed to hold the entire generated string.

  The regfree() function returns no value.

  The following constants are defined as error return values:
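The regerror() sizing rule kept by this hunk (with an errbuf_size of 0, errbuf is ignored and the return value is still the needed buffer size, including the terminating NUL) can be sketched as a two-call pattern. This is a minimal sketch assuming a POSIX <regex.h>; re_error_string is a hypothetical helper name, not a library function.

```c
#include <regex.h>
#include <stdlib.h>

/*
 * Hypothetical helper: return a malloc()ed copy of the message for
 * errcode, or NULL if no size is reported or allocation fails.
 * The caller is responsible for free()ing the result.
 */
char *
re_error_string(int errcode, const regex_t *preg)
{
	size_t need;
	char *buf;

	/* First call: a zero-sized buffer only asks for the size. */
	need = regerror(errcode, preg, NULL, 0);
	if (need == 0)
		return (NULL);

	buf = malloc(need);
	if (buf == NULL)
		return (NULL);

	/* Second call: fill the buffer, including the terminating NUL. */
	(void) regerror(errcode, preg, buf, need);
	return (buf);
}
```

A caller would typically pass the nonzero value returned by regcomp() or regexec() as errcode, print the resulting string, and then free() it.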
*** 271,280 ****
--- 221,232 ----
  REG_BADBR Content of "\{\}" invalid: not a number, number too large,
  more than two numbers, first larger than second.
  REG_ERANGE Invalid endpoint in range expression.
  REG_ESPACE Out of memory.
  REG_BADRPT "?", "*" or "+" not preceded by valid regular expression.
+ REG_EMPTY Empty (sub)expression.
+ REG_INVARG Invalid argument, e.g. negative-length string.

  USAGE
  An application could use:

  regerror(code, preg, (char *)NULL, (size_t)0)
*** 346,351 ****
  attributes(5), regex(5), standards(5)

  IEEE Std 1003.2 ("POSIX.2"), sections 2.8 (Regular Expression Notation)
  and B.5 (C Binding for Regular Expression Matching).

! illumos June 14, 2017 illumos
--- 298,303 ----
  attributes(5), regex(5), standards(5)

  IEEE Std 1003.2 ("POSIX.2"), sections 2.8 (Regular Expression Notation)
  and B.5 (C Binding for Regular Expression Matching).

! illumos February 3, 2018 illumos