9083 replace regex implementation with tre

          --- old/usr/src/man/man3c/regcomp.3c.man.txt
          +++ new/usr/src/man/man3c/regcomp.3c.man.txt
[49 lines elided]
  50   50                     REG_EXTENDED to improve readability.
  51   51  
  52   52       REG_NOSPEC    Compile with recognition of all special characters turned
  53   53                     off.  All characters are thus considered ordinary, so the
  54   54                     RE is a literal string.  This is an extension, compatible
  55   55                     with but not specified by IEEE Std 1003.2 ("POSIX.2"), and
  56   56                     should be used with caution in software intended to be
  57   57                     portable to other systems.  REG_EXTENDED and REG_NOSPEC may
  58   58                     not be used in the same call to regcomp().
  59   59  
       60 +     REG_LITERAL   An alias of REG_NOSPEC.
       61 +
  60   62       REG_ICASE     Compile for matching that ignores upper/lower case
  61   63                     distinctions.  See regex(5).
  62   64  
  63   65       REG_NOSUB     Compile for matching that need only report success or
  64   66                     failure, not what was matched.
  65   67  
  66   68       REG_NEWLINE   Compile for newline-sensitive matching.  By default,
  67   69                     newline is a completely ordinary character with no special
  68   70                     meaning in either REs or strings.  With this flag, "[^"
  69   71                     bracket expressions and "." never match newline, a "^"
[108 lines elided]
 178  180       (regerror() may be able to supply a more detailed message using
 179  181       information from the regex_t.) The regerror() function places the NUL-
 180  182       terminated message into the buffer pointed to by errbuf, limiting the
 181  183       length (including the NUL) to at most errbuf_size bytes.  If the whole
 182  184       message will not fit, as much of it as will fit before the terminating
 183  185       NUL is supplied.  In any case, the returned value is the size of buffer
 184  186       needed to hold the whole message (including terminating NUL).  If
 185  187       errbuf_size is 0, errbuf is ignored but the return value is still
 186  188       correct.
 187  189  
 188      -     If the errcode given to regerror() is first ORed with REG_ITOA, the
 189      -     "message" that results is the printable name of the error code, e.g.
 190      -     "REG_NOMATCH", rather than an explanation thereof.  If errcode is
 191      -     REG_ATOI, then preg shall be non-NULL and the re_endp member of the
 192      -     structure it points to must point to the printable name of an error code;
 193      -     in this case, the result in errbuf is the decimal digits of the numeric
 194      -     value of the error code (0 if the name is not recognized).  REG_ITOA and
 195      -     REG_ATOI are intended primarily as debugging facilities; they are
 196      -     extensions, compatible with but not specified by IEEE Std 1003.2
 197      -     ("POSIX.2"), and should be used with caution in software intended to be
 198      -     portable to other systems.
 199      -
 200  190     regfree()
 201  191       The regfree() function frees any dynamically-allocated storage associated
 202  192       with the compiled RE pointed to by preg.  The remaining regex_t is no
 203  193       longer a valid compiled RE and the effect of supplying it to regexec() or
 204  194       regerror() is undefined.
 205  195  
 206      -IMPLEMENTATION NOTES
 207      -     There are a number of decisions that IEEE Std 1003.2 ("POSIX.2") leaves
 208      -     up to the implementor, either by explicitly saying "undefined" or by
 209      -     virtue of them being forbidden by the RE grammar.  This implementation
 210      -     treats them as follows.
 211      -
 212      -     There is no particular limit on the length of REs, except insofar as
 213      -     memory is limited.  Memory usage is approximately linear in RE size, and
 214      -     largely insensitive to RE complexity, except for bounded repetitions.
 215      -
 216      -     A backslashed character other than one specifically given a magic meaning
 217      -     by IEEE Std 1003.2 ("POSIX.2") (such magic meanings occur only in BREs)
 218      -     is taken as an ordinary character.
 219      -
 220      -     Any unmatched "[" is a REG_EBRACK error.
 221      -
 222      -     Equivalence classes cannot begin or end bracket-expression ranges.  The
 223      -     endpoint of one range cannot begin another.
 224      -
 225      -     RE_DUP_MAX, the limit on repetition counts in bounded repetitions, is
 226      -     255.
 227      -
 228      -     A repetition operator ("?", "*", "+", or bounds) cannot follow another
 229      -     repetition operator.  A repetition operator cannot begin an expression or
 230      -     subexpression or follow "^" or "|".
 231      -
 232      -     "|" cannot appear first or last in a (sub)expression or after another
 233      -     "|", i.e., an operand of "|" cannot be an empty subexpression.  An empty
 234      -     parenthesized subexpression, "()", is legal and matches an empty
 235      -     (sub)string.  An empty string is not a legal RE.
 236      -
 237      -     A "{" followed by a digit is considered the beginning of bounds for a
 238      -     bounded repetition, which must then follow the syntax for bounds.  A "{"
 239      -     not followed by a digit is considered an ordinary character.
 240      -
 241      -     "^" and "$" beginning and ending subexpressions in BREs are anchors, not
 242      -     ordinary characters.
 243      -
 244  196  RETURN VALUES
 245  197       On successful completion, the regcomp() function returns 0.  Otherwise,
 246  198       it returns an integer value indicating an error as described in
 247  199       <regex.h>, and the content of preg is undefined.
 248  200  
 249  201       On successful completion, the regexec() function returns 0.  Otherwise it
 250      -     returns REG_NOMATCH to indicate no match, or REG_ENOSYS to indicate that
 251      -     the function is not supported.
      202 +     returns REG_NOMATCH to indicate no match.
 252  203  
 253  204       Upon successful completion, the regerror() function returns the number of
 254      -     bytes needed to hold the entire generated string.  Otherwise, it returns
 255      -     0 to indicate that the function is not implemented.
      205 +     bytes needed to hold the entire generated string.
 256  206  
 257  207       The regfree() function returns no value.
 258  208  
 259  209       The following constants are defined as error return values:
 260  210  
 261  211       REG_NOMATCH   The regexec() function failed to match.
 262  212       REG_BADPAT    Invalid regular expression.
 263  213       REG_ECOLLATE  Invalid collating element referenced.
 264  214       REG_ECTYPE    Invalid character class type referenced.
 265  215       REG_EESCAPE   Trailing "\" in pattern.
 266  216       REG_ESUBREG   Number in "\digit" invalid or in error.
 267  217       REG_EBRACK    "[]" imbalance.
 268  218       REG_ENOSYS    The function is not supported.
 269  219       REG_EPAREN    "\(\)" or "()" imbalance.
 270  220       REG_EBRACE    "\{\}" imbalance.
 271  221       REG_BADBR     Content of "\{\}" invalid: not a number, number too large,
 272  222                     more than two numbers, first larger than second.
 273  223       REG_ERANGE    Invalid endpoint in range expression.
 274  224       REG_ESPACE    Out of memory.
 275  225       REG_BADRPT    "?", "*" or "+" not preceded by valid regular expression.
      226 +     REG_EMPTY     Empty (sub)expression.
      227 +     REG_INVARG    Invalid argument, e.g. negative-length string.
 276  228  
 277  229  USAGE
 278  230       An application could use:
 279  231  
 280  232             regerror(code, preg, (char *)NULL, (size_t)0)
 281  233  
 282  234       to find out how big a buffer is needed for the generated string, malloc()
 283  235       a buffer to hold the string, and then call regerror() again to get the
 284  236       string (see malloc(3C)).  Alternately, it could allocate a fixed, static
 285  237       buffer that is big enough to hold most strings, and then use malloc()
[55 lines elided]
 341  293  
 342  294       The regcomp() function can be used safely in a multithreaded application
 343  295       as long as setlocale(3C) is not being called to change the locale.
 344  296  
 345  297  SEE ALSO
 346  298       attributes(5), regex(5), standards(5)
 347  299  
 348  300       IEEE Std 1003.2 ("POSIX.2"), sections 2.8 (Regular Expression Notation)
 349  301       and B.5 (C Binding for Regular Expression Matching).
 350  302  
 351      -illumos                          June 14, 2017                         illumos
      303 +illumos                        February 3, 2018                        illumos
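
Note on the USAGE section: the two-call regerror() pattern it describes (ask
for the size with a zero-length buffer, allocate, then call again) can be
sketched roughly as below. This is a minimal illustration, not code from the
change under review. The helper name error_message() is hypothetical, and it
relies only on the behaviour the man page states: the return value is the
buffer size needed, including the terminating NUL, and errbuf is ignored when
errbuf_size is 0.

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of <regex.h>): return a malloc()ed copy of
 * the regerror() message for `code`, or NULL if allocation fails.  The
 * caller is responsible for free()ing the result.
 */
static char *
error_message(int code, const regex_t *preg)
{
	/* First call: errbuf_size is 0, so errbuf is ignored; only the size is returned. */
	size_t need = regerror(code, preg, NULL, 0);
	char *buf = malloc(need > 0 ? need : 1);

	if (buf == NULL)
		return (NULL);

	/* Guarantee a valid empty string even if no message is produced. */
	buf[0] = '\0';

	/* Second call: fill the buffer, which is sized to hold the whole message. */
	(void) regerror(code, preg, buf, need);
	return (buf);
}
```

A caller would typically invoke it only when regcomp() or regexec() returns
non-zero, for example error_message(rc, &re) after a failed
regcomp(&re, "a[", REG_EXTENDED), print the string, and then free() it.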
    