9083 replace regex implementation with tre

          --- old/usr/src/man/man3c/regcomp.3c
          +++ new/usr/src/man/man3c/regcomp.3c
↓ open down ↓ 66 lines elided ↑ open up ↑
  67   67  .\"
  68   68  .\" When distributing Covered Code, include this CDDL HEADER in each
  69   69  .\" file and include the License file at usr/src/OPENSOLARIS.LICENSE.
  70   70  .\" If applicable, add the following below this CDDL HEADER, with the
  71   71  .\" fields enclosed by brackets "[]" replaced with your own identifying
  72   72  .\" information: Portions Copyright [yyyy] [name of copyright owner]
  73   73  .\"
  74   74  .\"
  75   75  .\" Copyright (c) 1992, X/Open Company Limited. All Rights Reserved.
  76   76  .\" Portions Copyright (c) 2003, Sun Microsystems, Inc.  All Rights Reserved.
  77      -.\" Copyright 2017 Nexenta Systems, Inc.
       77 +.\" Copyright 2018 Nexenta Systems, Inc.
  78   78  .\"
  79      -.Dd June 14, 2017
       79 +.Dd February 3, 2018
  80   80  .Dt REGCOMP 3C
  81   81  .Os
  82   82  .Sh NAME
  83   83  .Nm regcomp ,
  84   84  .Nm regexec ,
  85   85  .Nm regerror ,
  86   86  .Nm regfree
  87   87  .Nd regular-expression library
  88   88  .Sh LIBRARY
  89   89  .Lb libc
↓ open down ↓ 73 lines elided ↑ open up ↑
 163  163  All characters are thus considered ordinary, so the RE is a literal string.
 164  164  This is an extension, compatible with but not specified by
 165  165  .St -p1003.2 ,
 166  166  and should be used with caution in software intended to be portable to other
 167  167  systems.
 168  168  .Dv REG_EXTENDED
 169  169  and
 170  170  .Dv REG_NOSPEC
 171  171  may not be used in the same call to
 172  172  .Fn regcomp .
      173 +.It Dv REG_LITERAL
      174 +An alias of
      175 +.Dv REG_NOSPEC .
 173  176  .It Dv REG_ICASE
 174  177  Compile for matching that ignores upper/lower case distinctions.
 175  178  See
 176  179  .Xr regex 5 .
 177  180  .It Dv REG_NOSUB
 178  181  Compile for matching that need only report success or failure,
 179  182  not what was matched.
 180  183  .It Dv REG_NEWLINE
 181  184  Compile for newline-sensitive matching.
 182  185  By default, newline is a completely ordinary character with no special
↓ open down ↓ 292 lines elided ↑ open up ↑
 475  478  If the whole message will not fit, as much of it as will fit before the
 476  479  terminating NUL is supplied.
 477  480  In any case, the returned value is the size of buffer needed to hold the whole
 478  481  message
 479  482  .Pq including terminating NUL .
 480  483  If
 481  484  .Fa errbuf_size
 482  485  is 0,
 483  486  .Fa errbuf
 484  487  is ignored but the return value is still correct.
 485      -.Pp
 486      -If the
 487      -.Fa errcode
 488      -given to
 489      -.Fn regerror
 490      -is first ORed with
 491      -.Dv REG_ITOA ,
 492      -the
 493      -.Qq message
 494      -that results is the printable name of the error code, e.g.
 495      -.Qq Dv REG_NOMATCH ,
 496      -rather than an explanation thereof.
 497      -If
 498      -.Fa errcode
 499      -is
 500      -.Dv REG_ATOI ,
 501      -then
 502      -.Fa preg
 503      -shall be non-NULL and the
 504      -.Va re_endp
 505      -member of the structure it points to must point to the printable name of an
 506      -error code; in this case, the result in
 507      -.Fa errbuf
 508      -is the decimal digits of the numeric value of the error code
 509      -.Pq 0 if the name is not recognized .
 510      -.Dv REG_ITOA
 511      -and
 512      -.Dv REG_ATOI
 513      -are intended primarily as debugging facilities; they are extensions,
 514      -compatible with but not specified by
 515      -.St -p1003.2 ,
 516      -and should be used with caution in software intended to be portable to other
 517      -systems.
 518  488  .Ss Fn regfree
 519  489  The
 520  490  .Fn regfree
 521  491  function frees any dynamically-allocated storage associated with the compiled RE
 522  492  pointed to by
 523  493  .Fa preg .
 524  494  The remaining
 525  495  .Ft regex_t
 526  496  is no longer a valid compiled RE and the effect of supplying it to
 527  497  .Fn regexec
 528  498  or
 529  499  .Fn regerror
 530  500  is undefined.
 531      -.Sh IMPLEMENTATION NOTES
 532      -There are a number of decisions that
 533      -.St -p1003.2
 534      -leaves up to the implementor,
 535      -either by explicitly saying
 536      -.Qq undefined
 537      -or by virtue of them being forbidden by the RE grammar.
 538      -This implementation treats them as follows.
 539      -.Pp
 540      -There is no particular limit on the length of REs, except insofar as memory is
 541      -limited.
 542      -Memory usage is approximately linear in RE size, and largely insensitive
 543      -to RE complexity, except for bounded repetitions.
 544      -.Pp
 545      -A backslashed character other than one specifically given a magic meaning by
 546      -.St -p1003.2
 547      -.Pq such magic meanings occur only in BREs
 548      -is taken as an ordinary character.
 549      -.Pp
 550      -Any unmatched
 551      -.Qq \&[
 552      -is a
 553      -.Dv REG_EBRACK
 554      -error.
 555      -.Pp
 556      -Equivalence classes cannot begin or end bracket-expression ranges.
 557      -The endpoint of one range cannot begin another.
 558      -.Pp
 559      -.Dv RE_DUP_MAX ,
 560      -the limit on repetition counts in bounded repetitions, is 255.
 561      -.Pp
 562      -A repetition operator
 563      -.Po
 564      -.Qq \&? ,
 565      -.Qq \&* ,
 566      -.Qq \&+ ,
 567      -or bounds
 568      -.Pc
 569      -cannot follow another repetition operator.
 570      -A repetition operator cannot begin an expression or subexpression
 571      -or follow
 572      -.Qq \&^
 573      -or
 574      -.Qq \&| .
 575      -.Pp
 576      -.Qq \&|
 577      -cannot appear first or last in a (sub)expression or after another
 578      -.Qq \&| ,
 579      -i.e., an operand of
 580      -.Qq \&|
 581      -cannot be an empty subexpression.
 582      -An empty parenthesized subexpression,
 583      -.Qq () ,
 584      -is legal and matches an empty (sub)string.
 585      -An empty string is not a legal RE.
 586      -.Pp
 587      -A
 588      -.Qq \&{
 589      -followed by a digit is considered the beginning of bounds for a bounded
 590      -repetition, which must then follow the syntax for bounds.
 591      -A
 592      -.Qq \&{
 593      -.Em not
 594      -followed by a digit is considered an ordinary character.
 595      -.Pp
 596      -.Qq \&^
 597      -and
 598      -.Qq \&$
 599      -beginning and ending subexpressions in BREs are anchors, not ordinary
 600      -characters.
 601  501  .Sh RETURN VALUES
 602  502  On successful completion, the
 603  503  .Fn regcomp
 604  504  function returns 0.
 605  505  Otherwise, it returns an integer value indicating an error as described in
 606  506  .In regex.h ,
 607  507  and the content of preg is undefined.
 608  508  .Pp
 609  509  On successful completion, the
 610  510  .Fn regexec
 611  511  function returns 0.
 612  512  Otherwise it returns
 613  513  .Dv REG_NOMATCH
 614      -to indicate no match, or
 615      -.Dv REG_ENOSYS
 616      -to indicate that the function is not supported.
      514 +to indicate no match.
 617  515  .Pp
 618  516  Upon successful completion, the
 619  517  .Fn regerror
 620  518  function returns the number of bytes needed to hold the entire generated string.
 621      -Otherwise, it returns 0 to indicate that the function is not implemented.
 622  519  .Pp
 623  520  The
 624  521  .Fn regfree
 625  522  function returns no value.
 626  523  .Pp
 627  524  The following constants are defined as error return values:
 628  525  .Pp
 629  526  .Bl -tag -width "REG_ECOLLATE" -compact
 630  527  .It Dv REG_NOMATCH
 631  528  The
↓ open down ↓ 34 lines elided ↑ open up ↑
 666  563  .It Dv REG_ERANGE
 667  564  Invalid endpoint in range expression.
 668  565  .It Dv REG_ESPACE
 669  566  Out of memory.
 670  567  .It Dv REG_BADRPT
 671  568  .Qq \&? ,
 672  569  .Qq *
 673  570  or
 674  571  .Qq +
 675  572  not preceded by valid regular expression.
      573 +.It Dv REG_EMPTY
      574 +Empty (sub)expression.
      575 +.It Dv REG_INVARG
      576 +Invalid argument, e.g. negative-length string.
 676  577  .El
 677  578  .Sh USAGE
 678  579  An application could use:
 679  580  .Bd -literal -offset Ds
 680  581  regerror(code, preg, (char *)NULL, (size_t)0)
 681  582  .Ed
 682  583  .Pp
 683  584  to find out how big a buffer is needed for the generated string,
 684  585  .Fn malloc
 685  586  a buffer to hold the string, and then call
↓ open down ↓ 79 lines elided ↑ open up ↑
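
The USAGE text in the hunk above describes a two-step pattern: call regerror() with a NULL buffer and a size of 0 to learn the required length, then malloc() a buffer and call it again. A minimal sketch of that pattern follows, using only the POSIX interfaces named in the page. The helper names regex_strerror() and match_digits() and the pattern "^[0-9]+$" are hypothetical, chosen for illustration; they do not come from the change under review.

```c
#include <regex.h>
#include <stdlib.h>

/*
 * Hypothetical helper: return a malloc'd copy of the message for `code`,
 * or NULL on allocation failure. The caller frees the result.
 */
static char *
regex_strerror(int code, const regex_t *preg)
{
	/* First call: no buffer, size 0; returns bytes needed incl. NUL. */
	size_t need = regerror(code, preg, (char *)NULL, (size_t)0);
	char *buf;

	if (need == 0)
		return (NULL);
	if ((buf = malloc(need)) == NULL)
		return (NULL);

	/* Second call: fill the buffer that is now known to be big enough. */
	(void) regerror(code, preg, buf, need);
	return (buf);
}

/*
 * Hypothetical caller: match `s` against a fixed ERE of decimal digits.
 * Returns 0 on a match; otherwise returns the regcomp()/regexec() error
 * code and stores a malloc'd message (possibly NULL) in *errmsg.
 */
static int
match_digits(const char *s, char **errmsg)
{
	regex_t re;
	int rc;

	*errmsg = NULL;

	rc = regcomp(&re, "^[0-9]+$", REG_EXTENDED | REG_NOSUB);
	if (rc != 0) {
		/* Nothing to regfree() after a failed regcomp(). */
		*errmsg = regex_strerror(rc, &re);
		return (rc);
	}

	/* REG_NOSUB was given, so no match array is requested. */
	rc = regexec(&re, s, 0, NULL, 0);
	if (rc != 0)
		*errmsg = regex_strerror(rc, &re);

	regfree(&re);
	return (rc);
}
```

With this sketch, match_digits("12345", &msg) would return 0 and leave msg NULL, while match_digits("abc", &msg) would return REG_NOMATCH and set msg to an implementation-defined message that the caller must free().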