9083 replace regex implementation with tre

@@ -72,13 +72,13 @@
 .\" information: Portions Copyright [yyyy] [name of copyright owner]
 .\"
 .\"
 .\" Copyright (c) 1992, X/Open Company Limited. All Rights Reserved.
 .\" Portions Copyright (c) 2003, Sun Microsystems, Inc.  All Rights Reserved.
-.\" Copyright 2017 Nexenta Systems, Inc.
+.\" Copyright 2018 Nexenta Systems, Inc.
 .\"
-.Dd June 14, 2017
+.Dd February 3, 2018
 .Dt REGCOMP 3C
 .Os
 .Sh NAME
 .Nm regcomp ,
 .Nm regexec ,

@@ -168,10 +168,13 @@
 .Dv REG_EXTENDED
 and
 .Dv REG_NOSPEC
 may not be used in the same call to
 .Fn regcomp .
+.It Dv REG_LITERAL
+An alias of
+.Dv REG_NOSPEC .
 .It Dv REG_ICASE
 Compile for matching that ignores upper/lower case distinctions.
 See
 .Xr regex 5 .
 .It Dv REG_NOSUB

@@ -480,43 +483,10 @@
 If
 .Fa errbuf_size
 is 0,
 .Fa errbuf
 is ignored but the return value is still correct.
-.Pp
-If the
-.Fa errcode
-given to
-.Fn regerror
-is first ORed with
-.Dv REG_ITOA ,
-the
-.Qq message
-that results is the printable name of the error code, e.g.
-.Qq Dv REG_NOMATCH ,
-rather than an explanation thereof.
-If
-.Fa errcode
-is
-.Dv REG_ATOI ,
-then
-.Fa preg
-shall be non-NULL and the
-.Va re_endp
-member of the structure it points to must point to the printable name of an
-error code; in this case, the result in
-.Fa errbuf
-is the decimal digits of the numeric value of the error code
-.Pq 0 if the name is not recognized .
-.Dv REG_ITOA
-and
-.Dv REG_ATOI
-are intended primarily as debugging facilities; they are extensions,
-compatible with but not specified by
-.St -p1003.2 ,
-and should be used with caution in software intended to be portable to other
-systems.
 .Ss Fn regfree
 The
 .Fn regfree
 function frees any dynamically-allocated storage associated with the compiled RE
 pointed to by

@@ -526,80 +496,10 @@
 is no longer a valid compiled RE and the effect of supplying it to
 .Fn regexec
 or
 .Fn regerror
 is undefined.
-.Sh IMPLEMENTATION NOTES
-There are a number of decisions that
-.St -p1003.2
-leaves up to the implementor,
-either by explicitly saying
-.Qq undefined
-or by virtue of them being forbidden by the RE grammar.
-This implementation treats them as follows.
-.Pp
-There is no particular limit on the length of REs, except insofar as memory is
-limited.
-Memory usage is approximately linear in RE size, and largely insensitive
-to RE complexity, except for bounded repetitions.
-.Pp
-A backslashed character other than one specifically given a magic meaning by
-.St -p1003.2
-.Pq such magic meanings occur only in BREs
-is taken as an ordinary character.
-.Pp
-Any unmatched
-.Qq \&[
-is a
-.Dv REG_EBRACK
-error.
-.Pp
-Equivalence classes cannot begin or end bracket-expression ranges.
-The endpoint of one range cannot begin another.
-.Pp
-.Dv RE_DUP_MAX ,
-the limit on repetition counts in bounded repetitions, is 255.
-.Pp
-A repetition operator
-.Po
-.Qq \&? ,
-.Qq \&* ,
-.Qq \&+ ,
-or bounds
-.Pc
-cannot follow another repetition operator.
-A repetition operator cannot begin an expression or subexpression
-or follow
-.Qq \&^
-or
-.Qq \&| .
-.Pp
-.Qq \&|
-cannot appear first or last in a (sub)expression or after another
-.Qq \&| ,
-i.e., an operand of
-.Qq \&|
-cannot be an empty subexpression.
-An empty parenthesized subexpression,
-.Qq () ,
-is legal and matches an empty (sub)string.
-An empty string is not a legal RE.
-.Pp
-A
-.Qq \&{
-followed by a digit is considered the beginning of bounds for a bounded
-repetition, which must then follow the syntax for bounds.
-A
-.Qq \&{
-.Em not
-followed by a digit is considered an ordinary character.
-.Pp
-.Qq \&^
-and
-.Qq \&$
-beginning and ending subexpressions in BREs are anchors, not ordinary
-characters.
 .Sh RETURN VALUES
 On successful completion, the
 .Fn regcomp
 function returns 0.
 Otherwise, it returns an integer value indicating an error as described in

@@ -609,18 +509,15 @@
 On successful completion, the
 .Fn regexec
 function returns 0.
 Otherwise it returns
 .Dv REG_NOMATCH
-to indicate no match, or
-.Dv REG_ENOSYS
-to indicate that the function is not supported.
+to indicate no match.
 .Pp
 Upon successful completion, the
 .Fn regerror
 function returns the number of bytes needed to hold the entire generated string.
-Otherwise, it returns 0 to indicate that the function is not implemented.
 .Pp
 The
 .Fn regfree
 function returns no value.
 .Pp

@@ -671,10 +568,14 @@
 .Qq \&? ,
 .Qq *
 or
 .Qq +
 not preceded by valid regular expression.
+.It Dv REG_EMPTY
+Empty (sub)expression.
+.It Dv REG_INVARG
+Invalid argument, e.g. negative-length string.
 .El
 .Sh USAGE
 An application could use:
 .Bd -literal -offset Ds
 regerror(code, preg, (char *)NULL, (size_t)0)