57 .\" This notice shall appear on any product containing this material.
58 .\"
59 .\" The contents of this file are subject to the terms of the
60 .\" Common Development and Distribution License (the "License").
61 .\" You may not use this file except in compliance with the License.
62 .\"
63 .\" You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
64 .\" or http://www.opensolaris.org/os/licensing.
65 .\" See the License for the specific language governing permissions
66 .\" and limitations under the License.
67 .\"
68 .\" When distributing Covered Code, include this CDDL HEADER in each
69 .\" file and include the License file at usr/src/OPENSOLARIS.LICENSE.
70 .\" If applicable, add the following below this CDDL HEADER, with the
71 .\" fields enclosed by brackets "[]" replaced with your own identifying
72 .\" information: Portions Copyright [yyyy] [name of copyright owner]
73 .\"
74 .\"
75 .\" Copyright (c) 1992, X/Open Company Limited. All Rights Reserved.
76 .\" Portions Copyright (c) 2003, Sun Microsystems, Inc. All Rights Reserved.
77 .\" Copyright 2017 Nexenta Systems, Inc.
78 .\"
79 .Dd June 14, 2017
80 .Dt REGCOMP 3C
81 .Os
82 .Sh NAME
83 .Nm regcomp ,
84 .Nm regexec ,
85 .Nm regerror ,
86 .Nm regfree
87 .Nd regular-expression library
88 .Sh LIBRARY
89 .Lb libc
90 .Sh SYNOPSIS
91 .In regex.h
92 .Ft int
93 .Fo regcomp
94 .Fa "regex_t *restrict preg" "const char *restrict pattern" "int cflags"
95 .Fc
96 .Ft int
97 .Fo regexec
98 .Fa "const regex_t *restrict preg" "const char *restrict string"
99 .Fa "size_t nmatch" "regmatch_t pmatch[restrict]" "int eflags"
153 .Pq EREs ,
154 rather than the basic regular expressions
155 .Pq BREs
156 that are the default.
157 .It Dv REG_BASIC
158 This is a synonym for 0, provided as a counterpart to
159 .Dv REG_EXTENDED
160 to improve readability.
161 .It Dv REG_NOSPEC
162 Compile with recognition of all special characters turned off.
163 All characters are thus considered ordinary, so the RE is a literal string.
164 This is an extension, compatible with but not specified by
165 .St -p1003.2 ,
166 and should be used with caution in software intended to be portable to other
167 systems.
168 .Dv REG_EXTENDED
169 and
170 .Dv REG_NOSPEC
171 may not be used in the same call to
172 .Fn regcomp .
173 .It Dv REG_ICASE
174 Compile for matching that ignores upper/lower case distinctions.
175 See
176 .Xr regex 5 .
177 .It Dv REG_NOSUB
178 Compile for matching that need only report success or failure,
179 not what was matched.
180 .It Dv REG_NEWLINE
181 Compile for newline-sensitive matching.
182 By default, newline is a completely ordinary character with no special
183 meaning in either REs or strings.
184 With this flag,
185 .Qq [^
186 bracket expressions and
187 .Qq \&.
188 never match newline,
189 a
190 .Qq \&^
191 anchor matches the null string after any newline in the string in addition to
192 its normal function, and the
465 .Pc
466 The
467 .Fn regerror
468 function places the NUL-terminated message into the buffer pointed to by
469 .Fa errbuf ,
470 limiting the length
471 .Pq including the NUL
472 to at most
473 .Fa errbuf_size
474 bytes.
475 If the whole message will not fit, as much of it as will fit before the
476 terminating NUL is supplied.
477 In any case, the returned value is the size of buffer needed to hold the whole
478 message
479 .Pq including terminating NUL .
480 If
481 .Fa errbuf_size
482 is 0,
483 .Fa errbuf
484 is ignored but the return value is still correct.
485 .Pp
486 If the
487 .Fa errcode
488 given to
489 .Fn regerror
490 is first ORed with
491 .Dv REG_ITOA ,
492 the
493 .Qq message
494 that results is the printable name of the error code, e.g.
495 .Qq Dv REG_NOMATCH ,
496 rather than an explanation thereof.
497 If
498 .Fa errcode
499 is
500 .Dv REG_ATOI ,
501 then
502 .Fa preg
503 shall be non-NULL and the
504 .Va re_endp
505 member of the structure it points to must point to the printable name of an
506 error code; in this case, the result in
507 .Fa errbuf
508 is the decimal digits of the numeric value of the error code
509 .Pq 0 if the name is not recognized .
510 .Dv REG_ITOA
511 and
512 .Dv REG_ATOI
513 are intended primarily as debugging facilities; they are extensions,
514 compatible with but not specified by
515 .St -p1003.2 ,
516 and should be used with caution in software intended to be portable to other
517 systems.
518 .Ss Fn regfree
519 The
520 .Fn regfree
521 function frees any dynamically-allocated storage associated with the compiled RE
522 pointed to by
523 .Fa preg .
524 The remaining
525 .Ft regex_t
526 is no longer a valid compiled RE and the effect of supplying it to
527 .Fn regexec
528 or
529 .Fn regerror
530 is undefined.
531 .Sh IMPLEMENTATION NOTES
532 There are a number of decisions that
533 .St -p1003.2
534 leaves up to the implementor,
535 either by explicitly saying
536 .Qq undefined
537 or by virtue of them being forbidden by the RE grammar.
538 This implementation treats them as follows.
539 .Pp
540 There is no particular limit on the length of REs, except insofar as memory is
541 limited.
542 Memory usage is approximately linear in RE size, and largely insensitive
543 to RE complexity, except for bounded repetitions.
544 .Pp
545 A backslashed character other than one specifically given a magic meaning by
546 .St -p1003.2
547 .Pq such magic meanings occur only in BREs
548 is taken as an ordinary character.
549 .Pp
550 Any unmatched
551 .Qq \&[
552 is a
553 .Dv REG_EBRACK
554 error.
555 .Pp
556 Equivalence classes cannot begin or end bracket-expression ranges.
557 The endpoint of one range cannot begin another.
558 .Pp
559 .Dv RE_DUP_MAX ,
560 the limit on repetition counts in bounded repetitions, is 255.
561 .Pp
562 A repetition operator
563 .Po
564 .Qq \&? ,
565 .Qq \&* ,
566 .Qq \&+ ,
567 or bounds
568 .Pc
569 cannot follow another repetition operator.
570 A repetition operator cannot begin an expression or subexpression
571 or follow
572 .Qq \&^
573 or
574 .Qq \&| .
575 .Pp
576 .Qq \&|
577 cannot appear first or last in a (sub)expression or after another
578 .Qq \&| ,
579 i.e., an operand of
580 .Qq \&|
581 cannot be an empty subexpression.
582 An empty parenthesized subexpression,
583 .Qq () ,
584 is legal and matches an empty (sub)string.
585 An empty string is not a legal RE.
586 .Pp
587 A
588 .Qq \&{
589 followed by a digit is considered the beginning of bounds for a bounded
590 repetition, which must then follow the syntax for bounds.
591 A
592 .Qq \&{
593 .Em not
594 followed by a digit is considered an ordinary character.
595 .Pp
596 .Qq \&^
597 and
598 .Qq \&$
599 beginning and ending subexpressions in BREs are anchors, not ordinary
600 characters.
601 .Sh RETURN VALUES
602 On successful completion, the
603 .Fn regcomp
604 function returns 0.
605 Otherwise, it returns an integer value indicating an error as described in
606 .In regex.h ,
607 and the content of preg is undefined.
608 .Pp
609 On successful completion, the
610 .Fn regexec
611 function returns 0.
612 Otherwise it returns
613 .Dv REG_NOMATCH
614 to indicate no match, or
615 .Dv REG_ENOSYS
616 to indicate that the function is not supported.
617 .Pp
618 Upon successful completion, the
619 .Fn regerror
620 function returns the number of bytes needed to hold the entire generated string.
621 Otherwise, it returns 0 to indicate that the function is not implemented.
622 .Pp
623 The
624 .Fn regfree
625 function returns no value.
626 .Pp
627 The following constants are defined as error return values:
628 .Pp
629 .Bl -tag -width "REG_ECOLLATE" -compact
630 .It Dv REG_NOMATCH
631 The
632 .Fn regexec
633 function failed to match.
634 .It Dv REG_BADPAT
635 Invalid regular expression.
636 .It Dv REG_ECOLLATE
637 Invalid collating element referenced.
638 .It Dv REG_ECTYPE
639 Invalid character class type referenced.
640 .It Dv REG_EESCAPE
641 Trailing
656 .Qq ()
657 imbalance.
658 .It Dv REG_EBRACE
659 .Qq \e{\e}
660 imbalance.
661 .It Dv REG_BADBR
662 Content of
663 .Qq \e{\e}
664 invalid: not a number, number too large, more than two
665 numbers, first larger than second.
666 .It Dv REG_ERANGE
667 Invalid endpoint in range expression.
668 .It Dv REG_ESPACE
669 Out of memory.
670 .It Dv REG_BADRPT
671 .Qq \&? ,
672 .Qq *
673 or
674 .Qq +
675 not preceded by valid regular expression.
676 .El
677 .Sh USAGE
678 An application could use:
679 .Bd -literal -offset Ds
680 regerror(code, preg, (char *)NULL, (size_t)0)
681 .Ed
682 .Pp
683 to find out how big a buffer is needed for the generated string,
684 .Fn malloc
685 a buffer to hold the string, and then call
686 .Fn regerror
687 again to get the string
688 .Po see
689 .Xr malloc 3C
690 .Pc .
691 Alternately, it could allocate a fixed, static buffer that is big enough to hold
692 most strings, and then use
693 .Fn malloc
694 to allocate a larger buffer if it finds that this is too small.
695 .Sh EXAMPLES
|
57 .\" This notice shall appear on any product containing this material.
58 .\"
59 .\" The contents of this file are subject to the terms of the
60 .\" Common Development and Distribution License (the "License").
61 .\" You may not use this file except in compliance with the License.
62 .\"
63 .\" You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
64 .\" or http://www.opensolaris.org/os/licensing.
65 .\" See the License for the specific language governing permissions
66 .\" and limitations under the License.
67 .\"
68 .\" When distributing Covered Code, include this CDDL HEADER in each
69 .\" file and include the License file at usr/src/OPENSOLARIS.LICENSE.
70 .\" If applicable, add the following below this CDDL HEADER, with the
71 .\" fields enclosed by brackets "[]" replaced with your own identifying
72 .\" information: Portions Copyright [yyyy] [name of copyright owner]
73 .\"
74 .\"
75 .\" Copyright (c) 1992, X/Open Company Limited. All Rights Reserved.
76 .\" Portions Copyright (c) 2003, Sun Microsystems, Inc. All Rights Reserved.
77 .\" Copyright 2018 Nexenta Systems, Inc.
78 .\"
79 .Dd February 3, 2018
80 .Dt REGCOMP 3C
81 .Os
82 .Sh NAME
83 .Nm regcomp ,
84 .Nm regexec ,
85 .Nm regerror ,
86 .Nm regfree
87 .Nd regular-expression library
88 .Sh LIBRARY
89 .Lb libc
90 .Sh SYNOPSIS
91 .In regex.h
92 .Ft int
93 .Fo regcomp
94 .Fa "regex_t *restrict preg" "const char *restrict pattern" "int cflags"
95 .Fc
96 .Ft int
97 .Fo regexec
98 .Fa "const regex_t *restrict preg" "const char *restrict string"
99 .Fa "size_t nmatch" "regmatch_t pmatch[restrict]" "int eflags"
153 .Pq EREs ,
154 rather than the basic regular expressions
155 .Pq BREs
156 that are the default.
157 .It Dv REG_BASIC
158 This is a synonym for 0, provided as a counterpart to
159 .Dv REG_EXTENDED
160 to improve readability.
161 .It Dv REG_NOSPEC
162 Compile with recognition of all special characters turned off.
163 All characters are thus considered ordinary, so the RE is a literal string.
164 This is an extension, compatible with but not specified by
165 .St -p1003.2 ,
166 and should be used with caution in software intended to be portable to other
167 systems.
168 .Dv REG_EXTENDED
169 and
170 .Dv REG_NOSPEC
171 may not be used in the same call to
172 .Fn regcomp .
173 .It Dv REG_LITERAL
174 An alias of
175 .Dv REG_NOSPEC .
176 .It Dv REG_ICASE
177 Compile for matching that ignores upper/lower case distinctions.
178 See
179 .Xr regex 5 .
180 .It Dv REG_NOSUB
181 Compile for matching that need only report success or failure,
182 not what was matched.
183 .It Dv REG_NEWLINE
184 Compile for newline-sensitive matching.
185 By default, newline is a completely ordinary character with no special
186 meaning in either REs or strings.
187 With this flag,
188 .Qq [^
189 bracket expressions and
190 .Qq \&.
191 never match newline,
192 a
193 .Qq \&^
194 anchor matches the null string after any newline in the string in addition to
195 its normal function, and the
468 .Pc
469 The
470 .Fn regerror
471 function places the NUL-terminated message into the buffer pointed to by
472 .Fa errbuf ,
473 limiting the length
474 .Pq including the NUL
475 to at most
476 .Fa errbuf_size
477 bytes.
478 If the whole message will not fit, as much of it as will fit before the
479 terminating NUL is supplied.
480 In any case, the returned value is the size of buffer needed to hold the whole
481 message
482 .Pq including terminating NUL .
483 If
484 .Fa errbuf_size
485 is 0,
486 .Fa errbuf
487 is ignored but the return value is still correct.
488 .Ss Fn regfree
489 The
490 .Fn regfree
491 function frees any dynamically-allocated storage associated with the compiled RE
492 pointed to by
493 .Fa preg .
494 The remaining
495 .Ft regex_t
496 is no longer a valid compiled RE and the effect of supplying it to
497 .Fn regexec
498 or
499 .Fn regerror
500 is undefined.
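As a minimal sketch of the lifetime rule above, the helper below compiles a pattern once, runs regexec() against several strings, and calls regfree() exactly once after the last use. The name count_matches(), its signature, and the choice of REG_EXTENDED | REG_NOSUB are assumptions made for illustration; they are not part of libc.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: count how many of the n strings match pattern.
 * Returns -1 if the pattern does not compile.
 */
int
count_matches(const char *pattern, const char *const *strings, size_t n)
{
	regex_t re;
	int count = 0;
	size_t i;

	/*
	 * On failure the content of re is undefined, so this sketch
	 * returns without calling regfree().
	 */
	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return (-1);

	for (i = 0; i < n; i++) {
		/* REG_NOSUB was given, so no match offsets are requested. */
		if (regexec(&re, strings[i], 0, NULL, 0) == 0)
			count++;
	}

	/* After this call re must not be passed to regexec() again. */
	regfree(&re);
	return (count);
}
```

Releasing the compiled RE in one place, after the loop, keeps every later use of re visibly out of scope of the valid lifetime.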
501 .Sh RETURN VALUES
502 On successful completion, the
503 .Fn regcomp
504 function returns 0.
505 Otherwise, it returns an integer value indicating an error as described in
506 .In regex.h ,
507 and the content of preg is undefined.
508 .Pp
509 On successful completion, the
510 .Fn regexec
511 function returns 0.
512 Otherwise it returns
513 .Dv REG_NOMATCH
514 to indicate no match.
515 .Pp
516 Upon successful completion, the
517 .Fn regerror
518 function returns the number of bytes needed to hold the entire generated string.
519 .Pp
520 The
521 .Fn regfree
522 function returns no value.
523 .Pp
524 The following constants are defined as error return values:
525 .Pp
526 .Bl -tag -width "REG_ECOLLATE" -compact
527 .It Dv REG_NOMATCH
528 The
529 .Fn regexec
530 function failed to match.
531 .It Dv REG_BADPAT
532 Invalid regular expression.
533 .It Dv REG_ECOLLATE
534 Invalid collating element referenced.
535 .It Dv REG_ECTYPE
536 Invalid character class type referenced.
537 .It Dv REG_EESCAPE
538 Trailing
553 .Qq ()
554 imbalance.
555 .It Dv REG_EBRACE
556 .Qq \e{\e}
557 imbalance.
558 .It Dv REG_BADBR
559 Content of
560 .Qq \e{\e}
561 invalid: not a number, number too large, more than two
562 numbers, first larger than second.
563 .It Dv REG_ERANGE
564 Invalid endpoint in range expression.
565 .It Dv REG_ESPACE
566 Out of memory.
567 .It Dv REG_BADRPT
568 .Qq \&? ,
569 .Qq *
570 or
571 .Qq +
572 not preceded by valid regular expression.
573 .It Dv REG_EMPTY
574 Empty (sub)expression.
575 .It Dv REG_INVARG
576 Invalid argument, e.g. negative-length string.
577 .El
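The return conventions above could be checked along the following lines. This is only a sketch: try_match() is a hypothetical wrapper, and folding every result other than 0 and REG_NOMATCH into a single failure value is an assumption of the example, not something the interface requires.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns 1 on a match, 0 on REG_NOMATCH, and -1 on
 * any other error.  If code is non-NULL, the raw value returned by
 * regcomp() or regexec() is stored through it.
 */
int
try_match(const char *pattern, const char *string, int *code)
{
	regex_t re;
	int rc;

	rc = regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB);
	if (rc != 0) {
		/* Compilation failed, e.g. REG_EBRACK or REG_EPAREN. */
		if (code != NULL)
			*code = rc;
		return (-1);
	}

	rc = regexec(&re, string, 0, NULL, 0);
	regfree(&re);
	if (code != NULL)
		*code = rc;

	if (rc == 0)
		return (1);
	if (rc == REG_NOMATCH)
		return (0);
	return (-1);
}
```

Keeping the raw code available lets the caller pass it to regerror() when a readable message is wanted.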
578 .Sh USAGE
579 An application could use:
580 .Bd -literal -offset Ds
581 regerror(code, preg, (char *)NULL, (size_t)0)
582 .Ed
583 .Pp
584 to find out how big a buffer is needed for the generated string,
585 .Fn malloc
586 a buffer to hold the string, and then call
587 .Fn regerror
588 again to get the string
589 .Po see
590 .Xr malloc 3C
591 .Pc .
592 Alternately, it could allocate a fixed, static buffer that is big enough to hold
593 most strings, and then use
594 .Fn malloc
595 to allocate a larger buffer if it finds that this is too small.
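The first approach described above, sizing the buffer with a zero-length call and then allocating it, might be sketched as follows. The helper name regerror_alloc() is hypothetical, and returning NULL on allocation failure is an assumption of this example.

```c
#include <regex.h>
#include <stdlib.h>

/*
 * Hypothetical helper: return a malloc()ed copy of the message for code,
 * or NULL if the allocation fails.  The caller frees the result.
 */
char *
regerror_alloc(int code, const regex_t *preg)
{
	size_t need;
	char *buf;

	/* First call: errbuf_size is 0, so errbuf is ignored. */
	need = regerror(code, preg, NULL, 0);
	if (need == 0)
		return (NULL);

	buf = malloc(need);
	if (buf == NULL)
		return (NULL);

	/* Second call: the buffer holds the whole message and its NUL. */
	(void) regerror(code, preg, buf, need);
	return (buf);
}
```

Because the size comes from regerror() itself, the message is never truncated, at the cost of one extra call per error.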
596 .Sh EXAMPLES
|