1 LOCALE(5)             Standards, Environments, and Macros            LOCALE(5)
   2 
   3 
   4 
   5 NAME
   6        locale - subset of a user's environment that depends on language and
   7        cultural conventions
   8 
   9 DESCRIPTION
  10        A locale is the definition of the subset of a user's environment that
  11        depends on language and cultural conventions. It is made up of one or
  12        more categories. Each category is identified by its name and controls
  13        specific aspects of the behavior of components of the system. Category
  14        names correspond to the following environment variable names:
  15 
  16        LC_CTYPE
  17                       Character classification and case conversion.
  18 
  19 
  20        LC_COLLATE
  21                       Collation order.
  22 
  23 
  24        LC_TIME
  25                       Date and time formats.
  26 
  27 
  28        LC_NUMERIC
  29                       Numeric formatting.
  30 
  31 
  32        LC_MONETARY
  33                       Monetary formatting.
  34 
  35 
  36        LC_MESSAGES
  37                       Formats of informative and diagnostic messages and
  38                       interactive responses.
  39 
  40 
  41 
  42        The standard utilities base their behavior on the current locale, as
  43        defined in the ENVIRONMENT VARIABLES section for each utility. The
  44        behavior of some of the C-language functions will also be modified
  45        based on the current locale, as defined by the last call to
  46        setlocale(3C).
  47 
  48 
  49        Locales other than those supplied by the implementation can be created
  50        by the application via the localedef(1) utility. The value that is used
  51        to specify a locale when using environment variables will be the string
  52        specified as the name operand to localedef when the locale was
  53        created. The strings "C" and "POSIX" are reserved as identifiers for
  54        the POSIX locale.
  55 
  56 
  57        Applications can select the desired locale by invoking the setlocale()
  58        function with the appropriate value. If the function is invoked with an
  59        empty string, such as:
  60 
  61          setlocale(LC_ALL, "");
  62 
  63 
  64 
  65        the value of the corresponding environment variable is used. If the
  66        environment variable is unset or is set to the empty string, the
  67        setlocale() function sets the appropriate environment.
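
            The lookup performed for an empty locale string can be sketched as
            follows. This is a minimal illustration, not the library's code. It
            assumes the usual POSIX precedence (LC_ALL first, then the
            category's own variable, then LANG), which this page does not spell
            out. The helper name resolve_category is hypothetical.

```python
def resolve_category(env, category):
    """Return the locale name an empty-string setlocale() call would use.

    env is a mapping of environment variable names to values; category is
    a variable name such as "LC_TIME". The precedence below is an
    assumption taken from POSIX, not from this page.
    """
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name, "")
        if value:  # an unset variable and an empty one are treated alike
            return value
    return None  # nothing set: the implementation chooses the locale


env = {"LANG": "en_US.UTF-8", "LC_COLLATE": "C", "LC_TIME": ""}
print(resolve_category(env, "LC_COLLATE"))  # C
print(resolve_category(env, "LC_TIME"))     # en_US.UTF-8
```

            Returning None for the fully unset case mirrors the text above: the
            choice is left to setlocale() itself.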
  68 
  69    Locale Definition
  70        Locales can be described with the file format accepted by the localedef
  71        utility.
  72 
  73 
  74        The locale definition file must contain one or more locale category
  75        source definitions, and must not contain more than one definition for
  76        the same locale category.
  77 
  78 
  79        A category source definition consists of a category header, a category
  80        body and a category trailer. A category header consists of the
  81        character string naming the category, beginning with the characters
  82        LC_. The category trailer consists of the string END, followed by one
  83        or more blank characters and the string used in the corresponding
  84        category header.
  85 
  86 
  87        The category body consists of one or more lines of text. Each line
  88        contains an identifier, optionally followed by one or more operands.
  89        Identifiers are either keywords, identifying a particular locale
  90        element, or collating elements. Each keyword within a locale must have
  91        a unique name (that is, two categories cannot have a commonly-named
  92        keyword). No keyword can start with the characters LC_. Identifiers
  93        must be separated from the operands by one or more blank characters.
  94 
  95 
  96        Operands must be characters, collating elements, or strings of
  97        characters.  Strings must be enclosed in double-quotes ("). Literal
  98        double-quotes within strings must be preceded by the <escape
  99        character>, as described below. When a keyword is followed by more than
 100        one operand, the operands must be separated by semicolons (;). Blank
 101        characters are allowed both before and after a semicolon.
 102 
 103 
 104        The first category header in the file can be preceded by a line
 105        modifying the comment character. It has the following format, starting
 106        in column 1:
 107 
 108          "comment_char %c\n",<comment character>
 109 
 110 
 111 
 112        The comment character defaults to the number sign (#). Blank lines and
 113        lines containing the <comment character> in the first position are
 114        ignored.
 115 
 116 
 117        The first category header in the file can be preceded by a line
 118        modifying the escape character to be used in the file. It has the
 119        following format, starting in column 1:
 120 
 121          "escape_char %c\n",<escape character>
 122 
 123 
 124 
 125 
 126        The escape character defaults to backslash.
 127 
 128 
 129        A line can be continued by placing an escape character as the last
 130        character on the line; this continuation character will be discarded
 131        from the input.  Although the implementation need not accept any one
 132        portion of a continued line with a length exceeding {LINE_MAX} bytes,
 133        it places no limits on the accumulated length of the continued line.
 134        Comment lines cannot be continued on a subsequent line using an escaped
 135        newline character.
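
            The comment, escape and continuation rules can be illustrated with
            a small reader that turns physical lines into logical lines. This
            is a sketch under simplifying assumptions: the function name
            logical_lines is hypothetical, the two directives are accepted
            anywhere rather than only before the first category header, and a
            line ending in an escaped escape character is not treated
            specially.

```python
def logical_lines(text, comment_char="#", escape_char="\\"):
    """Yield logical lines: comments and blanks dropped, continuations joined."""
    pending = None  # None means "not inside a continued line"
    for raw in text.split("\n"):
        if pending is None:
            # Blank lines and lines with the comment character in the first
            # position are ignored; comment lines are never continued.
            if raw.strip() == "" or raw.startswith(comment_char):
                continue
            words = raw.split()
            if len(words) == 2 and words[0] == "comment_char":
                comment_char = words[1]
                continue
            if len(words) == 2 and words[0] == "escape_char":
                escape_char = words[1]
                continue
            pending = ""
        if raw.endswith(escape_char):
            # The continuation character itself is discarded from the input.
            pending += raw[: -len(escape_char)]
            continue
        yield pending + raw
        pending = None
    if pending is not None:
        yield pending


SAMPLE = """comment_char %
escape_char /
% this line is a comment
LC_TIME
abday "<S><u><n>";/
      "<M><o><n>"
END LC_TIME
"""
lines = list(logical_lines(SAMPLE))
for line in lines:
    print(line)
```

            The sample switches the comment character to % and the escape
            character to /, so the two abday operands end up on one logical
            line.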
 136 
 137 
 138        Individual characters, characters in strings, and collating elements
 139        must be represented using symbolic names, as defined below. In
 140        addition, characters can be represented using the characters themselves
 141        or as octal, hexadecimal or decimal constants. When non-symbolic
 142        notation is used, the resultant locale definitions will in many cases
 143        not be portable between systems. The left angle bracket (<) is a
 144        reserved symbol, denoting the start of a symbolic name; when used to
 145        represent itself it must be preceded by the escape character. The
 146        following rules apply to character representation:
 147 
 148            1.     A character can be represented via a symbolic name, enclosed
 149                   within angle brackets < and >. The symbolic name, including
 150                   the angle brackets, must exactly match a symbolic name
 151                   defined in the charmap file specified via the localedef -f
 152                   option, and will be replaced by a character value determined
 153                   from the value associated with the symbolic name in the
 154                   charmap file. The use of a symbolic name not found in the
 155                   charmap file constitutes an error, unless the category is
 156                   LC_CTYPE or LC_COLLATE, in which case it constitutes a
 157                   warning condition (see localedef(1) for a description of
 158                   action resulting from errors and warnings). The
 159                   specification of a symbolic name in a collating-element or
 160                   collating-symbol section that duplicates a symbolic name in
 161                   the charmap file (if present) is an error.  Use of the
 162                   escape character or a right angle bracket within a symbolic
 163                   name is invalid unless the character is preceded by the
 164                   escape character.
 165 
 166                   Example:
 167 
 168                     <C>;<c-cedilla> "<M><a><y>"
 169 
 170 
 171 
 172            2.     A character can be represented by the character itself, in
 173                   which case the value of the character is implementation-
 174                   dependent. Within a string, the double-quote character, the
 175                   escape character and the right angle bracket character must
 176                   be escaped (preceded by the escape character) to be
 177                   interpreted as the character itself. Outside strings, the
 178                   characters
 179 
 180                     ,     ;     <     >   escape_char
 181 
 182 
 183                   must be escaped to be interpreted as the character itself.
 184 
 185                   Example:
 186 
 187                     c       "May"
 188 
 189 
 190 
 191            3.     A character can be represented as an octal constant. An
 192                   octal constant is specified as the escape character followed
 193                   by two or more octal digits. Each constant represents a byte
 194                   value. Multi-byte values can be represented by concatenated
 195                   constants specified in byte order with the last constant
 196                   specifying the least significant byte of the character.
 197 
 198                   Example:
 199 
 200                     \143;\347;\143\150    "\115\141\171"
 201 
 202 
 203 
 204            4.     A character can be represented as a hexadecimal constant. A
 205                   hexadecimal constant is specified as the escape character
 206                   followed by an x followed by two or more hexadecimal digits.
 207                   Each constant represents a byte value.  Multi-byte values
 208                   can be represented by concatenated constants specified in
 209                   byte order with the last constant specifying the least
 210                   significant byte of the character.
 211 
 212                   Example:
 213 
 214                     \x63;\xe7;\x63\x68    "\x4d\x61\x79"
 215 
 216 
 217 
 218            5.     A character can be represented as a decimal constant. A
 219                   decimal constant is specified as the escape character
 220                   followed by a d followed by two or more decimal digits. Each
 221                   constant represents a byte value. Multi-byte values can be
 222                   represented by concatenated constants specified in byte
 223                   order with the last constant specifying the least
 224                   significant byte of the character.
 225 
 226                   Example:
 227 
 228                     \d99;\d231;\d99\d104   "\d77\d97\d121"
 229 
 230 
 231                   Only characters existing in the character set for which the
 232                   locale definition is created can be specified, whether using
 233                   symbolic names, the characters themselves, or octal, decimal
 234                   or hexadecimal constants. If a charmap file is present, only
 235                   characters defined in the charmap can be specified using
 236                   octal, decimal or hexadecimal constants. Symbolic names not
 237                   present in the charmap file can be specified and will be
 238                   ignored, as specified under item 1 above.
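
            The three numeric notations in items 3 to 5 can be checked against
            one another with a short decoder. This is an illustrative sketch,
            not part of localedef: the name decode_constants is hypothetical,
            the escape character is assumed to be the default backslash, and
            each constant is taken to name exactly one byte.

```python
import re

# One constant: backslash, then x + hex digits, d + decimal digits,
# or bare octal digits (two or more digits in every case).
_CONSTANT = re.compile(r"\\(?:x([0-9A-Fa-f]{2,})|d([0-9]{2,})|([0-7]{2,}))")


def decode_constants(operand):
    """Decode concatenated octal/hex/decimal constants into bytes."""
    out = bytearray()
    pos = 0
    while pos < len(operand):
        match = _CONSTANT.match(operand, pos)
        if match is None:
            raise ValueError(f"not a constant at offset {pos}: {operand[pos:]!r}")
        hex_digits, dec_digits, oct_digits = match.groups()
        if hex_digits is not None:
            value = int(hex_digits, 16)
        elif dec_digits is not None:
            value = int(dec_digits, 10)
        else:
            value = int(oct_digits, 8)
        if value > 0xFF:
            raise ValueError(f"constant does not fit in one byte: {match.group(0)}")
        out.append(value)  # constants are concatenated in byte order
        pos = match.end()
    return bytes(out)


print(decode_constants(r"\143\150"))      # b'ch'
print(decode_constants(r"\x63\x68"))      # b'ch'
print(decode_constants(r"\d99\d104"))     # b'ch'
print(decode_constants(r"\115\141\171"))  # b'May'
```

            All three spellings of the two-byte example yield the same bytes,
            which matches the examples given for items 3, 4 and 5.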
 239 
 240    LC_CTYPE
 241        The LC_CTYPE category defines character classification, case
 242        conversion and other character attributes. In addition, a series of
 243        characters can be represented by three adjacent periods representing an
 244        ellipsis symbol (...). The ellipsis specification is interpreted as
 245        meaning that all values between the values preceding and following it
 246        represent valid characters. The ellipsis specification is valid only
 247        within a single encoded character set, that is, within a group of
 248        characters of the same size. An ellipsis is interpreted as including in
 249        the list all characters with an encoded value higher than the encoded
 250        value of the character preceding the ellipsis and lower than the
 251        encoded value of the character following the ellipsis.
 252 
 253 
 254        Example:
 255 
 256          \x30;...;\x39;
 257 
 258 
 259 
 260 
 261        includes in the character class all characters with encoded values
 262        between the endpoints.
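
            The ellipsis rule can be sketched as a list expansion. The code
            below is an assumption-laden illustration: the name
            expand_ellipsis is hypothetical, operands are modelled as integers
            for single-byte encoded values, and the string "..." stands for
            the ellipsis symbol.

```python
def expand_ellipsis(operands):
    """Replace each "..." with the encoded values strictly between its neighbors."""
    result = []
    for index, item in enumerate(operands):
        if item != "...":
            result.append(item)
            continue
        if not result or index + 1 >= len(operands):
            raise ValueError("ellipsis needs a value on both sides")
        low, high = result[-1], operands[index + 1]
        if high == "..." or low >= high:
            raise ValueError("ellipsis endpoints must be ascending values")
        # Higher than the preceding value, lower than the following one.
        result.extend(range(low + 1, high))
    return result


digits = expand_ellipsis([0x30, "...", 0x39])
print(bytes(digits))  # b'0123456789'
```

            The endpoints are kept because they are ordinary operands; only the
            values between them are generated.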
 263 
 264 
 265        The following keywords are recognized. In the descriptions, the term
 266        "automatically included" means that it is not an error either to
 267        include or omit any of the referenced characters.
 268 
 269 
 270        The character classes digit, xdigit, lower, upper, and space have a set
 271        of automatically included characters. These only need to be specified
 272        if the character values (that is, encoding) differ from the
 273        implementation default values.
 274 
 275        upper
 276                          Define characters to be classified as upper-case
 277                          letters.
 278 
 279                          In the POSIX locale, the 26 upper-case letters are
 280                          included:
 281 
 282                            A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
 283 
 284 
 285                          In a locale definition file, no character specified
 286                          for the keywords cntrl, digit, punct, or space can be
 287                          specified. The upper-case letters A to Z are
 288                          automatically included in this class.
 289 
 290 
 291        lower
 292                          Define characters to be classified as lower-case
 293                          letters. In the POSIX locale, the 26 lower-case
 294                          letters are included:
 295 
 296                            a b c d e f g h i j k l m n o p q r s t u v w x y z
 297 
 298 
 299                          In a locale definition file, no character specified
 300                          for the keywords cntrl, digit, punct, or space can be
 301                          specified. The lower-case letters a to z of the
 302                          portable character set are automatically included in
 303                          this class.
 304 
 305 
 306        alpha
 307                          Define characters to be classified as letters.
 308 
 309                          In the POSIX locale, all characters in the classes
 310                          upper and lower are included.
 311 
 312                          In a locale definition file, no character specified
 313                          for the keywords cntrl, digit, punct, or space can be
 314                          specified.  Characters classified as either upper or
 315                          lower are automatically included in this class.
 316 
 317 
 318        digit
 319                          Define the characters to be classified as numeric
 320                          digits.
 321 
 322                          In the POSIX locale, only
 323 
 324                            0 1 2 3 4 5 6 7 8 9
 325 
 326 
 327                          are included.
 328 
 329                          In a locale definition file, only the digits 0, 1, 2,
 330                          3, 4, 5, 6, 7, 8, and 9 can be specified, and in
 331                          contiguous ascending sequence by numerical value. The
 332                          digits 0 to 9 of the portable character set are
 333                          automatically included in this class.
 334 
 335                          The definition of character class digit requires that
 336                          only ten characters, the ones defining digits, can be
 337                          specified; alternative digits (for example, Hindi or
 338                          Kanji) cannot be specified here.
 339 
 340 
 341        alnum
 342                          Define characters to be classified as letters and
 343                          numeric digits. Only the characters specified for the
 344                          alpha and digit keywords can be specified. Characters
 345                          specified for the keywords alpha and digit are
 346                          automatically included in this class.
 347 
 348 
 349        space
 350                          Define characters to be classified as white-space
 351                          characters.
 352 
 353                          In the POSIX locale, at a minimum, the characters
 354                          SPACE, FORMFEED, NEWLINE, CARRIAGE RETURN, TAB, and
 355                          VERTICAL TAB are included.
 356 
 357                          In a locale definition file, no character specified
 358                          for the keywords upper, lower, alpha, digit, graph,
 359                          or xdigit can be specified. The characters SPACE,
 360                          FORMFEED, NEWLINE, CARRIAGE RETURN, TAB, and
 361                          VERTICAL TAB of the portable character set, and any
 362                          characters included in the class blank are
 363                          automatically included in this class.
 364 
 365 
 366        cntrl
 367                          Define characters to be classified as control
 368                          characters.
 369 
 370                          In the POSIX locale, no characters in classes alpha
 371                          or print are included.
 372 
 373                          In a locale definition file, no character specified
 374                          for the keywords upper, lower, alpha, digit, punct,
 375                          graph, print, or xdigit can be specified.
 376 
 377 
 378        punct
 379                          Define characters to be classified as punctuation
 380                          characters.
 381 
 382                          In the POSIX locale, neither the space character nor
 383                          any characters in classes alpha, digit, or cntrl are
 384                          included.
 385 
 386                          In a locale definition file, no character specified
 387                          for the keywords upper, lower, alpha, digit, cntrl,
 388                          xdigit or as the space character can be specified.
 389 
 390 
 391        graph
 392                          Define characters to be classified as printable
 393                          characters, not including the space character.
 394 
 395                          In the POSIX locale, all characters in classes alpha,
 396                          digit, and punct are included; no characters in class
 397                          cntrl are included.
 398 
 399                          In a locale definition file, characters specified for
 400                          the keywords upper, lower, alpha, digit, xdigit, and
 401                          punct are automatically included in this class. No
 402                          character specified for the keyword cntrl can be
 403                          specified.
 404 
 405 
 406        print
 407                          Define characters to be classified as printable
 408                          characters, including the space character.
 409 
 410                          In the POSIX locale, all characters in class graph
 411                          are included; no characters in class cntrl are
 412                          included.
 413 
 414                          In a locale definition file, characters specified for
 415                          the keywords upper, lower, alpha, digit, xdigit,
 416                          punct, and the space character are automatically
 417                          included in this class. No character specified for
 418                          the keyword cntrl can be specified.
 419 
 420 
 421        xdigit
 422                          Define the characters to be classified as hexadecimal
 423                          digits.
 424 
 425                          In the POSIX locale, only:
 426 
 427                            0 1 2 3 4 5 6 7 8 9 A B C D E F a b c d e f
 428 
 429 
 430                          are included.
 431 
 432                          In a locale definition file, only the characters
 433                          defined for the class digit can be specified, in
 434                          contiguous ascending sequence by numerical value,
 435                          followed by one or more sets of six characters
 436                          representing the hexadecimal digits 10 to 15
 437                          inclusive, with each set in ascending order (for
 438                          example A, B, C, D, E, F, a, b, c, d, e, f). The
 439                          digits 0 to 9, the upper-case letters A to F and the
 440                          lower-case letters a to f of the portable character
 441                          set are automatically included in this class.
 442 
 443                          The definition of character class xdigit requires
 444                          that the characters included in character class digit
 445                          be included here also.
 446 
 447 
 448        blank
 449                          Define characters to be classified as blank
 450                          characters.
 451 
 452                          In the POSIX locale, only the space and tab
 453                          characters are included.
 454 
 455                          In a locale definition file, the characters space and
 456                          tab are automatically included in this class.
 457 
 458 
 459        charclass
 460                          Define one or more locale-specific character class
 461                          names as strings separated by semicolons. Each named
 462                          character class can then be defined subsequently in
 463                          the LC_CTYPE definition. A character class name
 464                          consists of at least one and at most
 465                          {CHARCLASS_NAME_MAX} bytes of alphanumeric characters
 466                          from the portable filename character set. The first
 467                          character of a character class name cannot be a
 468                          digit. The name cannot match any of the LC_CTYPE
 469                          keywords defined in this document.
 470 
 471 
 472        charclass-name
 473                          Define characters to be classified as belonging to
 474                          the named locale-specific character class. In the
 475                          POSIX locale, the locale-specific named character
 476                          classes need not exist. If a class name is defined by
 477                          a charclass keyword, but no characters are
 478                          subsequently assigned to it, this is not an error; it
 479                          represents a class without any characters belonging
 480                          to it. The charclass-name can be used as the property
 481                          argument to the wctype(3C) function, in regular
 482                          expression and shell pattern-matching bracket
 483                          expressions, and by the tr(1) command.
 484 
 485 
 486        toupper
 487                          Define the mapping of lower-case letters to upper-
 488                          case letters.
 489 
 490                          In the POSIX locale, at a minimum, the 26 lower-case
 491                          characters:
 492 
 493                            a b c d e f g h i j k l m n o p q r s t u v w x y z
 494 
 495 
 496                          are mapped to the corresponding 26 upper-case
 497                          characters:
 498 
 499                            A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
 500 
 501 
 502                          In a locale definition file, the operand consists of
 503                          character pairs, separated by semicolons. The
 504                          characters in each character pair are separated by a
 505                          comma and the pair enclosed by parentheses. The first
 506                          character in each pair is the lower-case letter, the
 507                          second the corresponding upper-case letter. Only
 508                          characters specified for the keywords lower and upper
 509                          can be specified. The lower-case letters a to z, and
 510                          their corresponding upper-case letters A to Z, of the
 511                          portable character set are automatically included in
 512                          this mapping, but only when the toupper keyword is
 513                          omitted from the locale definition.
 514 
 515 
 516        tolower
 517                          Define the mapping of upper-case letters to lower-
 518                          case letters.
 519 
 520                          In the POSIX locale, at a minimum, the 26 upper-case
 521                          characters:
 522 
 523                            A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
 524 
 525 
 526                          are mapped to the corresponding 26 lower-case
 527                          characters:
 528 
 529                            a b c d e f g h i j k l m n o p q r s t u v w x y z
 530 
 531 
 532                          In a locale definition file, the operand consists of
 533                          character pairs, separated by semicolons. The
 534                          characters in each character pair are separated by a
 535                          comma and the pair enclosed by parentheses. The first
 536                          character in each pair is the upper-case letter, the
 537                          second the corresponding lower-case letter. Only
 538                          characters specified for the keywords lower and upper
 539                          can be specified. If the tolower keyword is omitted
 540                          from the locale definition, the mapping will be the
 541                          reverse mapping of the one specified for toupper.
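
            The pair syntax shared by toupper and tolower, and the reverse
            mapping used when tolower is omitted, can be sketched as follows.
            The names parse_pairs and default_tolower are hypothetical, and
            the parser is deliberately simple: it does not handle escaped
            commas or parentheses inside a character representation.

```python
import re

# One "(first,second)" pair; blanks around the parts are tolerated.
_PAIR = re.compile(r"\(\s*([^,()\s]+)\s*,\s*([^,()\s]+)\s*\)")


def parse_pairs(operand):
    """Parse "(<a>,<A>);(<b>,<B>)" into a dict mapping first to second."""
    mapping = {}
    for chunk in operand.split(";"):
        match = _PAIR.fullmatch(chunk.strip())
        if match is None:
            raise ValueError(f"not a character pair: {chunk!r}")
        mapping[match.group(1)] = match.group(2)
    return mapping


def default_tolower(toupper):
    """Build the reverse of a toupper mapping, as used when tolower is omitted."""
    return {upper: lower for lower, upper in toupper.items()}


toupper = parse_pairs("(<a>,<A>);(<b>,<B>); (<c-cedilla>,<C-cedilla>)")
tolower = default_tolower(toupper)
print(tolower["<C-cedilla>"])  # <c-cedilla>
```

            Reversing a dictionary is only well defined when no two lower-case
            letters share an upper-case letter; the sketch does not check for
            that case.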
 542 
 543 
 544    LC_COLLATE
 545        The LC_COLLATE category provides a collation sequence definition for
 546        numerous utilities (such as sort(1), uniq(1), and so forth), regular
 547        expression matching (see regex(5)), and the strcoll(3C), strxfrm(3C),
 548        wcscoll(3C), and wcsxfrm(3C) functions.
 549 
 550 
 551        A collation sequence definition defines the relative order between
 552        collating elements (characters and multi-character collating elements)
 553        in the locale.  This order is expressed in terms of collation values,
 554        that is, by assigning each element one or more collation values (also
 555        known as collation weights).  The following capabilities are provided:
 556 
 557            1.     Multi-character collating elements. Specification of multi-
 558                   character collating elements (that is, sequences of two or
 559                   more characters to be collated as an entity).
 560 
 561            2.     User-defined ordering of collating elements. Each collating
 562                   element is assigned a collation value defining its order in
 563                   the character (or basic) collation sequence. This ordering
 564                   is used by regular expressions and pattern matching and,
 565                   unless collation weights are explicitly specified, also as
 566                   the collation weight to be used in sorting.
 567 
 568            3.     Multiple weights and equivalence classes. Collating elements
 569                   can be assigned one or more (up to the limit
 570                   {COLL_WEIGHTS_MAX}) collating weights for use in sorting.
 571                   The first weight is hereafter referred to as the primary
 572                   weight.
 573 
 574            4.     One-to-Many mapping. A single character is mapped into a
 575                   string of collating elements.
 576 
 577            5.     Equivalence class definition. Two or more collating elements
 578                   have the same collation value (primary weight).
 579 
 580            6.     Ordering by weights. When two strings are compared to
 581                   determine their relative order, the two strings are first
 582                   broken up into a series of collating elements. The elements
 583                   in each successive pair of elements are then compared
 584                   according to the relative primary weights for the elements.
 585                   If equal, and more than one weight has been assigned, the
 586                   pairs of collating elements are recompared according to the
 587                   relative subsequent weights, until either a pair of
 588                   collating elements compare unequal or the weights are
 589                   exhausted.
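
            The comparison described in item 6 can be sketched with a small
            weight table. Everything here is hypothetical: the table WEIGHTS,
            the function compare, and the choice of two levels. Each string is
            treated as a sequence of single-character collating elements, and
            every level is compared left to right.

```python
# Hypothetical weights: (primary, secondary) per collating element.
# Upper and lower case share a primary weight and differ at the second level.
WEIGHTS = {
    "a": (1, 1), "A": (1, 2),
    "b": (2, 1), "B": (2, 2),
}


def compare(a, b, weights, levels):
    """Return -1, 0 or 1 for two sequences of collating elements."""
    for level in range(levels):
        keys_a = [weights[element][level] for element in a]
        keys_b = [weights[element][level] for element in b]
        if keys_a != keys_b:
            # The first level at which the weight sequences differ decides.
            return -1 if keys_a < keys_b else 1
    return 0  # weights exhausted without finding a difference


print(compare("ab", "Ab", WEIGHTS, 2))  # -1
print(compare("B", "ab", WEIGHTS, 2))   # 1
print(compare("ab", "Ab", WEIGHTS, 1))  # 0
```

            With only the primary weight in play, "ab" and "Ab" compare equal,
            which is the equivalence-class behavior described in item 5.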
 590 
 591 
 592        The following keywords are recognized in a collation sequence
 593        definition. They are described in detail in the following sections.
 594 
 595        copy
 596                             Specify the name of an existing locale which is
 597                             used as the definition of this category. If this
 598                             keyword is specified, no other keyword can be
 599                             specified.
 600 
 601 
 602        collating-element
 603                             Define a collating-element symbol representing a
 604                             multi-character collating element. This keyword is
 605                             optional.
 606 
 607 
 608        collating-symbol
 609                             Define a collating symbol for use in collation
 610                             order statements. This keyword is optional.
 611 
 612 
 613        order_start
 614                             Define collation rules. This statement is followed
 615                             by one or more collation order statements,
 616                             assigning character collation values and collation
 617                             weights to collating elements.
 618 
 619 
 620        order_end
 621                             Specify the end of the collation-order statements.
 622 
 623 
 624    collating-element keyword
 625        In addition to the collating elements in the character set, the
 626        collating-element keyword is used to define multi-character collating
 627        elements. The syntax is:
 628 
 629          "collating-element %s from \"%s\"\n",<collating-symbol>,<string>
 630 
 631 
 632 
 633        The <collating-symbol> operand is a symbolic name, enclosed between
 634        angle brackets (< and >), and must not duplicate any symbolic name in
 635        the current charmap file (if any), or any other symbolic name defined
 636        in this collation definition. The string operand is a string of two or
 637        more characters that collates as an entity. A <collating-element>
 638        defined via this keyword is only recognized with the LC_COLLATE
 639        category.
 640 
 641 
 642        Example:
 643          collating-element <ch>   from "<c><h>"
 644          collating-element <e-acute> from "<acute><e>"
 645          collating-element <ll>   from "ll"
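
            How a string might be broken into collating elements once
            multi-character elements such as <ch> and <ll> are defined can be
            sketched as below. The function split_elements is hypothetical,
            and preferring the longest matching element at each position is an
            assumption of this sketch, not a rule stated on this page.

```python
def split_elements(text, multi):
    """Split text into collating elements, trying longer elements first."""
    # Only strings of two or more characters qualify as multi-character
    # collating elements.
    ordered = sorted((m for m in multi if len(m) >= 2), key=len, reverse=True)
    elements = []
    pos = 0
    while pos < len(text):
        for element in ordered:
            if text.startswith(element, pos):
                elements.append(element)
                pos += len(element)
                break
        else:
            # No multi-character element starts here: take one character.
            elements.append(text[pos])
            pos += 1
    return elements


print(split_elements("chill", ["ch", "ll"]))  # ['ch', 'i', 'll']
```

            Each item of the resulting list would then be looked up as a single
            entity in the collation order.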
 646 
 647    collating-symbol keyword
 648        This keyword will be used to define symbols for use in collation
 649        sequence statements; that is, between the order_start and the order_end
 650        keywords. The syntax is:
 651 
 652          "collating-symbol %s\n",<collating-symbol>
 653 
 654 
 655 
 656        The <collating-symbol> is a symbolic name, enclosed between angle
 657        brackets (< and >), and must not duplicate any symbolic name in the
 658        current charmap file (if any), or any other symbolic name defined in
 659        this collation definition.
 660 
 661 
 662        A collating-symbol defined via this keyword is only recognized with the
 663        LC_COLLATE category.
 664 
 665 
 666        Example:
 667          collating-symbol <UPPER_CASE>
 668          collating-symbol <HIGH>
 669 
 670 
 671        The collating-symbol keyword defines a symbolic name that can be
 672        associated with a relative position in the character order sequence.
 673        While such a symbolic name does not represent any collating element, it
 674        can be used as a weight.
 675 
 676    order_start keyword
 677        The order_start keyword must precede collation order entries and also
 678        defines the number of weights for this collation sequence definition
 679        and other collation rules.
 680 
 681 
 682        The syntax of the order_start keyword is:
 683 
 684          "order_start %s;%s;...;%s\n",<sort-rules>,<sort-rules>
 685 
 686 
 687 
 688        The operands to the order_start keyword are optional. If present, the
 689        operands define rules to be applied when strings are compared. The
 690        number of operands defines how many weights each element is assigned. If
 691        no operands are present, one forward operand is assumed. If present,
 692        the first operand defines rules to be applied when comparing strings
 693        using the first (primary) weight; the second when comparing strings
 694        using the second weight, and so on.  Operands are separated by
 695        semicolons (;). Each operand consists of one or more collation
 696        directives, separated by commas (,). If the number of operands exceeds
 697        the {COLL_WEIGHTS_MAX} limit, the utility will issue a warning message.
 698        The following directives will be supported:
 699 
 700        forward
 701                    Specifies that comparison operations for the weight level
 702                    proceed from start of string towards the end of string.
 703 
 704 
 705        backward
 706                    Specifies that comparison operations for the weight level
 707                    proceed from end of string towards the beginning of string.
 708 
 709 
 710        position
 711                    Specifies that comparison operations for the weight level
 712                    will consider the relative position of elements in the
 713                    strings not subject to IGNORE. The string containing an
 714                    element not subject to IGNORE after the fewest collating
 715                    elements subject to IGNORE from the start of the compare
 716                    will collate first. If both strings contain a character not
 717                    subject to IGNORE in the same relative position, the
 718                    collating values assigned to the elements will determine
 719                    the ordering. In case of equality, subsequent characters
 720                    not subject to IGNORE are considered in the same manner.
 721 
 722 
 723 
 724        The directives forward and backward are mutually exclusive.
 725 
 726 
 727        Example:
 728 
 729          order_start    forward;backward
 730 
 731 
 732 
 733 
 734        If no operands are specified, a single forward operand is assumed.
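
            The operand rules above can be sketched as a small Python parser.
            It is a sketch under stated assumptions, not the localedef(1)
            parser: the name parse_order_start is hypothetical, and the
            {COLL_WEIGHTS_MAX} check is left out because its value is
            implementation-defined.

```python
def parse_order_start(line):
    """Return one set of directives per weight level for an order_start line."""
    parts = line.split(None, 1)
    if not parts or parts[0] != "order_start":
        raise ValueError("not an order_start line")
    # With no operands, a single forward operand is assumed.
    operands = parts[1].strip() if len(parts) > 1 else "forward"
    levels = []
    for operand in operands.split(";"):
        # Each operand is one or more directives separated by commas.
        directives = {directive.strip() for directive in operand.split(",")}
        unknown = directives - {"forward", "backward", "position"}
        if unknown:
            raise ValueError("unknown directive(s): %s" % sorted(unknown))
        if {"forward", "backward"} <= directives:
            raise ValueError("forward and backward are mutually exclusive")
        levels.append(directives)
    return levels
```

            For the example line above, the sketch returns two levels: the
            first holds forward and the second holds backward.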
 735 
 736    Collation Order
 737        The order_start keyword is followed by collating identifier entries.
 738        The syntax for the collating element entries is:
 739 
 740          "%s %s;%s;...;%s\n",<collating-identifier>,<weight>,<weight>,...
 741 
 742 
 743 
 744        Each collating-identifier consists of either a character described in
 745        Locale Definition above, a <collating-element>, a <collating-symbol>,
 746        an ellipsis, or the special symbol UNDEFINED. The order in which
 747        collating elements are specified determines the character order
 748        sequence, such that each collating element compares less than the
 749        elements following it. The NUL character compares lower than any other
 750        character.
 751 
 752 
 753        A <collating-element> is used to specify multi-character collating
 754        elements, and indicates that the character sequence specified via the
 755        <collating-element> is to be collated as a unit and in the relative
 756        order specified by its place.
 757 
 758 
 759        A <collating-symbol> is used to define a position in the relative order
 760        for use in weights. No weights are specified with a <collating-symbol>.
 761 
 762 
 763        The ellipsis symbol specifies that a sequence of characters will
 764        collate according to their encoded character values. It is interpreted
 765        as indicating that all characters with a coded character set value
 766        higher than the value of the character in the preceding line, and lower
 767        than the coded character set value for the character in the following
 768        line, in the current coded character set, will be placed in the
 769        character collation order between the previous and the following
 770        character in ascending order according to their coded character set
 771        values. An initial ellipsis is interpreted as if the preceding line
 772        specified the NUL character, and a trailing ellipsis as if the
 773        following line specified the highest coded character set value in the
 774        current coded character set. An ellipsis is treated as invalid if the
 775        preceding or following lines do not specify characters in the current
 776        coded character set. The use of the ellipsis symbol ties the definition
 777        to a specific coded character set and may preclude the definition from
 778        being portable between implementations.
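
            The expansion of an ellipsis between two characters can be
            sketched as follows. The sketch assumes that Python code points
            stand in for coded character set values, and the name
            expand_ellipsis is hypothetical. A trailing ellipsis is not
            handled, because the highest value depends on the coded character
            set in use.

```python
def expand_ellipsis(previous_char, next_char):
    """List the characters an ellipsis stands for, in ascending code order.

    previous_char may be None for an initial ellipsis, which is read as if
    the preceding line had specified the NUL character.
    """
    low = 0 if previous_char is None else ord(previous_char)
    high = ord(next_char)
    if low >= high:
        # Rejecting reversed bounds is a choice made by this sketch.
        raise ValueError("ellipsis bounds must be in ascending code order")
    # Only the characters strictly between the two bounds are implied.
    return [chr(code) for code in range(low + 1, high)]
```

            An ellipsis between <a> and <e> therefore stands for b, c, and d,
            in that order.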
 779 
 780 
 781        The symbol UNDEFINED is interpreted as including all coded character
 782        set values not specified explicitly or via the ellipsis symbol. Such
 783        characters are inserted in the character collation order at the point
 784        indicated by the symbol, and in ascending order according to their
 785        coded character set values. If no UNDEFINED symbol is specified, and
 786        the current coded character set contains characters not specified in
 787        this section, the utility will issue a warning message and place such
 788        characters at the end of the character collation order.
 789 
 790 
 791        The optional operands for each collation-element are used to define the
 792        primary, secondary, or subsequent weights for the collating element.
 793        The first operand specifies the relative primary weight, the second the
 794        relative secondary weight, and so on. Two or more collation-elements
 795        can be assigned the same weight; they belong to the same equivalence
 796        class if they have the same primary weight. Collation behaves as if,
 797        for each weight level, elements subject to IGNORE are removed, unless
 798        the position collation directive is specified for the corresponding
 799        level with the order_start keyword. Then each successive pair of
 800        elements is compared according to the relative weights for the
 801        elements. If the two strings compare equal, the process is repeated for
 802        the next weight level, up to the limit {COLL_WEIGHTS_MAX}.
 803 
 804 
 805        Weights are expressed as characters described in Locale Definition
 806        above, <collating-symbol>s, <collating-element>s, an ellipsis, or the
 807        special symbol IGNORE. A single character, a <collating-symbol>, or a
 808        <collating-element> represents the relative position in the character
 809        collating sequence of the character or symbol, rather than the
 810        character or characters themselves. Thus, rather than assigning
 811        absolute values to weights, a particular weight is expressed using the
 812        relative order value assigned to a collating element based on its order
 813        in the character collation sequence.
 814 
 815 
 816        One-to-many mapping is indicated by specifying two or more concatenated
 817        characters or symbolic names. For example, if the character <eszet> is
 818        given the string "<s><s>" as a weight, comparisons are performed as if
 819        all occurrences of the character <eszet> are replaced by <s><s>
 820        (assuming that <s> has the collating weight <s>). If it is necessary to
 821        define <eszet> and <s><s> as an equivalence class, then a collating
 822        element must be defined for the string ss.
 823 
 824 
 825        All characters specified via an ellipsis will by default be assigned
 826        unique weights, equal to the relative order of characters. Characters
 827        specified via an explicit or implicit UNDEFINED special symbol will by
 828        default be assigned the same primary weight (that is, belong to the
 829        same equivalence class). An ellipsis symbol as a weight is interpreted
 830        to mean that each character in the sequence has unique weights, equal
 831        to the relative order of their character in the character collation
 832        sequence. The use of the ellipsis as a weight is treated as an error if
 833        the collating element is neither an ellipsis nor the special symbol
 834        UNDEFINED.
 835 
 836 
 837        The special keyword IGNORE as a weight indicates that when strings are
 838        compared using the weights at the level where IGNORE is specified, the
 839        collating element is ignored; that is, as if the string did not contain
 840        the collating element. In regular expressions and pattern matching, all
 841        characters that are subject to IGNORE in their primary weight form an
 842        equivalence class.
 843 
 844 
 845        An empty operand is interpreted as the collating element itself.
 846 
 847 
 848        For example, the order statement:
 849 
 850          <a>   <a>;<a>
 851 
 852 
 853 
 854 
 855        is equal to:
 856 
 857          <a>
 858 
 859 
 860 
 861 
 862        An ellipsis can be used as an operand if the collating element was an
 863        ellipsis, and is interpreted as the value of each character defined by
 864        the ellipsis.
 865 
 866 
 867        The collation order as defined in this section defines the
 868        interpretation of bracket expressions in regular expressions.
 869 
 870 
 871        Example:
 872 
 873 
 874 
 875 
 876          order_start   forward;backward
 877          UNDEFINED     IGNORE;IGNORE
 878          <LOW>
 879          <space>       <LOW>;<space>
 880          ...           <LOW>;...
 881          <a>           <a>;<a>
 882          <a-acute>     <a>;<a-acute>
 883          <a-grave>     <a>;<a-grave>
 884          <A>           <a>;<A>
 885          <A-acute>     <a>;<A-acute>
 886          <A-grave>     <a>;<A-grave>
 887          <ch>          <ch>;<ch>
 888          <Ch>          <ch>;<Ch>
 889          <s>           <s>;<s>
 890          <eszet>       "<s><s>";"<eszet><eszet>"
 891          order_end
 892 
 893 
 894 
 895        This example is interpreted as follows:
 896 
 897            1.     The UNDEFINED means that all characters not specified in
 898                   this definition (explicitly or via the ellipsis) are ignored
 899                   for collation purposes; for regular expression purposes they
 900                   are ordered first.
 901 
 902            2.     All characters between <space> and <a> have the same primary
 903                   equivalence class and individual secondary weights based on
 904                   their ordinal encoded values.
 905 
 906            3.     All characters based on the upper- or lower-case character a
 907                   belong to the same primary equivalence class.
 908 
 909            4.     The multi-character collating element <ch> is represented by
 910                   the collating symbol <ch> and belongs to the same primary
 911                   equivalence class as the multi-character collating element
 912                   <Ch>.
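
            A two-level comparison in the spirit of this example can be
            sketched in Python. The sketch is a simplification and not a
            description of strcoll(3C): the names ORDER, WEIGHTS, and compare
            are hypothetical, and the ellipsis range, <ch>, <Ch>, and <eszet>
            are left out. Characters missing from WEIGHTS are treated as
            UNDEFINED, that is, ignored at both levels.

```python
# Relative positions, taken from the order of the entries in the example.
ORDER = ["<LOW>", "<space>", "<a>", "<a-acute>", "<a-grave>",
         "<A>", "<A-acute>", "<A-grave>", "<s>"]
POSITION = {name: index for index, name in enumerate(ORDER)}

# (primary, secondary) weights per character, as in the example.
WEIGHTS = {
    " ": ("<LOW>", "<space>"),
    "a": ("<a>", "<a>"),
    "\u00e1": ("<a>", "<a-acute>"),   # a with acute accent
    "\u00e0": ("<a>", "<a-grave>"),   # a with grave accent
    "A": ("<a>", "<A>"),
    "\u00c1": ("<a>", "<A-acute>"),   # A with acute accent
    "\u00c0": ("<a>", "<A-grave>"),   # A with grave accent
    "s": ("<s>", "<s>"),
}


def level_weights(text, level):
    """Relative weights of text at one level, with ignored characters removed."""
    return [POSITION[WEIGHTS[ch][level]] for ch in text if ch in WEIGHTS]


def collation_key(text):
    primary = level_weights(text, 0)           # first level: forward
    secondary = level_weights(text, 1)[::-1]   # second level: backward
    return (primary, secondary)


def compare(left, right):
    """Return -1, 0, or 1; the second level is used only to break primary ties."""
    key_left, key_right = collation_key(left), collation_key(right)
    return (key_left > key_right) - (key_left < key_right)
```

            In this sketch "a" and "A" tie at the primary level and are
            separated only by their secondary weights, and a character such as
            "b", which the simplified table does not define, has no effect on
            the result.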
 913 
 914    order_end keyword
 915        The collating order entries must be terminated with an order_end
 916        keyword.
 917 
 918    LC_MONETARY
 919        The LC_MONETARY category defines the rules and symbols that are used
 920        to format monetary numeric information. This information is available
 921        through the localeconv(3C) function.
 922 
 923 
 924        The following items are defined in this category of the locale. The
 925        item names are the keywords recognized by the localedef(1) utility when
 926        defining a locale. They are also similar to the member names of the
 927        lconv structure defined in <locale.h>. The localeconv function returns
 928        {CHAR_MAX} for unspecified integer items and the empty string ("") for
 929        unspecified or size zero string items.
 930 
 931 
 932        In a locale definition file the operands are strings. For some
 933        keywords, the strings can contain only integers. Keywords that are not
 934        provided, string values set to the empty string (""), or integer
 935        keywords set to -1, are used to indicate that the value is not
 936        available in the locale.
 937 
 938        int_curr_symbol
 939                              The international currency symbol. The operand is
 940                              a four-character string, with the first three
 941                              characters containing the alphabetic
 942                              international currency symbol in accordance with
 943                              those specified in the ISO 4217 standard. The
 944                              fourth character is the character used to
 945                              separate the international currency symbol from
 946                              the monetary quantity.
 947 
 948 
 949        currency_symbol
 950                              The string used as the local currency symbol.
 951 
 952 
 953        mon_decimal_point
 954                              The operand is a string containing the symbol
 955                              that is used as the decimal delimiter (radix
 956                              character) in monetary formatted quantities.
 957 
 958 
 959        mon_thousands_sep
 960                              The operand is a string containing the symbol
 961                              that is used as a separator for groups of digits
 962                              to the left of the decimal delimiter in formatted
 963                              monetary quantities.
 964 
 965 
 966        mon_grouping
 967                              Define the size of each group of digits in
 968                              formatted monetary quantities. The operand is a
 969                              sequence of integers separated by semicolons.
 970                              Each integer specifies the number of digits in
 971                              each group, with the initial integer defining the
 972                              size of the group immediately preceding the
 973                              decimal delimiter, and the following integers
 974                              defining the preceding groups. If the last
 975                              integer is not -1, then the size of the previous
 976                              group (if any) will be repeatedly used for the
 977                              remainder of the digits. If the last integer is
 978                              -1, then no further grouping will be performed.
 979 
 980                              The following is an example of the interpretation
 981                              of the mon_grouping keyword. Assuming that the
 982                              value to be formatted is 123456789 and the
 983                              mon_thousands_sep is ', then the following table
 984                              shows the result. The third column shows the
 985                              equivalent string in the ISO C standard that
 986                              would be used by the localeconv function to
 987                              accommodate this grouping.
 988 
 989                                mon_grouping   Formatted Value  ISO C String
 990 
 991                                3;-1           123456'789       "\3\177"
 992                                3              123'456'789      "\3"
 993                                3;2;-1         1234'56'789      "\3\2\177"
 994                                3;2            12'34'56'789     "\3\2"
 995                                -1             123456789        "\177"
 996 
 997 
 998                              In these examples, the octal value of {CHAR_MAX}
 999                              is 177.
1000 
1001 
1002        positive_sign
1003                              A string used to indicate a non-negative-valued
1004                              formatted monetary quantity.
1005 
1006 
1007        negative_sign
1008                              A string used to indicate a negative-valued
1009                              formatted monetary quantity.
1010 
1011 
1012        int_frac_digits
1013                              An integer representing the number of fractional
1014                              digits (those to the right of the decimal
1015                              delimiter) to be written in a formatted monetary
1016                              quantity using int_curr_symbol.
1017 
1018 
1019        frac_digits
1020                              An integer representing the number of fractional
1021                              digits (those to the right of the decimal
1022                              delimiter) to be written in a formatted monetary
1023                              quantity using currency_symbol.
1024 
1025 
1026        p_cs_precedes
1027                              In an application conforming to the SUSv3
1028                              standard, an integer set to 1 if the
1029                              currency_symbol precedes the value for a monetary
1030                              quantity with a non-negative value, and set to 0
1031                              if the symbol succeeds the value.
1032 
1033                              In an application not conforming to the SUSv3
1034                              standard, an integer set to 1 if the
1035                              currency_symbol or int_curr_symbol precedes
1036                              the value for a monetary quantity with a non-
1037                              negative value, and set to 0 if the symbol
1038                              succeeds the value.
1039 
1040 
1041        p_sep_by_space
1042                              In an application conforming to the SUSv3
1043                              standard, an integer set to 0 if no space
1044                              separates the currency_symbol from the value for
1045                              a monetary quantity with a non-negative value,
1046                              set to 1 if a space separates the symbol from the
1047                              value, and set to 2 if a space separates the
1048                              symbol and the sign string, if adjacent.
1049 
1050                              In an application not conforming to the SUSv3
1051                              standard, an integer set to 0 if no space
1052                              separates the currency_symbol or int_curr_symbol
1053                              from the value for a monetary quantity with a
1054                              non-negative value, set to 1 if a space separates
1055                              the symbol from the value, and set to 2 if a
1056                              space separates the symbol and the sign string,
1057                              if adjacent.
1058 
1059 
1060        n_cs_precedes
1061                              In an application conforming to the SUSv3
1062                              standard, an integer set to 1 if the
1063                              currency_symbol precedes the value for a monetary
1064                              quantity with a negative value, and set to 0 if
1065                              the symbol succeeds the value.
1066 
1067                              In an application not conforming to the SUSv3
1068                              standard, an integer set to 1 if the
1069                              currency_symbol or int_curr_symbol precedes
1070                              the value for a monetary quantity with a negative
1071                              value, and set to 0 if the symbol succeeds the
1072                              value.
1073 
1074 
1075        n_sep_by_space
1076                              In an application conforming to the SUSv3
1077                              standard, an integer set to 0 if no space
1078                              separates the currency_symbol from the value for
1079                              a monetary quantity with a negative value, set to
1080                              1 if a space separates the symbol from the value,
1081                              and set to 2 if a space separates the symbol and
1082                              the sign string, if adjacent.
1083 
1084                              In an application not conforming to the SUSv3
1085                              standard, an integer set to 0 if no space
1086                              separates the currency_symbol or int_curr_symbol
1087                              from the value for a monetary quantity with a
1088                              negative value, set to 1 if a space separates the
1089                              symbol from the value, and set to 2 if a space
1090                              separates the symbol and the sign string, if
1091                              adjacent.
1092 
1093 
1094        p_sign_posn
1095                              An integer set to a value indicating the
1096                              positioning of the positive_sign for a monetary
1097                              quantity with a non-negative value. The following
1098                              integer values are recognized for both
1099                              p_sign_posn and n_sign_posn:
1100 
1101                              In an application conforming to the SUSv3
1102                              standard:
1103 
1104                              0
1105                                   Parentheses enclose the quantity and the
1106                                   currency_symbol.
1107 
1108 
1109                              1
1110                                   The sign string precedes the quantity and
1111                                   the currency_symbol.
1112 
1113 
1114                              2
1115                                   The sign string succeeds the quantity and
1116                                   the currency_symbol.
1117 
1118 
1119                              3
1120                                   The sign string precedes the
1121                                   currency_symbol.
1122 
1123 
1124                              4
1125                                   The sign string succeeds the
1126                                   currency_symbol.
1127 
1128                              In an application not conforming to the SUSv3
1129                              standard:
1130 
1131                              0
1132                                   Parentheses enclose the quantity and the
1133                                   currency_symbol or int_curr_symbol.
1134 
1135 
1136                              1
1137                                   The sign string precedes the quantity and
1138                                   the currency_symbol or int_curr_symbol.
1139 
1140 
1141                              2
1142                                   The sign string succeeds the quantity and
1143                                   the currency_symbol or int_curr_symbol.
1144 
1145 
1146                              3
1147                                   The sign string precedes the currency_symbol
1148                                   or int_curr_symbol.
1149 
1150 
1151                              4
1152                                   The sign string succeeds the currency_symbol
1153                                   or int_curr_symbol.
1154 
1155 
1156 
1157        n_sign_posn
1158                              An integer set to a value indicating the
1159                              positioning of the negative_sign for a negative
1160                              formatted monetary quantity.
1161 
1162 
1163        int_p_cs_precedes
1164                              An integer set to 1 if the int_curr_symbol
1165                              precedes the value for a monetary quantity with a
1166                              non-negative value, and set to 0 if the symbol
1167                              succeeds the value.
1168 
1169 
1170        int_n_cs_precedes
1171                              An integer set to 1 if the int_curr_symbol
1172                              precedes the value for a monetary quantity with a
1173                              negative value, and set to 0 if the symbol
1174                              succeeds the value.
1175 
1176 
1177        int_p_sep_by_space
1178                              An integer set to 0 if no space separates the
1179                              int_curr_symbol from the value for a monetary
1180                              quantity with a non-negative value, set to 1 if a
1181                              space separates the symbol from the value, and
1182                              set to 2 if a space separates the symbol and the
1183                              sign string, if adjacent.
1184 
1185 
1186        int_n_sep_by_space
1187                              An integer set to 0 if no space separates the
1188                              int_curr_symbol from the value for a monetary
1189                              quantity with a negative value, set to 1 if a
1190                              space separates the symbol from the value, and
1191                              set to 2 if a space separates the symbol and the
1192                              sign string, if adjacent.
1193 
1194 
1195        int_p_sign_posn
1196                              An integer set to a value indicating the
1197                              positioning of the positive_sign for a positive
1198                              monetary quantity formatted with the
1199                              international format. The following integer
1200                              values are recognized for int_p_sign_posn and
1201                              int_n_sign_posn:
1202 
1203                              0
1204                                   Parentheses enclose the quantity and the
1205                                   int_curr_symbol.
1206 
1207 
1208                              1
1209                                   The sign string precedes the quantity and
1210                                   the int_curr_symbol.
1211 
1212 
1213                              2
1214                                   The sign string succeeds the quantity and
1215                                   the int_curr_symbol.
1216 
1217 
1218                              3
1219                                   The sign string precedes the
1220                                   int_curr_symbol.
1221 
1222 
1223                              4
1224                                   The sign string succeeds the
1225                                   int_curr_symbol.
1226 
1227 
1228 
1229        int_n_sign_posn
1230                              An integer set to a value indicating the
1231                              positioning of the negative_sign for a negative
1232                              monetary quantity formatted with the
1233                              international format.
1234 
1235 
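
            The mon_grouping rules described earlier in this section can be
            sketched in Python. The names group_digits and iso_c_grouping are
            hypothetical, CHAR_MAX is assumed to be 127 (octal 177) as in the
            mon_grouping example, and rejecting a group size of 0 is a choice
            made by the sketch.

```python
CHAR_MAX = 127  # octal 177, the value assumed in the mon_grouping example


def group_digits(digits, mon_grouping, separator="'"):
    """Group a string of digits according to an operand such as "3;2;-1"."""
    sizes = [int(field) for field in mon_grouping.split(";")]
    if any(size == 0 or size < -1 for size in sizes):
        raise ValueError("group sizes must be positive integers or -1")
    groups = []
    rest = digits
    index = 0
    while rest:
        size = sizes[index]
        if size == -1 or size >= len(rest):
            # -1 stops further grouping; otherwise the digits have run out.
            groups.append(rest)
            break
        # Groups are taken from the right, next to the decimal delimiter.
        groups.append(rest[-size:])
        rest = rest[:-size]
        # The last size is reused for the remainder of the digits.
        if index < len(sizes) - 1:
            index += 1
    return separator.join(reversed(groups))


def iso_c_grouping(mon_grouping):
    """Byte values of the equivalent ISO C grouping string."""
    values = [int(field) for field in mon_grouping.split(";")]
    return [CHAR_MAX if value == -1 else value for value in values]
```

            For the value 123456789, the operand "3;2" gives 12'34'56'789 and
            the operand "3;-1" gives 123456'789.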
1236 
1237        The following table shows the result of various combinations:
1238 
1239 
1240 
1241 
1242                                            p_sep_by_space
1243                                            2                1          0
1244        p_cs_precedes= 1   p_sign_posn= 0   ($1.25)          ($ 1.25)   ($1.25)
1245                           p_sign_posn= 1   + $1.25          +$ 1.25    +$1.25
1246                           p_sign_posn= 2   $1.25 +          $ 1.25+    $1.25+
1247                           p_sign_posn= 3   + $1.25          +$ 1.25    +$1.25
1248                           p_sign_posn= 4   $ +1.25          $+ 1.25    $+1.25
1249        p_cs_precedes= 0   p_sign_posn= 0   (1.25 $)         (1.25 $)   (1.25$)
1250                           p_sign_posn= 1   +1.25 $          +1.25 $    +1.25$
1251                           p_sign_posn= 2   1.25$ +          1.25 $+    1.25$+
1252                           p_sign_posn= 3   1.25+ $          1.25 +$    1.25+$
1253                           p_sign_posn= 4   1.25$ +          1.25 $+    1.25$+
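
            The interaction of p_cs_precedes, p_sep_by_space, and p_sign_posn
            can be sketched in Python for the p_sep_by_space values 0 and 1,
            following the keyword definitions. The function name
            format_non_negative and its parameters are hypothetical, and the
            value 2 is deliberately not covered because its spacing does not
            reduce to a single rule.

```python
def format_non_negative(value, symbol="$", sign="+",
                        cs_precedes=1, sep_by_space=0, sign_posn=1):
    """Format a non-negative monetary quantity given as a string."""
    if sep_by_space not in (0, 1):
        raise NotImplementedError("only p_sep_by_space 0 and 1 are sketched")
    if sign_posn not in (0, 1, 2, 3, 4):
        raise ValueError("p_sign_posn must be in the range 0 to 4")
    # Positions 3 and 4 attach the sign string directly to the symbol.
    if sign_posn == 3:
        symbol_part = sign + symbol
    elif sign_posn == 4:
        symbol_part = symbol + sign
    else:
        symbol_part = symbol
    # p_sep_by_space 1 puts a space between the symbol and the value.
    gap = " " if sep_by_space == 1 else ""
    if cs_precedes:
        body = symbol_part + gap + value
    else:
        body = value + gap + symbol_part
    if sign_posn == 0:
        return "(" + body + ")"    # parentheses enclose quantity and symbol
    if sign_posn == 1:
        return sign + body         # sign precedes quantity and symbol
    if sign_posn == 2:
        return body + sign         # sign succeeds quantity and symbol
    return body
```

            For example, with p_cs_precedes set to 0, p_sep_by_space set to 1,
            and p_sign_posn set to 3, the value 1.25 is formatted as 1.25 +$.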
1254 
1255 
1256 
1257        The monetary formatting definitions for the POSIX locale follow. The
1258        code listing depicts the localedef(1) input, the table representing the
1259        same information with the addition of localeconv(3C) and
1260        nl_langinfo(3C) formats. All values are unspecified in the POSIX
1261        locale.
1262 
1263          LC_MONETARY
1264          # This is the POSIX locale definition for
1265          # the LC_MONETARY category.
1266          #
1267          int_curr_symbol       ""
1268          currency_symbol       ""
1269          mon_decimal_point     ""
1270          mon_thousands_sep     ""
1271          mon_grouping          -1
1272          positive_sign         ""
1273          negative_sign         ""
1274          int_frac_digits       -1
1275          frac_digits           -1
1276          p_cs_precedes         -1
1277          p_sep_by_space        -1
1278          n_cs_precedes         -1
1279          n_sep_by_space        -1
1280          p_sign_posn           -1
1281          n_sign_posn           -1
1282          int_p_cs_precedes     -1
1283          int_p_sep_by_space    -1
1284          int_n_cs_precedes     -1
1285          int_n_sep_by_space    -1
1286          int_p_sign_posn       -1
1287          int_n_sign_posn       -1
1288          #
1289          END LC_MONETARY
1290 
1291 
1292 
1293 
1294        In this listing, the empty string ("") and the value -1 indicate that
1295        the value is not available in the POSIX locale.
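
            How these unspecified values surface to a program can be observed
            with a short Python sketch. It assumes that Python's
            locale.localeconv() reflects what localeconv(3C) reports and that
            the "C" locale behaves like the POSIX locale shown here. The
            function name posix_monetary_items is hypothetical.

```python
import locale


def posix_monetary_items():
    """Return the localeconv() items as reported in the "C" locale."""
    saved = locale.setlocale(locale.LC_ALL)   # query the current setting
    try:
        locale.setlocale(locale.LC_ALL, "C")
        return locale.localeconv()
    finally:
        # Restore whatever locale was active before the call.
        locale.setlocale(locale.LC_ALL, saved)


if __name__ == "__main__":
    items = posix_monetary_items()
    # Unspecified string items come back as the empty string.
    print(repr(items["currency_symbol"]))
    # Unspecified integer items come back as {CHAR_MAX}.
    print(items["frac_digits"])
```

            The integer printed for frac_digits is the platform's {CHAR_MAX},
            which is commonly 127 but depends on whether char is signed.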
1296 
1297    LC_NUMERIC
1298        The LC_NUMERIC category defines the rules and symbols that will be
1299        used to format non-monetary numeric information. This information is
1300        available through the localeconv(3C) function.
1301 
1302 
1303        The following items are defined in this category of the locale. The
1304        item names are the keywords recognized by the localedef utility when
1305        defining a locale. They are also similar to the member names of the
1306        lconv structure defined in <locale.h>. The localeconv() function
1307        returns {CHAR_MAX} for unspecified integer items and the empty string
1308        ("") for unspecified or size zero string items.
1309 
1310 
       In a locale definition file the operands are strings. For some
       keywords, the strings can contain only integers. Keywords that are not
       provided, string values set to the empty string (""), and integer
       keywords set to -1 indicate that the value is not available in the
       locale. The following keywords are recognized:
1316 
1317        decimal_point
1318                         The operand is a string containing the symbol that is
1319                         used as the decimal delimiter (radix character) in
1320                         numeric, non-monetary formatted quantities. This
1321                         keyword cannot be omitted and cannot be set to the
1322                         empty string. In contexts where standards limit the
1323                         decimal_point to a single byte, the result of
1324                         specifying a multi-byte operand is unspecified.
1325 
1326 
       thousands_sep
                        The operand is a string containing the symbol that is
                        used as a separator for groups of digits to the left
                        of the decimal delimiter in numeric, non-monetary
                        formatted quantities. In contexts where standards
                        limit the thousands_sep to a single byte, the result
                        of specifying a multi-byte operand is unspecified.
1335 
1336 
       grouping
                        Define the size of each group of digits in formatted
                        non-monetary quantities. The operand is a sequence of
                        integers separated by semicolons. Each integer
                        specifies the number of digits in each group, with the
                        initial integer defining the size of the group
                        immediately preceding the decimal delimiter, and the
                        following integers defining the preceding groups. If
                        the last integer is not -1, then the size of the
                        previous group (if any) will be repeatedly used for
                        the remainder of the digits. If the last integer is
                        -1, then no further grouping will be performed.


       The non-monetary numeric formatting definitions for the POSIX locale
       follow. The code listing depicts the localedef input; the table
       presents the same information together with the localeconv() values
       and nl_langinfo() constants.

         LC_NUMERIC
         # This is the POSIX locale definition for
         # the LC_NUMERIC category.
         #
         decimal_point   "<period>"
         thousands_sep   ""
         grouping        -1
         #
         END LC_NUMERIC
1364 
1365 
1366 
1367 
1368 
1369 
1370 
1371                        POSIX locale   langinfo    localeconv()   localedef
1372        Item            Value          Constant    Value          Value
1373        --------------------------------------------------------------------
1374        decimal_point   "."            RADIXCHAR   "."            .
1375        thousands_sep   n/a            THOUSEP     ""             ""
1376        grouping        n/a            -           ""             -1
1377 
1378 
1379 
1380        The entry n/a indicates that the value is not available in the POSIX
1381        locale.
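
       The grouping rule described above can be sketched in Python as
       follows. The function name group_digits and its arguments are
       hypothetical; the sketch works on the integer part of a number only
       and takes the grouping operand as a list of integers.

```python
def group_digits(digits, grouping, sep):
    """Insert sep into a string of digits according to a grouping list."""
    groups = []
    rest = digits
    size = None
    for size in grouping:
        # -1 (or any non-positive size) means no further grouping;
        # stop as well once too few digits remain to split again.
        if size <= 0 or len(rest) <= size:
            break
        groups.append(rest[-size:])
        rest = rest[:-size]
    else:
        # The list ended without -1: keep reusing the last group size.
        while size and len(rest) > size:
            groups.append(rest[-size:])
            rest = rest[:-size]
    groups.append(rest)
    return sep.join(reversed(groups))


print(group_digits("1234567", [3], ","))      # 1,234,567
print(group_digits("1234567", [3, 2], ","))   # 12,34,567
print(group_digits("1234567", [3, -1], ","))  # 1234,567
print(group_digits("1234567", [-1], ","))     # 1234567
```

       The last call corresponds to the POSIX locale, where grouping is -1
       and no separators are inserted.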
1382 
1383    LC_TIME
1384        The  LC_TIME category defines the interpretation of the field
1385        descriptors supported by  date(1) and affects the behavior of the
1386        strftime(3C), wcsftime(3C), strptime(3C), and nl_langinfo(3C)
1387        functions.  Because the interfaces for C-language access and locale
1388        definition differ significantly, they are described separately. For
1389        locale definition, the following mandatory keywords are recognized:
1390 
1391        abday
1392                       Define the abbreviated weekday names, corresponding to
1393                       the %a field descriptor (conversion specification in the
1394                       strftime(), wcsftime(), and strptime() functions). The
1395                       operand consists of seven semicolon-separated strings,
1396                       each surrounded by double-quotes. The first string is
1397                       the abbreviated name of the day corresponding to Sunday,
1398                       the second the abbreviated name of the day corresponding
1399                       to Monday, and so on.
1400 
1401 
1402        day
1403                       Define the full weekday names, corresponding to the %A
1404                       field descriptor.  The operand consists of seven
1405                       semicolon-separated  strings, each surrounded by double-
1406                       quotes. The first string is the full name of the day
1407                       corresponding to Sunday, the second the full name of the
1408                       day corresponding to Monday, and so on.
1409 
1410 
1411        abmon
1412                       Define the abbreviated month names, corresponding to the
1413                       %b field descriptor. The operand consists of twelve
1414                       semicolon-separated strings, each surrounded by double-
1415                       quotes. The first string is the abbreviated name of the
1416                       first month of the year (January), the second the
1417                       abbreviated name of the second month, and so on.
1418 
1419 
1420        mon
1421                       Define the full month names, corresponding to the %B
1422                       field descriptor.  The operand consists of twelve
1423                       semicolon-separated strings, each surrounded by double-
1424                       quotes. The first string is the full name of the first
1425                       month of the year (January), the second the full name of
1426                       the second month, and so on.
1427 
1428 
1429        d_t_fmt
1430                       Define the appropriate date and time representation,
1431                       corresponding to the %c field descriptor. The operand
1432                       consists of a string, and can contain any combination of
1433                       characters and field descriptors. In addition, the
1434                       string can contain the escape sequences  \\, \a, \b, \f,
1435                       \n, \r, \t, \v.
1436 
1437 
1438        date_fmt
1439                       Define the appropriate date and time representation,
1440                       corresponding to the %C field descriptor. The operand
1441                       consists of a string, and can contain any combination of
1442                       characters and field descriptors. In addition, the
1443                       string can contain the escape sequences  \\, \a, \b, \f,
1444                       \n, \r, \t, \v.
1445 
1446 
1447        d_fmt
1448                       Define the appropriate date representation,
1449                       corresponding to the %x field descriptor. The operand
1450                       consists of a string, and can contain any combination of
1451                       characters and field descriptors. In addition, the
1452                       string can contain the escape sequences  \\, \a, \b, \f,
1453                       \n, \r, \t, \v.
1454 
1455 
1456        t_fmt
1457                       Define the appropriate time representation,
1458                       corresponding to the %X field descriptor. The operand
1459                       consists of a string, and can contain any combination of
1460                       characters and field descriptors. In addition, the
1461                       string can contain the escape sequences  \\, \a, \b, \f,
1462                       \n, \r, \t, \v.
1463 
1464 
1465        am_pm
1466                       Define the appropriate representation of the ante
1467                       meridiem and post meridiem strings, corresponding to the
1468                       %p field descriptor. The operand consists of two
1469                       strings, separated by a semicolon, each surrounded by
1470                       double-quotes. The first string represents the ante
1471                       meridiem designation, the last string the post meridiem
1472                       designation.
1473 
1474 
1475        t_fmt_ampm
1476                       Define the appropriate time representation in the
1477                       12-hour clock format with am_pm, corresponding to the %r
1478                       field descriptor. The operand consists of a string and
1479                       can contain any combination of characters and field
1480                       descriptors. If the string is empty, the 12-hour format
1481                       is not supported in the locale.
1482 
1483 
1484        era
1485                       Define how years are counted and displayed for each era
1486                       in a locale. The operand consists of semicolon-separated
1487                       strings. Each string is an era description segment with
1488                       the format:
1489 
1490                       direction:offset:start_date:end_date:era_name:era_format
1491 
1492                       according to the definitions below.  There can be as
1493                       many era description segments as are necessary to
1494                       describe the different eras.
1495 
                      The start of an era might not be the earliest point.
                      For example, the Christian era B.C. starts on the day
                      before January 1, A.D. 1, and increases with earlier
                      time.
1499 
1500                       direction
1501                                     Either a + or a - character. The +
1502                                     character indicates that years closer to
1503                                     the start_date have lower numbers than
1504                                     those closer to the end_date. The -
1505                                     character indicates that years closer to
1506                                     the start_date have higher numbers than
1507                                     those closer to the end_date.
1508 
1509 
1510                       offset
1511                                     The number of the year closest to the
1512                                     start_date in the era, corresponding to
1513                                     the %Eg and %Ey field descriptors.
1514 
1515 
1516                       start_date
1517                                     A date in the form yyyy/mm/dd, where yyyy,
1518                                     mm, and dd are the year, month and day
1519                                     numbers respectively of the start of the
1520                                     era. Years prior to A.D. 1 are represented
1521                                     as negative numbers.
1522 
1523 
1524                       end_date
1525                                     The ending date of the era, in the same
1526                                     format as the start_date, or one of the
1527                                     two special values -* or +*. The value -*
1528                                     indicates that the ending date is the
1529                                     beginning of time. The value +* indicates
1530                                     that the ending date is the end of time.
1531 
1532 
1533                       era_name
1534                                     A string representing the name of the era,
1535                                     corresponding to the %EC field descriptor.
1536 
1537 
1538                       era_format
1539                                     A string for formatting the year in the
1540                                     era, corresponding to the %EG and %EY
1541                                     field descriptors.
1542 
1543 
1544 
1545        era_d_fmt
1546                       Define the format of the date in alternative era
1547                       notation, corresponding to the %Ex field descriptor.
1548 
1549 
1550        era_t_fmt
1551                       Define the locale's appropriate alternative time format,
1552                       corresponding to the %EX field descriptor.
1553 
1554 
1555        era_d_t_fmt
1556                       Define the locale's appropriate alternative date and
1557                       time format, corresponding to the %Ec field descriptor.
1558 
1559 
1560        alt_digits
1561                       Define alternative symbols for digits, corresponding to
1562                       the %O field descriptor modifier. The operand consists
1563                       of semicolon-separated strings, each surrounded by
1564                       double-quotes. The first string is the alternative
1565                       symbol corresponding with zero, the second string the
1566                       symbol corresponding with one, and so on. Up to 100
1567                       alternative symbol strings can be specified. The %O
1568                       modifier indicates that the string corresponding to the
1569                       value specified via the field descriptor will be used
1570                       instead of the value.
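
       The era description segment format given under the era keyword
       above can be taken apart as in the following Python sketch. The
       names EraSegment, parse_date and parse_era, and the sample operand,
       are hypothetical and serve only as illustration; the sketch assumes
       that no field other than era_format contains a colon and that no
       field contains a semicolon.

```python
from typing import NamedTuple


class EraSegment(NamedTuple):
    direction: str     # "+" or "-"
    offset: int        # number of the year closest to start_date
    start_date: tuple  # (year, month, day); year may be negative
    end_date: object   # (year, month, day), or "-*" / "+*"
    era_name: str
    era_format: str


def parse_date(text):
    """Parse yyyy/mm/dd, passing the special values -* and +* through."""
    if text in ("-*", "+*"):
        return text
    year, month, day = text.split("/")
    return (int(year), int(month), int(day))


def parse_era(operand):
    """Split an era operand into its semicolon-separated segments."""
    segments = []
    for raw in operand.split(";"):
        direction, offset, start, end, name, fmt = raw.split(":", 5)
        if direction not in ("+", "-"):
            raise ValueError("direction must be + or -: " + raw)
        segments.append(EraSegment(direction, int(offset),
                                   parse_date(start), parse_date(end),
                                   name, fmt))
    return segments


# Hypothetical operand: A.D. runs forward to the end of time, B.C. runs
# backward from the day before January 1, A.D. 1.
eras = parse_era("+:1:0001/01/01:+*:A.D.:%EC %Ey;"
                 "+:1:-0001/12/31:-*:B.C.:%Ey %EC")
for era in eras:
    print(era)
```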
1571 
1572 
1573    LC_TIME C-language Access
1574        The following information can be accessed. These correspond to
1575        constants defined in <langinfo.h> and used as arguments to the
1576        nl_langinfo(3C) function.
1577 
1578        ABDAY_x
1579                       The abbreviated weekday names (for example Sun), where x
1580                       is a number from 1 to 7.
1581 
1582 
1583        DAY_x
1584                       The full weekday names (for example Sunday), where x is
1585                       a number from 1 to 7.
1586 
1587 
1588        ABMON_x
1589                       The abbreviated month names (for example Jan), where x
1590                       is a number from 1 to 12.
1591 
1592 
1593        MON_x
1594                       The full month names (for example January), where x is a
1595                       number from 1 to 12.
1596 
1597 
1598        D_T_FMT
1599                       The appropriate date and time representation.
1600 
1601 
1602        D_FMT
1603                       The appropriate date representation.
1604 
1605 
1606        T_FMT
1607                       The appropriate time representation.
1608 
1609 
1610        AM_STR
1611                       The appropriate ante-meridiem affix.
1612 
1613 
1614        PM_STR
1615                       The appropriate post-meridiem affix.
1616 
1617 
1618        T_FMT_AMPM
1619                       The appropriate time representation in the 12-hour clock
1620                       format with AM_STR and  PM_STR.
1621 
1622 
1623        ERA
1624                       The era description segments, which describe how years
1625                       are counted and displayed for each era in a locale. Each
1626                       era description segment has the format:
1627 
1628                         direction:offset:start_date:end_date:era_name:era_format
1629 
1630 
1631                       according to the definitions below. There will be as
1632                       many era description segments as are necessary to
1633                       describe the different eras. Era description segments
1634                       are separated by semicolons.
1635 
                      The start of an era might not be the earliest point.
                      For example, the Christian era B.C. starts on the day
                      before January 1, A.D. 1, and increases with earlier
                      time.
1639 
1640                       direction
1641                                     Either a + or a - character. The +
1642                                     character indicates that years closer to
1643                                     the start_date have lower numbers than
1644                                     those closer to the end_date.  The -
1645                                     character indicates that years closer to
1646                                     the start_date have higher numbers than
1647                                     those closer to the end_date.
1648 
1649 
1650                       offset
1651                                     The number of the year closest to the
1652                                     start_date in the era.
1653 
1654 
                      start_date
                                    A date in the form yyyy/mm/dd, where yyyy,
                                    mm, and dd are the year, month, and day
                                    numbers respectively of the start of the
                                    era. Years prior to A.D. 1 are represented
                                    as negative numbers.
1661 
1662 
1663                       end_date
1664                                     The ending date of the era, in the same
1665                                     format as the start_date, or one of the
1666                                     two special values, -* or +*. The value -*
1667                                     indicates that the ending date is the
1668                                     beginning of time. The value +* indicates
1669                                     that the ending date is the end of time.
1670 
1671 
1672                       era_name
1673                                     The era, corresponding to the %EC
1674                                     conversion specification.
1675 
1676 
                      era_format
                                    The format of the year in the era,
                                    corresponding to the %EG and %EY
                                    conversion specifications.
1681 
1682 
1683 
1684        ERA_D_FMT
1685                       The era date format.
1686 
1687 
1688        ERA_T_FMT
1689                       The locale's appropriate alternative time format,
1690                       corresponding to the %EX field descriptor.
1691 
1692 
1693        ERA_D_T_FMT
1694                       The locale's appropriate alternative date and time
1695                       format, corresponding to the %Ec field descriptor.
1696 
1697 
       ALT_DIGITS
                      The alternative symbols for digits, corresponding to the
                      %O conversion specification modifier. The value consists
                      of semicolon-separated symbols. The first is the
                      alternative symbol corresponding to zero, the second is
                      the symbol corresponding to one, and so on. Up to 100
                      alternative symbols may be specified.


       The following table displays the correspondence between the items
       described above and the conversion specifiers used by date(1) and the
       strftime(3C), wcsftime(3C), and strptime(3C) functions.
1709 
1710 
1711 
1712 
1713 
       +------------+-------------+---------------+
       | localedef  |  langinfo   |  Conversion   |
       |  Keyword   |  Constant   |   Specifier   |
       +------------+-------------+---------------+
       |   abday    |   ABDAY_x   |      %a       |
       |    day     |    DAY_x    |      %A       |
       |   abmon    |   ABMON_x   |      %b       |
       |    mon     |    MON_x    |      %B       |
       |  d_t_fmt   |   D_T_FMT   |      %c       |
       | date_fmt   |  DATE_FMT   |      %C       |
       |   d_fmt    |    D_FMT    |      %x       |
       |   t_fmt    |    T_FMT    |      %X       |
       |   am_pm    |   AM_STR    |      %p       |
       |   am_pm    |   PM_STR    |      %p       |
       |t_fmt_ampm  | T_FMT_AMPM  |      %r       |
       |    era     |     ERA     |   %EC, %Eg,   |
       |            |             | %EG, %Ey, %EY |
       | era_d_fmt  |  ERA_D_FMT  |      %Ex      |
       | era_t_fmt  |  ERA_T_FMT  |      %EX      |
       |era_d_t_fmt | ERA_D_T_FMT |      %Ec      |
       |alt_digits  | ALT_DIGITS  |      %O       |
       +------------+-------------+---------------+
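
       The correspondence between langinfo constants and conversion
       specifiers can be checked with a short Python sketch. It assumes a
       platform where Python's locale module provides nl_langinfo(), and it
       selects the "C" locale so that the output is predictable. The date
       used is arbitrary; DAY_7 and MON_5 are chosen because that date is a
       Saturday in May.

```python
import datetime
import locale

# Use the POSIX ("C") definitions for the LC_TIME category.
locale.setlocale(locale.LC_TIME, "C")

when = datetime.date(2020, 5, 16)  # a Saturday

# Each langinfo item is paired with the specifier that formats it.
pairs = [
    (locale.ABDAY_7, "%a"),
    (locale.DAY_7, "%A"),
    (locale.ABMON_5, "%b"),
    (locale.MON_5, "%B"),
]

results = [(locale.nl_langinfo(item), when.strftime(spec))
           for item, spec in pairs]
for name, formatted in results:
    print(name, formatted)
```

       Each printed pair should show the same string twice, once obtained
       from nl_langinfo() and once from strftime().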
1736 
1737    LC_TIME General Information
1738        Although certain of the field descriptors in the POSIX locale (such as
1739        the name of the month) are shown with initial capital letters, this
1740        need not be the case in other locales. Programs using these fields may
1741        need to adjust the capitalization if the output is going to be used at
1742        the beginning of a sentence.
1743 
1744 
1745        The LC_TIME descriptions of abday, day, mon, and abmon imply a
1746        Gregorian style calendar (7-day weeks, 12-month years, leap years, and
1747        so forth). Formatting time strings for other types of calendars is
1748        outside the scope of this document set.
1749 
1750 
       As specified under date(1) in Locale Definition and strftime(3C), the
       field descriptors corresponding to the optional keywords consist of a
       modifier followed by a traditional field descriptor (for instance %Ex).
       If the optional keywords are not supported by the implementation or are
       unspecified for the current locale, these field descriptors are treated
       as the traditional field descriptor. For instance, assume the following
       keywords:
1758 
         alt_digits  "0th" ; "1st" ; "2nd" ; "3rd" ; "4th" ; "5th" ; \
         "6th" ; "7th" ; "8th" ; "9th" ; "10th"
         d_fmt       "The %Od day of %B in %Y"
1762 
1763 
1764 
1765 
       On 7/4/1776, the %x field descriptor would result in "The 4th day of
       July in 1776", while 7/14/1789 would come out as "The 14 day of July
       in 1789", because no alternative symbol is defined for 14. The above
       example is for illustrative purposes only. The %O modifier is
       primarily intended to provide for Kanji or Hindi digits in date
       formats.
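
       The fallback behavior in this example can be reproduced with a small
       Python sketch. It does not call strftime(3C); the helper names
       alt_digit and format_date are hypothetical, and the formatter handles
       only the three field descriptors that appear in the sample d_fmt.

```python
ALT_DIGITS = ["0th", "1st", "2nd", "3rd", "4th", "5th",
              "6th", "7th", "8th", "9th", "10th"]
D_FMT = "The %Od day of %B in %Y"
MONTHS = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November",
          "December"]


def alt_digit(value):
    """Return the alternative symbol for value, else plain digits."""
    if 0 <= value < len(ALT_DIGITS):
        return ALT_DIGITS[value]
    # No alternative symbol defined: behave like the unmodified
    # descriptor.
    return str(value)


def format_date(year, month, day, fmt=D_FMT):
    """Expand %Od, %B and %Y in fmt (illustration only)."""
    return (fmt.replace("%Od", alt_digit(day))
               .replace("%B", MONTHS[month - 1])
               .replace("%Y", str(year)))


print(format_date(1776, 7, 4))   # The 4th day of July in 1776
print(format_date(1789, 7, 14))  # The 14 day of July in 1789
```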
1771 
1772    LC_MESSAGES
1773        The  LC_MESSAGES category defines the format and values for affirmative
1774        and negative responses.
1775 
1776 
1777        The following keywords are recognized as part of the locale definition
1778        file.  The nl_langinfo(3C) function accepts upper-case versions of the
1779        first four keywords.
1780 
1781        yesexpr
1782                   The operand consists of an extended regular expression (see
1783                   regex(5)) that describes the acceptable affirmative response
1784                   to a question expecting an affirmative or negative response.
1785 
1786 
1787        noexpr
1788                   The operand consists of an extended regular expression that
1789                   describes the acceptable negative response to a question
1790                   expecting an affirmative or negative response.
1791 
1792 
1793        yesstr
1794                   The operand consists of a fixed string (not a regular
1795                   expression) that can be used by an application for
1796                   composition of a message that lists an acceptable
1797                   affirmative response, such as in a prompt.
1798 
1799 
       nostr
                  The operand consists of a fixed string that can be used by
                  an application for composition of a message that lists an
                  acceptable negative response.


       The format and values for affirmative and negative responses of the
       POSIX locale follow. The code listing depicts the localedef input; the
       table presents the same information together with the nl_langinfo()
       constants.

         LC_MESSAGES
         # This is the POSIX locale definition for
         # the LC_MESSAGES category.
         #
         yesexpr "<circumflex><left-square-bracket><y><Y>\
                 <right-square-bracket>"
         #
         noexpr  "<circumflex><left-square-bracket><n><N>\
                 <right-square-bracket>"
         #
         yesstr      "yes"
         nostr       "no"
         END LC_MESSAGES
1822 
1823 
1824 
1825 
1826 
1827 
1828 
       +------------------+-------------------+--------------------+
       |localedef Keyword | langinfo Constant | POSIX Locale Value |
       +------------------+-------------------+--------------------+
       |yesexpr           | YESEXPR           | "^[yY]"            |
       |noexpr            | NOEXPR            | "^[nN]"            |
       |yesstr            | YESSTR            | "yes"              |
       |nostr             | NOSTR             | "no"               |
       +------------------+-------------------+--------------------+
1836 
1837 
1838        In an application conforming to the SUSv3 standard, the information on
1839        yesstr and nostr is not available.
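
       As an illustration of how an application might use yesexpr and
       noexpr, the following Python sketch classifies a response with the
       POSIX locale values shown in the table. The function name classify
       is hypothetical. Python's re module is used in place of the extended
       regular expressions of regex(5); for these simple patterns the two
       are assumed to behave the same.

```python
import re

# POSIX locale values; an application would normally obtain them with
# nl_langinfo(YESEXPR) and nl_langinfo(NOEXPR).
YESEXPR = r"^[yY]"
NOEXPR = r"^[nN]"


def classify(response):
    """Return True for affirmative, False for negative, else None."""
    if re.search(YESEXPR, response):
        return True
    if re.search(NOEXPR, response):
        return False
    return None


for answer in ("Yes", "y", "no", "N", "maybe"):
    print(answer, classify(answer))
```

       A response that matches neither expression is left to the
       application, which would typically repeat the prompt.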
1840 
1841 SEE ALSO
1842        date(1), locale(1), localedef(1), sort(1), tr(1), uniq(1),
1843        localeconv(3C), nl_langinfo(3C), setlocale(3C), strcoll(3C),
1844        strftime(3C), strptime(3C), strxfrm(3C), wcscoll(3C), wcsftime(3C),
1845        wcsxfrm(3C), wctype(3C), attributes(5), charmap(5), extensions(5),
1846        regex(5)
1847 
1848 
1849 
1850                                  May 16, 2020                        LOCALE(5)