LOCALE(5)          Standards, Environments, and Macros          LOCALE(5)

NAME
       locale - subset of a user's environment that depends on language and
       cultural conventions

DESCRIPTION
       A locale is the definition of the subset of a user's environment that
       depends on language and cultural conventions. It is made up of one or
       more categories. Each category is identified by its name and controls
       specific aspects of the behavior of components of the system. Category
       names correspond to the following environment variable names:

       LC_CTYPE
               Character classification and case conversion.

       LC_COLLATE
               Collation order.

       LC_TIME
               Date and time formats.

       LC_NUMERIC
               Numeric formatting.

       LC_MONETARY
               Monetary formatting.

       LC_MESSAGES
               Formats of informative and diagnostic messages and
               interactive responses.

       The standard utilities base their behavior on the current locale, as
       defined in the ENVIRONMENT VARIABLES section for each utility. The
       behavior of some of the C-language functions will also be modified
       based on the current locale, as defined by the last call to
       setlocale(3C).

       Locales other than those supplied by the implementation can be created
       by the application via the localedef(1) utility. The value that is used
       to specify a locale when using environment variables will be the string
       specified as the name operand to localedef when the locale was
       created. The strings "C" and "POSIX" are reserved as identifiers for
       the POSIX locale.

       Applications can select the desired locale by invoking the setlocale()
       function with the appropriate value. If the function is invoked with an
       empty string, such as:

           setlocale(LC_ALL, "");

       the value of the corresponding environment variable is used. If the
       environment variable is unset or is set to the empty string, the
       setlocale() function sets the appropriate environment.
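
       The call described above can be wrapped in a small helper. The
       following is a minimal sketch, not part of any library: the name
       select_locale() is hypothetical and chosen only for illustration.
       It relies only on the standard setlocale() function, which returns
       NULL when the requested locale cannot be selected.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>

/*
 * Hypothetical helper (illustration only): select a locale for all
 * categories. Passing "" defers to the environment variables; passing
 * "C" or "POSIX" selects the POSIX locale. Returns the locale name
 * reported by setlocale(), or NULL if the request cannot be honored.
 */
const char *select_locale(const char *name)
{
    assert(name != NULL);
    return setlocale(LC_ALL, name);
}
```

       A caller would typically invoke select_locale("") once at startup
       and fall back to select_locale("C") if the result is NULL.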

   Locale Definition
       Locales can be described with the file format accepted by the localedef
       utility.

       The locale definition file must contain one or more locale category
       source definitions, and must not contain more than one definition for
       the same locale category.

       A category source definition consists of a category header, a category
       body and a category trailer. A category header consists of the
       character string naming the category, beginning with the characters
       LC_. The category trailer consists of the string END, followed by one
       or more blank characters and the string used in the corresponding
       category header.

       The category body consists of one or more lines of text. Each line
       contains an identifier, optionally followed by one or more operands.
       Identifiers are either keywords, identifying a particular locale
       element, or collating elements. Each keyword within a locale must have
       a unique name (that is, two categories cannot have a commonly-named
       keyword). No keyword can start with the characters LC_. Identifiers
       must be separated from the operands by one or more blank characters.

       Operands must be characters, collating elements, or strings of
       characters. Strings must be enclosed in double-quotes ("). Literal
       double-quotes within strings must be preceded by the <escape
       character>, as described below. When a keyword is followed by more than
       one operand, the operands must be separated by semicolons (;). Blank
       characters are allowed both before and after a semicolon.

       The first category header in the file can be preceded by a line
       modifying the comment character. It has the following format, starting
       in column 1:

           "comment_char %c\n",<comment character>

       The comment character defaults to the number sign (#). Blank lines and
       lines containing the <comment character> in the first position are
       ignored.
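
       The comment rules above can be sketched in C. This is only an
       illustration under stated assumptions, not how localedef(1) is
       implemented: the function names is_ignored_line() and
       parse_comment_char() are hypothetical, and a "blank line" is assumed
       to mean a line holding nothing but spaces and tabs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if the line would be ignored, that is,
 * if it has the comment character in the first position or is blank
 * (assumed here to mean only spaces and tabs before the newline).
 */
int is_ignored_line(const char *line, char comment_char)
{
    assert(line != NULL);
    if (line[0] == comment_char)
        return 1;
    while (*line == ' ' || *line == '\t')
        line++;
    return *line == '\0' || *line == '\n';
}

/*
 * Hypothetical helper: if the line has the form "comment_char X",
 * starting in column 1, store X in *out and return 1; otherwise
 * leave *out unchanged and return 0.
 */
int parse_comment_char(const char *line, char *out)
{
    static const char key[] = "comment_char";
    const size_t n = sizeof key - 1;
    const char *p;

    assert(line != NULL && out != NULL);
    if (strncmp(line, key, n) != 0)
        return 0;
    p = line + n;
    if (*p != ' ' && *p != '\t')
        return 0;               /* keyword must be followed by a blank */
    while (*p == ' ' || *p == '\t')
        p++;
    if (*p == '\0' || *p == '\n')
        return 0;               /* no character operand given */
    *out = *p;
    return 1;
}
```

       A reader of a locale source file would start with the default
       comment character '#', apply parse_comment_char() to lines that
       precede the first category header, and then skip every line for
       which is_ignored_line() returns 1.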

       The first category header in the file can be preceded by a line
       modifying the escape character to be used in the file. It has the
       following format, starting in column 1:

           "escape_char %c\n",<escape character>

       The escape character defaults to backslash.

       A line can be continued by placing an escape character as the last
       character on the line; this continuation character will be discarded
       from the input. Although the implementation need not accept any one
       portion of a continued line with a length exceeding {LINE_MAX} bytes,
       it places no limits on the accumulated length of the continued line.
       Comment lines cannot be continued on a subsequent line using an escaped
       newline character.

       Individual characters, characters in strings, and collating elements
       must be represented using symbolic names, as defined below. In
       addition, characters can be represented using the characters themselves
       or as octal, hexadecimal or decimal constants. When non-symbolic
       notation is used, the resultant locale definitions will in many cases
       not be portable between systems. The left angle bracket (<) is a
       reserved symbol, denoting the start of a symbolic name; when used to
       represent itself it must be preceded by the escape character. The
       following rules apply to character representation:

           1. A character can be represented via a symbolic name, enclosed
              within angle brackets < and >. The symbolic name, including
              the angle brackets, must exactly match a symbolic name
              defined in the charmap file specified via the localedef -f
              option, and will be replaced by a character value determined
              from the value associated with the symbolic name in the
              charmap file.
              The use of a symbolic name not found in the
              charmap file constitutes an error, unless the category is
              LC_CTYPE or LC_COLLATE, in which case it constitutes a
              warning condition (see localedef(1) for a description of
              action resulting from errors and warnings). The
              specification of a symbolic name in a collating-element or
              collating-symbol section that duplicates a symbolic name in
              the charmap file (if present) is an error. Use of the
              escape character or a right angle bracket within a symbolic
              name is invalid unless the character is preceded by the
              escape character.

              Example:

                  <C>;<c-cedilla> "<M><a><y>"

           2. A character can be represented by the character itself, in
              which case the value of the character is implementation-
              dependent. Within a string, the double-quote character, the
              escape character and the right angle bracket character must
              be escaped (preceded by the escape character) to be
              interpreted as the character itself. Outside strings, the
              characters

                  , ; < > escape_char

              must be escaped to be interpreted as the character itself.

              Example:

                  c "May"

           3. A character can be represented as an octal constant. An
              octal constant is specified as the escape character followed
              by two or more octal digits. Each constant represents a byte
              value. Multi-byte values can be represented by concatenated
              constants specified in byte order with the last constant
              specifying the least significant byte of the character.

              Example:

                  \143;\347;\143\150 "\115\141\171"

           4. A character can be represented as a hexadecimal constant. A
              hexadecimal constant is specified as the escape character
              followed by an x followed by two or more hexadecimal digits.
              Each constant represents a byte value.
              Multi-byte values
              can be represented by concatenated constants specified in
              byte order with the last constant specifying the least
              significant byte of the character.

              Example:

                  \x63;\xe7;\x63\x68 "\x4d\x61\x79"

           5. A character can be represented as a decimal constant. A
              decimal constant is specified as the escape character
              followed by a d followed by two or more decimal digits. Each
              constant represents a byte value. Multi-byte values can be
              represented by concatenated constants specified in byte
              order with the last constant specifying the least
              significant byte of the character.

              Example:

                  \d99;\d231;\d99\d104 "\d77\d97\d121"

              Only characters existing in the character set for which the
              locale definition is created can be specified, whether using
              symbolic names, the characters themselves, or octal, decimal
              or hexadecimal constants. If a charmap file is present, only
              characters defined in the charmap can be specified using
              octal, decimal or hexadecimal constants. Symbolic names not
              present in the charmap file can be specified and will be
              ignored, as specified under item 1 above.

   LC_CTYPE
       The LC_CTYPE category defines character classification, case
       conversion and other character attributes. In addition, a series of
       characters can be represented by three adjacent periods representing an
       ellipsis symbol (...). The ellipsis specification is interpreted as
       meaning that all values between the values preceding and following it
       represent valid characters. The ellipsis specification is valid only
       within a single encoded character set, that is, within a group of
       characters of the same size.
       An ellipsis is interpreted as including in
       the list all characters with an encoded value higher than the encoded
       value of the character preceding the ellipsis and lower than the
       encoded value of the character following the ellipsis.

       Example:

           \x30;...;\x39;

       includes in the character class all characters with encoded values
       between the endpoints.

       The following keywords are recognized. In the descriptions, the term
       "automatically included" means that it is not an error either to
       include or omit any of the referenced characters.

       The character classes digit, xdigit, lower, upper, and space have a set
       of automatically included characters. These only need to be specified
       if the character values (that is, encoding) differ from the
       implementation default values.

       upper
               Define characters to be classified as upper-case
               letters.

               In the POSIX locale, the 26 upper-case letters are
               included:

                   A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

               In a locale definition file, no character specified
               for the keywords cntrl, digit, punct, or space can be
               specified. The upper-case letters A to Z are
               automatically included in this class.

       lower
               Define characters to be classified as lower-case
               letters. In the POSIX locale, the 26 lower-case
               letters are included:

                   a b c d e f g h i j k l m n o p q r s t u v w x y z

               In a locale definition file, no character specified
               for the keywords cntrl, digit, punct, or space can be
               specified. The lower-case letters a to z of the
               portable character set are automatically included in
               this class.

       alpha
               Define characters to be classified as letters.

               In the POSIX locale, all characters in the classes
               upper and lower are included.

               In a locale definition file, no character specified
               for the keywords cntrl, digit, punct, or space can be
               specified. Characters classified as either upper or
               lower are automatically included in this class.

       digit
               Define the characters to be classified as numeric
               digits.

               In the POSIX locale, only

                   0 1 2 3 4 5 6 7 8 9

               are included.

               In a locale definition file, only the digits 0, 1, 2,
               3, 4, 5, 6, 7, 8, and 9 can be specified, and in
               contiguous ascending sequence by numerical value. The
               digits 0 to 9 of the portable character set are
               automatically included in this class.

               The definition of character class digit requires that
               only ten characters, the ones defining digits, can be
               specified; alternative digits (for example, Hindi or
               Kanji) cannot be specified here.

       alnum
               Define characters to be classified as letters and
               numeric digits. Only the characters specified for the
               alpha and digit keywords are specified. Characters
               specified for the keywords alpha and digit are
               automatically included in this class.

       space
               Define characters to be classified as white-space
               characters.

               In the POSIX locale, at a minimum, the characters
               SPACE, FORMFEED, NEWLINE, CARRIAGE RETURN, TAB, and
               VERTICAL TAB are included.

               In a locale definition file, no character specified
               for the keywords upper, lower, alpha, digit, graph,
               or xdigit can be specified. The characters SPACE,
               FORMFEED, NEWLINE, CARRIAGE RETURN, TAB, and
               VERTICAL TAB of the portable character set, and any
               characters included in the class blank are
               automatically included in this class.

       cntrl
               Define characters to be classified as control
               characters.

               In the POSIX locale, no characters in classes alpha
               or print are included.

               In a locale definition file, no character specified
               for the keywords upper, lower, alpha, digit, punct,
               graph, print, or xdigit can be specified.

       punct
               Define characters to be classified as punctuation
               characters.

               In the POSIX locale, neither the space character nor
               any characters in classes alpha, digit, or cntrl are
               included.

               In a locale definition file, no character specified
               for the keywords upper, lower, alpha, digit, cntrl,
               xdigit or as the space character can be specified.

       graph
               Define characters to be classified as printable
               characters, not including the space character.

               In the POSIX locale, all characters in classes alpha,
               digit, and punct are included; no characters in class
               cntrl are included.

               In a locale definition file, characters specified for
               the keywords upper, lower, alpha, digit, xdigit, and
               punct are automatically included in this class. No
               character specified for the keyword cntrl can be
               specified.

       print
               Define characters to be classified as printable
               characters, including the space character.

               In the POSIX locale, all characters in class graph
               are included; no characters in class cntrl are
               included.

               In a locale definition file, characters specified for
               the keywords upper, lower, alpha, digit, xdigit,
               punct, and the space character are automatically
               included in this class. No character specified for
               the keyword cntrl can be specified.

       xdigit
               Define the characters to be classified as hexadecimal
               digits.

               In the POSIX locale, only:

                   0 1 2 3 4 5 6 7 8 9 A B C D E F a b c d e f

               are included.

               In a locale definition file, only the characters
               defined for the class digit can be specified, in
               contiguous ascending sequence by numerical value,
               followed by one or more sets of six characters
               representing the hexadecimal digits 10 to 15
               inclusive, with each set in ascending order (for
               example A, B, C, D, E, F, a, b, c, d, e, f). The
               digits 0 to 9, the upper-case letters A to F and the
               lower-case letters a to f of the portable character
               set are automatically included in this class.

               The definition of character class xdigit requires
               that the characters included in character class digit
               be included here also.

       blank
               Define characters to be classified as blank
               characters.

               In the POSIX locale, only the space and tab
               characters are included.

               In a locale definition file, the characters space and
               tab are automatically included in this class.

       charclass
               Define one or more locale-specific character class
               names as strings separated by semi-colons. Each named
               character class can then be defined subsequently in
               the LC_CTYPE definition. A character class name
               consists of at least one and at most
               {CHARCLASS_NAME_MAX} bytes of alphanumeric characters
               from the portable filename character set. The first
               character of a character class name cannot be a
               digit. The name cannot match any of the LC_CTYPE
               keywords defined in this document.

       charclass-name
               Define characters to be classified as belonging to
               the named locale-specific character class. In the
               POSIX locale, the locale-specific named character
               classes need not exist. If a class name is defined by
               a charclass keyword, but no characters are
               subsequently assigned to it, this is not an error; it
               represents a class without any characters belonging
               to it.
               The charclass-name can be used as the property
               argument to the wctype(3C) function, in regular
               expression and shell pattern-matching bracket
               expressions, and by the tr(1) command.

       toupper
               Define the mapping of lower-case letters to upper-
               case letters.

               In the POSIX locale, at a minimum, the 26 lower-case
               characters:

                   a b c d e f g h i j k l m n o p q r s t u v w x y z

               are mapped to the corresponding 26 upper-case
               characters:

                   A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

               In a locale definition file, the operand consists of
               character pairs, separated by semicolons. The
               characters in each character pair are separated by a
               comma and the pair enclosed by parentheses. The first
               character in each pair is the lower-case letter, the
               second the corresponding upper-case letter. Only
               characters specified for the keywords lower and upper
               can be specified. The lower-case letters a to z, and
               their corresponding upper-case letters A to Z, of the
               portable character set are automatically included in
               this mapping, but only when the toupper keyword is
               omitted from the locale definition.

       tolower
               Define the mapping of upper-case letters to lower-
               case letters.

               In the POSIX locale, at a minimum, the 26 upper-case
               characters:

                   A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

               are mapped to the corresponding 26 lower-case
               characters:

                   a b c d e f g h i j k l m n o p q r s t u v w x y z

               In a locale definition file, the operand consists of
               character pairs, separated by semicolons. The
               characters in each character pair are separated by a
               comma and the pair enclosed by parentheses. The first
               character in each pair is the upper-case letter, the
               second the corresponding lower-case letter.
               Only
               characters specified for the keywords lower and upper
               can be specified. If the tolower keyword is omitted
               from the locale definition, the mapping will be the
               reverse mapping of the one specified for toupper.

   LC_COLLATE
       The LC_COLLATE category provides a collation sequence definition for
       numerous utilities (such as sort(1), uniq(1), and so forth), regular
       expression matching (see regex(5)), and the strcoll(3C), strxfrm(3C),
       wcscoll(3C), and wcsxfrm(3C) functions.

       A collation sequence definition defines the relative order between
       collating elements (characters and multi-character collating elements)
       in the locale. This order is expressed in terms of collation values,
       that is, by assigning each element one or more collation values (also
       known as collation weights). The following capabilities are provided:

           1. Multi-character collating elements. Specification of multi-
              character collating elements (that is, sequences of two or
              more characters to be collated as an entity).

           2. User-defined ordering of collating elements. Each collating
              element is assigned a collation value defining its order in
              the character (or basic) collation sequence. This ordering
              is used by regular expressions and pattern matching and,
              unless collation weights are explicitly specified, also as
              the collation weight to be used in sorting.

           3. Multiple weights and equivalence classes. Collating elements
              can be assigned one or more (up to the limit
              {COLL_WEIGHTS_MAX}) collating weights for use in sorting.
              The first weight is hereafter referred to as the primary
              weight.

           4. One-to-Many mapping. A single character is mapped into a
              string of collating elements.

           5. Equivalence class definition. Two or more collating elements
              have the same collation value (primary weight).

           6. Ordering by weights.
              When two strings are compared to
              determine their relative order, the two strings are first
              broken up into a series of collating elements. The elements
              in each successive pair of elements are then compared
              according to the relative primary weights for the elements.
              If equal, and more than one weight has been assigned, the
              pairs of collating elements are recompared according to the
              relative subsequent weights, until either a pair of
              collating elements compare unequal or the weights are
              exhausted.

       The following keywords are recognized in a collation sequence
       definition. They are described in detail in the following sections.

       copy
               Specify the name of an existing locale which is
               used as the definition of this category. If this
               keyword is specified, no other keyword is
               specified.

       collating-element
               Define a collating-element symbol representing a
               multi-character collating element. This keyword is
               optional.

       collating-symbol
               Define a collating symbol for use in collation
               order statements. This keyword is optional.

       order_start
               Define collation rules. This statement is followed
               by one or more collation order statements,
               assigning character collation values and collation
               weights to collating elements.

       order_end
               Specify the end of the collation-order statements.

   collating-element keyword
       In addition to the collating elements in the character set, the
       collating-element keyword is used to define multi-character collating
       elements. The syntax is:

           "collating-element %s from \"%s\"\n",<collating-symbol>,<string>

       The <collating-symbol> operand is a symbolic name, enclosed between
       angle brackets (< and >), and must not duplicate any symbolic name in
       the current charmap file (if any), or any other symbolic name defined
       in this collation definition.
       The string operand is a string of two or
       more characters that collates as an entity. A <collating-element>
       defined via this keyword is only recognized with the LC_COLLATE
       category.

       Example:

           collating-element <ch> from "<c><h>"
           collating-element <e-acute> from "<acute><e>"
           collating-element <ll> from "ll"

   collating-symbol keyword
       This keyword will be used to define symbols for use in collation
       sequence statements; that is, between the order_start and the order_end
       keywords. The syntax is:

           "collating-symbol %s\n",<collating-symbol>

       The <collating-symbol> is a symbolic name, enclosed between angle
       brackets (< and >), and must not duplicate any symbolic name in the
       current charmap file (if any), or any other symbolic name defined in
       this collation definition.

       A collating-symbol defined via this keyword is only recognized with the
       LC_COLLATE category.

       Example:

           collating-symbol <UPPER_CASE>
           collating-symbol <HIGH>

       The collating-symbol keyword defines a symbolic name that can be
       associated with a relative position in the character order sequence.
       While such a symbolic name does not represent any collating element, it
       can be used as a weight.

   order_start keyword
       The order_start keyword must precede collation order entries and also
       defines the number of weights for this collation sequence definition
       and other collation rules.

       The syntax of the order_start keyword is:

           "order_start %s;%s;...;%s\n",<sort-rules>,<sort-rules>

       The operands to the order_start keyword are optional. If present, the
       operands define rules to be applied when strings are compared. The
       number of operands defines how many weights each element is assigned.
       If no operands are present, one forward operand is assumed.
       If present,
       the first operand defines rules to be applied when comparing strings
       using the first (primary) weight; the second when comparing strings
       using the second weight, and so on. Operands are separated by
       semicolons (;). Each operand consists of one or more collation
       directives, separated by commas (,). If the number of operands exceeds
       the {COLL_WEIGHTS_MAX} limit, the utility will issue a warning message.
       The following directives will be supported:

       forward
               Specifies that comparison operations for the weight level
               proceed from start of string towards the end of string.

       backward
               Specifies that comparison operations for the weight level
               proceed from end of string towards the beginning of string.

       position
               Specifies that comparison operations for the weight level
               will consider the relative position of elements in the
               strings not subject to IGNORE. The string containing an
               element not subject to IGNORE after the fewest collating
               elements subject to IGNORE from the start of the compare
               will collate first. If both strings contain a character not
               subject to IGNORE in the same relative position, the
               collating values assigned to the elements will determine
               the ordering. In case of equality, subsequent characters
               not subject to IGNORE are considered in the same manner.

       The directives forward and backward are mutually exclusive.

       Example:

           order_start forward;backward

       If no operands are specified, a single forward operand is assumed.

   Collation Order
       The order_start keyword is followed by collating identifier entries.
       The syntax for the collating element entries is:

           "%s %s;%s;...;%s\n",<collating-identifier>,<weight>,<weight>,...

       Each collating-identifier consists of either a character described in
       Locale Definition above, a <collating-element>, a <collating-symbol>,
       an ellipsis, or the special symbol UNDEFINED. The order in which
       collating elements are specified determines the character order
       sequence, such that each collating element compares less than the
       elements following it. The NUL character compares lower than any other
       character.

       A <collating-element> is used to specify multi-character collating
       elements, and indicates that the character sequence specified via the
       <collating-element> is to be collated as a unit and in the relative
       order specified by its place.

       A <collating-symbol> is used to define a position in the relative order
       for use in weights. No weights are specified with a <collating-symbol>.

       The ellipsis symbol specifies that a sequence of characters will
       collate according to their encoded character values. It is interpreted
       as indicating that all characters with a coded character set value
       higher than the value of the character in the preceding line, and lower
       than the coded character set value for the character in the following
       line, in the current coded character set, will be placed in the
       character collation order between the previous and the following
       character in ascending order according to their coded character set
       values. An initial ellipsis is interpreted as if the preceding line
       specified the NUL character, and a trailing ellipsis as if the
       following line specified the highest coded character set value in the
       current coded character set. An ellipsis is treated as invalid if the
       preceding or following lines do not specify characters in the current
       coded character set.
       The use of the ellipsis symbol ties the definition
       to a specific coded character set and may preclude the definition from
       being portable between implementations.

       The symbol UNDEFINED is interpreted as including all coded character
       set values not specified explicitly or via the ellipsis symbol. Such
       characters are inserted in the character collation order at the point
       indicated by the symbol, and in ascending order according to their
       coded character set values. If no UNDEFINED symbol is specified, and
       the current coded character set contains characters not specified in
       this section, the utility will issue a warning message and place such
       characters at the end of the character collation order.

       The optional operands for each collation-element are used to define the
       primary, secondary, or subsequent weights for the collating element.
       The first operand specifies the relative primary weight, the second the
       relative secondary weight, and so on. Two or more collation-elements
       can be assigned the same weight; they belong to the same equivalence
       class if they have the same primary weight. Collation behaves as if,
       for each weight level, elements subject to IGNORE are removed, unless
       the position collation directive is specified for the corresponding
       level with the order_start keyword. Then each successive pair of
       elements is compared according to the relative weights for the
       elements. If the two strings compare equal, the process is repeated for
       the next weight level, up to the limit {COLL_WEIGHTS_MAX}.

       Weights are expressed as characters described in Locale Definition
       above, <collating-symbol>s, <collating-element>s, an ellipsis, or the
       special symbol IGNORE.
       A single character, a <collating-symbol> or a
       <collating-element> represents the relative position in the character
       collating sequence of the character or symbol, rather than the
       character or characters themselves. Thus, rather than assigning
       absolute values to weights, a particular weight is expressed using the
       relative order value assigned to a collating element based on its order
       in the character collation sequence.

       One-to-many mapping is indicated by specifying two or more concatenated
       characters or symbolic names. For example, if the character <eszet> is
       given the string "<s><s>" as a weight, comparisons are performed as if
       all occurrences of the character <eszet> are replaced by <s><s>
       (assuming that <s> has the collating weight <s>). If it is necessary to
       define <eszet> and <s><s> as an equivalence class, then a collating
       element must be defined for the string ss.

       All characters specified via an ellipsis will by default be assigned
       unique weights, equal to the relative order of characters. Characters
       specified via an explicit or implicit UNDEFINED special symbol will by
       default be assigned the same primary weight (that is, belong to the
       same equivalence class). An ellipsis symbol as a weight is interpreted
       to mean that each character in the sequence has unique weights, equal
       to the relative order of their character in the character collation
       sequence. The use of the ellipsis as a weight is treated as an error if
       the collating element is neither an ellipsis nor the special symbol
       UNDEFINED.

       The special keyword IGNORE as a weight indicates that when strings are
       compared using the weights at the level where IGNORE is specified, the
       collating element is ignored; that is, as if the string did not contain
       the collating element.
       In regular expressions and pattern matching, all characters that are
       subject to IGNORE in their primary weight form an equivalence class.

       An empty operand is interpreted as the collating element itself.

       For example, the order statement:

           <a>   <a>;<a>

       is equal to:

           <a>

       An ellipsis can be used as an operand if the collating element was an
       ellipsis, and is interpreted as the value of each character defined by
       the ellipsis.

       The collation order as defined in this section defines the
       interpretation of bracket expressions in regular expressions.

       Example:

           order_start   forward;backward
           UNDEFINED     IGNORE;IGNORE
           <LOW>
           <space>       <LOW>;<space>
           ...           <LOW>;...
           <a>           <a>;<a>
           <a-acute>     <a>;<a-acute>
           <a-grave>     <a>;<a-grave>
           <A>           <a>;<A>
           <A-acute>     <a>;<A-acute>
           <A-grave>     <a>;<A-grave>
           <ch>          <ch>;<ch>
           <Ch>          <ch>;<Ch>
           <s>           <s>;<s>
           <eszet>       "<s><s>";"<eszet><eszet>"
           order_end

       This example is interpreted as follows:

       1.  The UNDEFINED means that all characters not specified in this
           definition (explicitly or via the ellipsis) are ignored for
           collation purposes; for regular expression purposes they are
           ordered first.

       2.  All characters between <space> and <a> have the same primary
           equivalence class and individual secondary weights based on their
           ordinal encoded values.

       3.  All characters based on the upper- or lower-case character a
           belong to the same primary equivalence class.

       4.  The multi-character collating element <ch> is represented by the
           collating symbol <ch> and belongs to the same primary equivalence
           class as the multi-character collating element <Ch>.

   order_end keyword
       The collating order entries must be terminated with an order_end
       keyword.
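
       The level-by-level comparison described above can be sketched in a
       few lines of Python. This is an illustration under stated
       assumptions, not the algorithm of any particular implementation:
       the names collate, level_weights and TABLE are hypothetical, the
       weights are plain integers standing in for relative order values,
       and the position directive and one-to-many weights are left out.

```python
IGNORE = None  # stands in for the IGNORE keyword in a weight position


def level_weights(elements, table, level, backward=False):
    """Weights of `elements` at one level, with IGNOREd elements removed."""
    weights = [table[e][level] for e in elements]
    weights = [w for w in weights if w is not IGNORE]
    # A "backward" directive compares from the end of the string.
    return weights[::-1] if backward else weights


def collate(a, b, table, directives):
    """Compare two sequences of collating elements; return -1, 0 or 1."""
    for level, directive in enumerate(directives):
        backward = directive == "backward"
        wa = level_weights(a, table, level, backward)
        wb = level_weights(b, table, level, backward)
        if wa != wb:
            # Lists compare element by element, like successive pairs.
            return -1 if wa < wb else 1
    return 0  # equal at every level


# Hypothetical two-level table: (primary, secondary) relative weights.
TABLE = {
    "<hyphen>": (IGNORE, IGNORE),
    "<a>": (1, 1),
    "<a-acute>": (1, 2),
    "<A>": (1, 3),
    "<b>": (2, 4),
}
```

       With this table, <a> and <A> differ only at the secondary level, and
       a string containing <hyphen> compares equal to the same string
       without it, because the element is ignored at both levels.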

   LC_MONETARY
       The LC_MONETARY category defines the rules and symbols that are used
       to format monetary numeric information. This information is available
       through the localeconv(3C) function.

       The following items are defined in this category of the locale. The
       item names are the keywords recognized by the localedef(1) utility when
       defining a locale. They are also similar to the member names of the
       lconv structure defined in <locale.h>. The localeconv function returns
       {CHAR_MAX} for unspecified integer items and the empty string ("") for
       unspecified or size zero string items.

       In a locale definition file the operands are strings. For some
       keywords, the strings can contain only integers. Keywords that are not
       provided, string values set to the empty string (""), or integer
       keywords set to -1, are used to indicate that the value is not
       available in the locale.

       int_curr_symbol
           The international currency symbol. The operand is a four-character
           string, with the first three characters containing the alphabetic
           international currency symbol in accordance with those specified
           in the ISO 4217 standard. The fourth character is the character
           used to separate the international currency symbol from the
           monetary quantity.

       currency_symbol
           The string used as the local currency symbol.

       mon_decimal_point
           The operand is a string containing the symbol that is used as the
           decimal delimiter (radix character) in monetary formatted
           quantities.

       mon_thousands_sep
           The operand is a string containing the symbol that is used as a
           separator for groups of digits to the left of the decimal
           delimiter in formatted monetary quantities.

       mon_grouping
           Define the size of each group of digits in formatted monetary
           quantities. The operand is a sequence of integers separated by
           semicolons.
           Each integer specifies the number of digits in each group, with
           the initial integer defining the size of the group immediately
           preceding the decimal delimiter, and the following integers
           defining the preceding groups. If the last integer is not -1, then
           the size of the previous group (if any) will be repeatedly used
           for the remainder of the digits. If the last integer is -1, then
           no further grouping will be performed.

           The following is an example of the interpretation of the
           mon_grouping keyword. Assuming that the value to be formatted is
           123456789 and the mon_thousands_sep is ', then the following table
           shows the result. The third column shows the equivalent string in
           the ISO C standard that would be used by the localeconv function
           to accommodate this grouping.

               mon_grouping   Formatted Value   ISO C String
               ------------------------------------------------
               3;-1           123456'789        "\3\177"
               3              123'456'789       "\3"
               3;2;-1         1234'56'789       "\3\2\177"
               3;2            12'34'56'789      "\3\2"
               -1             123456789         "\177"

           In these examples, the octal value of {CHAR_MAX} is 177.

       positive_sign
           A string used to indicate a non-negative-valued formatted monetary
           quantity.

       negative_sign
           A string used to indicate a negative-valued formatted monetary
           quantity.

       int_frac_digits
           An integer representing the number of fractional digits (those to
           the right of the decimal delimiter) to be written in a formatted
           monetary quantity using int_curr_symbol.

       frac_digits
           An integer representing the number of fractional digits (those to
           the right of the decimal delimiter) to be written in a formatted
           monetary quantity using currency_symbol.

       p_cs_precedes
           In an application conforming to the SUSv3 standard, an integer set
           to 1 if the currency_symbol precedes the value for a monetary
           quantity with a non-negative value, and set to 0 if the symbol
           succeeds the value.

           In an application not conforming to the SUSv3 standard, an integer
           set to 1 if the currency_symbol or int_curr_symbol precedes the
           value for a monetary quantity with a non-negative value, and set
           to 0 if the symbol succeeds the value.

       p_sep_by_space
           In an application conforming to the SUSv3 standard, an integer set
           to 0 if no space separates the currency_symbol from the value for
           a monetary quantity with a non-negative value, set to 1 if a space
           separates the symbol from the value, and set to 2 if a space
           separates the symbol and the sign string, if adjacent.

           In an application not conforming to the SUSv3 standard, an integer
           set to 0 if no space separates the currency_symbol or
           int_curr_symbol from the value for a monetary quantity with a
           non-negative value, set to 1 if a space separates the symbol from
           the value, and set to 2 if a space separates the symbol and the
           sign string, if adjacent.

       n_cs_precedes
           In an application conforming to the SUSv3 standard, an integer set
           to 1 if the currency_symbol precedes the value for a monetary
           quantity with a negative value, and set to 0 if the symbol
           succeeds the value.

           In an application not conforming to the SUSv3 standard, an integer
           set to 1 if the currency_symbol or int_curr_symbol precedes the
           value for a monetary quantity with a negative value, and set to 0
           if the symbol succeeds the value.

       n_sep_by_space
           In an application conforming to the SUSv3 standard, an integer set
           to 0 if no space separates the currency_symbol from the value for
           a monetary quantity with a negative value, set to 1 if a space
           separates the symbol from the value, and set to 2 if a space
           separates the symbol and the sign string, if adjacent.

           In an application not conforming to the SUSv3 standard, an integer
           set to 0 if no space separates the currency_symbol or
           int_curr_symbol from the value for a monetary quantity with a
           negative value, set to 1 if a space separates the symbol from the
           value, and set to 2 if a space separates the symbol and the sign
           string, if adjacent.

       p_sign_posn
           An integer set to a value indicating the positioning of the
           positive_sign for a monetary quantity with a non-negative value.
           The following integer values are recognized for both p_sign_posn
           and n_sign_posn:

           In an application conforming to the SUSv3 standard:

           0
               Parentheses enclose the quantity and the currency_symbol.

           1
               The sign string precedes the quantity and the currency_symbol.

           2
               The sign string succeeds the quantity and the currency_symbol.

           3
               The sign string precedes the currency_symbol.

           4
               The sign string succeeds the currency_symbol.

           In an application not conforming to the SUSv3 standard:

           0
               Parentheses enclose the quantity and the currency_symbol or
               int_curr_symbol.

           1
               The sign string precedes the quantity and the currency_symbol
               or int_curr_symbol.

           2
               The sign string succeeds the quantity and the currency_symbol
               or int_curr_symbol.

           3
               The sign string precedes the currency_symbol or
               int_curr_symbol.

           4
               The sign string succeeds the currency_symbol or
               int_curr_symbol.

       n_sign_posn
           An integer set to a value indicating the positioning of the
           negative_sign for a negative formatted monetary quantity.

       int_p_cs_precedes
           An integer set to 1 if the int_curr_symbol precedes the value for
           a monetary quantity with a non-negative value, and set to 0 if the
           symbol succeeds the value.

       int_n_cs_precedes
           An integer set to 1 if the int_curr_symbol precedes the value for
           a monetary quantity with a negative value, and set to 0 if the
           symbol succeeds the value.

       int_p_sep_by_space
           An integer set to 0 if no space separates the int_curr_symbol from
           the value for a monetary quantity with a non-negative value, set
           to 1 if a space separates the symbol from the value, and set to 2
           if a space separates the symbol and the sign string, if adjacent.

       int_n_sep_by_space
           An integer set to 0 if no space separates the int_curr_symbol from
           the value for a monetary quantity with a negative value, set to 1
           if a space separates the symbol from the value, and set to 2 if a
           space separates the symbol and the sign string, if adjacent.

       int_p_sign_posn
           An integer set to a value indicating the positioning of the
           positive_sign for a positive monetary quantity formatted with the
           international format. The following integer values are recognized
           for int_p_sign_posn and int_n_sign_posn:

           0
               Parentheses enclose the quantity and the int_curr_symbol.

           1
               The sign string precedes the quantity and the int_curr_symbol.

           2
               The sign string succeeds the quantity and the int_curr_symbol.

           3
               The sign string precedes the int_curr_symbol.

           4
               The sign string succeeds the int_curr_symbol.

       int_n_sign_posn
           An integer set to a value indicating the positioning of the
           negative_sign for a negative monetary quantity formatted with the
           international format.

       The following table shows the result of various combinations:

                                               p_sep_by_space
                                            2          1          0
       p_cs_precedes= 1  p_sign_posn= 0   ($1.25)    ($ 1.25)   ($1.25)
                         p_sign_posn= 1   + $1.25    +$ 1.25    +$1.25
                         p_sign_posn= 2   $1.25 +    $ 1.25+    $1.25+
                         p_sign_posn= 3   + $1.25    +$ 1.25    +$1.25
                         p_sign_posn= 4   $ +1.25    $+ 1.25    $+1.25
       p_cs_precedes= 0  p_sign_posn= 0   (1.25 $)   (1.25 $)   (1.25$)
                         p_sign_posn= 1   +1.25 $    +1.25 $    +1.25$
                         p_sign_posn= 2   1.25$ +    1.25 $+    1.25$+
                         p_sign_posn= 3   1.25+ $    1.25 +$    1.25+$
                         p_sign_posn= 4   1.25$ +    1.25 $+    1.25$+

       The monetary formatting definitions for the POSIX locale follow. The
       code listing depicts the localedef(1) input, the table representing the
       same information with the addition of localeconv(3C) and
       nl_langinfo(3C) formats. All values are unspecified in the POSIX
       locale.

           LC_MONETARY
           # This is the POSIX locale definition for
           # the LC_MONETARY category.
           #
           int_curr_symbol      ""
           currency_symbol      ""
           mon_decimal_point    ""
           mon_thousands_sep    ""
           mon_grouping         -1
           positive_sign        ""
           negative_sign        ""
           int_frac_digits      -1
           frac_digits          -1
           p_cs_precedes        -1
           p_sep_by_space       -1
           n_cs_precedes        -1
           n_sep_by_space       -1
           p_sign_posn          -1
           n_sign_posn          -1
           int_p_cs_precedes    -1
           int_p_sep_by_space   -1
           int_n_cs_precedes    -1
           int_n_sep_by_space   -1
           int_p_sign_posn      -1
           int_n_sign_posn      -1
           #
           END LC_MONETARY

       The entry n/a indicates that the value is not available in the POSIX
       locale.
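
       The mon_grouping rules described above (and the identical rules for
       the LC_NUMERIC grouping keyword) can be sketched as follows. This
       is a minimal illustration, not the code of any real formatter: the
       function names are hypothetical, CHAR_MAX is assumed to be octal 177
       as in the mon_grouping table, and a non-positive group size is
       treated as "stop grouping", which is an assumption of this sketch.

```python
CHAR_MAX = 0o177  # assumed value, matching the mon_grouping table


def parse_grouping(operand):
    """Turn a localedef operand such as "3;2;-1" into [3, 2, -1]."""
    return [int(field) for field in operand.split(";")]


def group_digits(digits, grouping, sep):
    """Insert `sep` into a digit string according to a grouping list."""
    groups = []
    rest = digits
    size = -1  # no grouping until the first integer is read
    index = 0
    while rest:
        if index < len(grouping):
            size = grouping[index]
            index += 1
        # Otherwise the previous size is reused for the remaining digits.
        if size <= 0:
            groups.append(rest)  # -1: no further grouping
            break
        groups.append(rest[-size:])  # rightmost group first
        rest = rest[:-size]
    return sep.join(reversed(groups))


def localeconv_grouping(grouping):
    """Byte string in the ISO C form, with -1 written as CHAR_MAX."""
    return bytes(CHAR_MAX if n == -1 else n for n in grouping)
```

       For instance, group_digits("123456789", parse_grouping("3;2"), "'")
       yields 12'34'56'789, the fourth row of the mon_grouping table.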

   LC_NUMERIC
       The LC_NUMERIC category defines the rules and symbols that will be
       used to format non-monetary numeric information. This information is
       available through the localeconv(3C) function.

       The following items are defined in this category of the locale. The
       item names are the keywords recognized by the localedef utility when
       defining a locale. They are also similar to the member names of the
       lconv structure defined in <locale.h>. The localeconv() function
       returns {CHAR_MAX} for unspecified integer items and the empty string
       ("") for unspecified or size zero string items.

       In a locale definition file the operands are strings. For some
       keywords, the strings can contain only integers. Keywords that are not
       provided, string values set to the empty string (""), or integer
       keywords set to -1, will be used to indicate that the value is not
       available in the locale. The following keywords are recognized:

       decimal_point
           The operand is a string containing the symbol that is used as the
           decimal delimiter (radix character) in numeric, non-monetary
           formatted quantities. This keyword cannot be omitted and cannot be
           set to the empty string. In contexts where standards limit the
           decimal_point to a single byte, the result of specifying a
           multi-byte operand is unspecified.

       thousands_sep
           The operand is a string containing the symbol that is used as a
           separator for groups of digits to the left of the decimal
           delimiter in numeric, non-monetary formatted quantities. In
           contexts where standards limit the thousands_sep to a single byte,
           the result of specifying a multi-byte operand is unspecified.

       grouping
           Define the size of each group of digits in formatted non-monetary
           quantities. The operand is a sequence of integers separated by
           semicolons.
           Each integer specifies the number of digits in each group, with
           the initial integer defining the size of the group immediately
           preceding the decimal delimiter, and the following integers
           defining the preceding groups. If the last integer is not -1, then
           the size of the previous group (if any) will be repeatedly used
           for the remainder of the digits. If the last integer is -1, then
           no further grouping will be performed.

       The non-monetary numeric formatting definitions for the POSIX locale
       follow. The code listing depicts the localedef input, the table
       representing the same information with the addition of localeconv
       values, and nl_langinfo constants.

           LC_NUMERIC
           # This is the POSIX locale definition for
           # the LC_NUMERIC category.
           #
           decimal_point   "<period>"
           thousands_sep   ""
           grouping        -1
           #
           END LC_NUMERIC

                           POSIX locale   langinfo    localeconv()   localedef
           Item            Value          Constant    Value          Value
           --------------------------------------------------------------------
           decimal_point   "."            RADIXCHAR   "."            .
           thousands_sep   n/a            THOUSEP     ""             ""
           grouping        n/a            -           ""             -1

       The entry n/a indicates that the value is not available in the POSIX
       locale.

   LC_TIME
       The LC_TIME category defines the interpretation of the field
       descriptors supported by date(1) and affects the behavior of the
       strftime(3C), wcsftime(3C), strptime(3C), and nl_langinfo(3C)
       functions. Because the interfaces for C-language access and locale
       definition differ significantly, they are described separately. For
       locale definition, the following mandatory keywords are recognized:

       abday
           Define the abbreviated weekday names, corresponding to the %a
           field descriptor (conversion specification in the strftime(),
           wcsftime(), and strptime() functions).
           The operand consists of seven semicolon-separated strings, each
           surrounded by double-quotes. The first string is the abbreviated
           name of the day corresponding to Sunday, the second the
           abbreviated name of the day corresponding to Monday, and so on.

       day
           Define the full weekday names, corresponding to the %A field
           descriptor. The operand consists of seven semicolon-separated
           strings, each surrounded by double-quotes. The first string is the
           full name of the day corresponding to Sunday, the second the full
           name of the day corresponding to Monday, and so on.

       abmon
           Define the abbreviated month names, corresponding to the %b field
           descriptor. The operand consists of twelve semicolon-separated
           strings, each surrounded by double-quotes. The first string is the
           abbreviated name of the first month of the year (January), the
           second the abbreviated name of the second month, and so on.

       mon
           Define the full month names, corresponding to the %B field
           descriptor. The operand consists of twelve semicolon-separated
           strings, each surrounded by double-quotes. The first string is the
           full name of the first month of the year (January), the second the
           full name of the second month, and so on.

       d_t_fmt
           Define the appropriate date and time representation,
           corresponding to the %c field descriptor. The operand consists of
           a string, and can contain any combination of characters and field
           descriptors. In addition, the string can contain the escape
           sequences \\, \a, \b, \f, \n, \r, \t, \v.

       date_fmt
           Define the appropriate date and time representation,
           corresponding to the %C field descriptor. The operand consists of
           a string, and can contain any combination of characters and field
           descriptors.
           In addition, the string can contain the escape sequences \\, \a,
           \b, \f, \n, \r, \t, \v.

       d_fmt
           Define the appropriate date representation, corresponding to the
           %x field descriptor. The operand consists of a string, and can
           contain any combination of characters and field descriptors. In
           addition, the string can contain the escape sequences \\, \a, \b,
           \f, \n, \r, \t, \v.

       t_fmt
           Define the appropriate time representation, corresponding to the
           %X field descriptor. The operand consists of a string, and can
           contain any combination of characters and field descriptors. In
           addition, the string can contain the escape sequences \\, \a, \b,
           \f, \n, \r, \t, \v.

       am_pm
           Define the appropriate representation of the ante meridiem and
           post meridiem strings, corresponding to the %p field descriptor.
           The operand consists of two strings, separated by a semicolon,
           each surrounded by double-quotes. The first string represents the
           ante meridiem designation, the last string the post meridiem
           designation.

       t_fmt_ampm
           Define the appropriate time representation in the 12-hour clock
           format with am_pm, corresponding to the %r field descriptor. The
           operand consists of a string and can contain any combination of
           characters and field descriptors. If the string is empty, the
           12-hour format is not supported in the locale.

       era
           Define how years are counted and displayed for each era in a
           locale. The operand consists of semicolon-separated strings. Each
           string is an era description segment with the format:

               direction:offset:start_date:end_date:era_name:era_format

           according to the definitions below. There can be as many era
           description segments as are necessary to describe the different
           eras.

           The start of an era might not be the earliest point in the era.
           For example, the Christian era B.C. starts on the day before
           January 1, A.D. 1, and increases with earlier time.

           direction
               Either a + or a - character. The + character indicates that
               years closer to the start_date have lower numbers than those
               closer to the end_date. The - character indicates that years
               closer to the start_date have higher numbers than those
               closer to the end_date.

           offset
               The number of the year closest to the start_date in the era,
               corresponding to the %Eg and %Ey field descriptors.

           start_date
               A date in the form yyyy/mm/dd, where yyyy, mm, and dd are the
               year, month and day numbers respectively of the start of the
               era. Years prior to A.D. 1 are represented as negative
               numbers.

           end_date
               The ending date of the era, in the same format as the
               start_date, or one of the two special values -* or +*. The
               value -* indicates that the ending date is the beginning of
               time. The value +* indicates that the ending date is the end
               of time.

           era_name
               A string representing the name of the era, corresponding to
               the %EC field descriptor.

           era_format
               A string for formatting the year in the era, corresponding to
               the %EG and %EY field descriptors.

       era_d_fmt
           Define the format of the date in alternative era notation,
           corresponding to the %Ex field descriptor.

       era_t_fmt
           Define the locale's appropriate alternative time format,
           corresponding to the %EX field descriptor.

       era_d_t_fmt
           Define the locale's appropriate alternative date and time format,
           corresponding to the %Ec field descriptor.

       alt_digits
           Define alternative symbols for digits, corresponding to the %O
           field descriptor modifier.
           The operand consists of semicolon-separated strings, each
           surrounded by double-quotes. The first string is the alternative
           symbol corresponding with zero, the second string the symbol
           corresponding with one, and so on. Up to 100 alternative symbol
           strings can be specified. The %O modifier indicates that the
           string corresponding to the value specified via the field
           descriptor will be used instead of the value.

   LC_TIME C-language Access
       The following information can be accessed. These correspond to
       constants defined in <langinfo.h> and used as arguments to the
       nl_langinfo(3C) function.

       ABDAY_x
           The abbreviated weekday names (for example Sun), where x is a
           number from 1 to 7.

       DAY_x
           The full weekday names (for example Sunday), where x is a number
           from 1 to 7.

       ABMON_x
           The abbreviated month names (for example Jan), where x is a number
           from 1 to 12.

       MON_x
           The full month names (for example January), where x is a number
           from 1 to 12.

       D_T_FMT
           The appropriate date and time representation.

       D_FMT
           The appropriate date representation.

       T_FMT
           The appropriate time representation.

       AM_STR
           The appropriate ante-meridiem affix.

       PM_STR
           The appropriate post-meridiem affix.

       T_FMT_AMPM
           The appropriate time representation in the 12-hour clock format
           with AM_STR and PM_STR.

       ERA
           The era description segments, which describe how years are
           counted and displayed for each era in a locale. Each era
           description segment has the format:

               direction:offset:start_date:end_date:era_name:era_format

           according to the definitions below. There will be as many era
           description segments as are necessary to describe the different
           eras.
           Era description segments are separated by semicolons.

           The start of an era might not be the earliest point in the era.
           For example, the Christian era B.C. starts on the day before
           January 1, A.D. 1, and increases with earlier time.

           direction
               Either a + or a - character. The + character indicates that
               years closer to the start_date have lower numbers than those
               closer to the end_date. The - character indicates that years
               closer to the start_date have higher numbers than those
               closer to the end_date.

           offset
               The number of the year closest to the start_date in the era.

           start_date
               A date in the form yyyy/mm/dd, where yyyy, mm, and dd are the
               year, month and day numbers respectively of the start of the
               era. Years prior to A.D. 1 are represented as negative
               numbers.

           end_date
               The ending date of the era, in the same format as the
               start_date, or one of the two special values, -* or +*. The
               value -* indicates that the ending date is the beginning of
               time. The value +* indicates that the ending date is the end
               of time.

           era_name
               The era, corresponding to the %EC conversion specification.

           era_format
               The format of the year in the era, corresponding to the %EG
               and %EY conversion specifications.

       ERA_D_FMT
           The era date format.

       ERA_T_FMT
           The locale's appropriate alternative time format, corresponding
           to the %EX field descriptor.

       ERA_D_T_FMT
           The locale's appropriate alternative date and time format,
           corresponding to the %Ec field descriptor.

       ALT_DIGITS
           The alternative symbols for digits, corresponding to the %O
           conversion specification modifier. The value consists of
           semicolon-separated symbols.
           The first is the alternative symbol corresponding to zero, the
           second is the symbol corresponding to one, and so on. Up to 100
           alternative symbols may be specified.

       The following table displays the correspondence between the items
       described above and the conversion specifiers used by date(1) and the
       strftime(3C), wcsftime(3C), and strptime(3C) functions.

       +-------------+-------------+---------------+
       | localedef   | langinfo    | Conversion    |
       | Keyword     | Constant    | Specifier     |
       +-------------+-------------+---------------+
       | abday       | ABDAY_x     | %a            |
       | day         | DAY_x       | %A            |
       | abmon       | ABMON_x     | %b            |
       | mon         | MON_x       | %B            |
       | d_t_fmt     | D_T_FMT     | %c            |
       | date_fmt    | DATE_FMT    | %C            |
       | d_fmt       | D_FMT       | %x            |
       | t_fmt       | T_FMT       | %X            |
       | am_pm       | AM_STR      | %p            |
       | am_pm       | PM_STR      | %p            |
       | t_fmt_ampm  | T_FMT_AMPM  | %r            |
       | era         | ERA         | %EC, %Eg,     |
       |             |             | %EG, %Ey, %EY |
       | era_d_fmt   | ERA_D_FMT   | %Ex           |
       | era_t_fmt   | ERA_T_FMT   | %EX           |
       | era_d_t_fmt | ERA_D_T_FMT | %Ec           |
       | alt_digits  | ALT_DIGITS  | %O            |
       +-------------+-------------+---------------+

   LC_TIME General Information
       Although certain of the field descriptors in the POSIX locale (such as
       the name of the month) are shown with initial capital letters, this
       need not be the case in other locales. Programs using these fields may
       need to adjust the capitalization if the output is going to be used at
       the beginning of a sentence.

       The LC_TIME descriptions of abday, day, mon, and abmon imply a
       Gregorian style calendar (7-day weeks, 12-month years, leap years, and
       so forth). Formatting time strings for other types of calendars is
       outside the scope of this document set.
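
       The era description segment format given under era and ERA can be
       split into its six fields as in the following sketch. It is only an
       illustration: EraSegment, parse_era and SAMPLE_ERA are hypothetical
       names, and the sample value is made up from the B.C./A.D.
       description above rather than taken from a shipped locale. The
       sketch assumes that era_name contains no colon.

```python
from dataclasses import dataclass
from typing import Tuple, Union

Date = Tuple[int, int, int]  # (year, month, day); year may be negative


@dataclass
class EraSegment:
    direction: str               # "+" or "-"
    offset: int                  # number of the year closest to start_date
    start_date: Date
    end_date: Union[Date, str]   # a date, or the marker "-*" or "+*"
    era_name: str
    era_format: str


def parse_date(text):
    """Parse yyyy/mm/dd, where yyyy may carry a leading minus sign."""
    year, month, day = text.split("/")
    return int(year), int(month), int(day)


def parse_era_segment(segment):
    # Split at most five times so era_format may itself contain colons.
    direction, offset, start, end, name, fmt = segment.split(":", 5)
    if direction not in ("+", "-"):
        raise ValueError("direction must be + or -: %r" % direction)
    end_date = end if end in ("-*", "+*") else parse_date(end)
    return EraSegment(direction, int(offset), parse_date(start),
                      end_date, name, fmt)


def parse_era(value):
    """Parse a semicolon-separated list of era description segments."""
    return [parse_era_segment(s) for s in value.split(";")]


# Hypothetical value in the semicolon-separated ERA form.
SAMPLE_ERA = "+:1:0001/01/01:+*:A.D.:%EC %Ey;+:1:-0001/12/31:-*:B.C.:%Ey %EC"
```

       In the sample, the B.C. segment starts on the day before January 1,
       A.D. 1 and runs back to the beginning of time (-*), so its
       direction is + and the year nearest the start date is numbered 1.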

       As specified under date in Locale Definition and strftime(3C), the
       field descriptors corresponding to the optional keywords consist of a
       modifier followed by a traditional field descriptor (for instance %Ex).
       If the optional keywords are not supported by the implementation or are
       unspecified for the current locale, these field descriptors are treated
       as the traditional field descriptor. For instance, assume the following
       keywords:

           alt_digits  "0th" ; "1st" ; "2nd" ; "3rd" ; "4th" ; "5th" ; \
                       "6th" ; "7th" ; "8th" ; "9th" ; "10th"
           d_fmt       "The %Od day of %B in %Y"

       On 7/4/1776, the %x field descriptor would result in "The 4th day of
       July in 1776" while 7/14/1789 would come out as "The 14 day of July in
       1789". The above example is for illustrative purposes only. The %O
       modifier is primarily intended to provide for Kanji or Hindi digits in
       date formats.

   LC_MESSAGES
       The LC_MESSAGES category defines the format and values for affirmative
       and negative responses.

       The following keywords are recognized as part of the locale definition
       file. The nl_langinfo(3C) function accepts upper-case versions of the
       first four keywords.

       yesexpr
           The operand consists of an extended regular expression (see
           regex(5)) that describes the acceptable affirmative response to a
           question expecting an affirmative or negative response.

       noexpr
           The operand consists of an extended regular expression that
           describes the acceptable negative response to a question
           expecting an affirmative or negative response.

       yesstr
           The operand consists of a fixed string (not a regular expression)
           that can be used by an application for composition of a message
           that lists an acceptable affirmative response, such as in a
           prompt.

       nostr
           The operand consists of a fixed string that can be used by an
           application for composition of a message that lists an acceptable
           negative response.

       The format and values for affirmative and negative responses of the
       POSIX locale follow; the code listing depicting the localedef input,
       the table representing the same information with the addition of
       nl_langinfo() constants.

           LC_MESSAGES
           # This is the POSIX locale definition for
           # the LC_MESSAGES category.
           #
           yesexpr "<circumflex><left-square-bracket><y><Y>\
                   <right-square-bracket>"
           #
           noexpr  "<circumflex><left-square-bracket><n><N>\
                   <right-square-bracket>"
           #
           yesstr  "yes"
           nostr   "no"
           END LC_MESSAGES

       +-------------------+-------------------+--------------------+
       | localedef Keyword | langinfo Constant | POSIX Locale Value |
       +-------------------+-------------------+--------------------+
       | yesexpr           | YESEXPR           | "^[yY]"            |
       | noexpr            | NOEXPR            | "^[nN]"            |
       | yesstr            | YESSTR            | "yes"              |
       | nostr             | NOSTR             | "no"               |
       +-------------------+-------------------+--------------------+

       In an application conforming to the SUSv3 standard, the information on
       yesstr and nostr is not available.

SEE ALSO
       date(1), locale(1), localedef(1), sort(1), tr(1), uniq(1),
       localeconv(3C), nl_langinfo(3C), setlocale(3C), strcoll(3C),
       strftime(3C), strptime(3C), strxfrm(3C), wcscoll(3C), wcsftime(3C),
       wcsxfrm(3C), wctype(3C), attributes(5), charmap(5), extensions(5),
       regex(5)

                                 May 16, 2020                        LOCALE(5)