12743 man page spelling mistakes
--- old/usr/src/man/man5/locale.5.man.txt
+++ new/usr/src/man/man5/locale.5.man.txt
1 1 LOCALE(5) Standards, Environments, and Macros LOCALE(5)
2 2
3 3
4 4
5 5 NAME
6 6 locale - subset of a user's environment that depends on language and
7 7 cultural conventions
8 8
9 9 DESCRIPTION
10 10 A locale is the definition of the subset of a user's environment that
11 11 depends on language and cultural conventions. It is made up from one or
12 12 more categories. Each category is identified by its name and controls
13 13 specific aspects of the behavior of components of the system. Category
14 14 names correspond to the following environment variable names:
15 15
16 16 LC_CTYPE
17 17 Character classification and case conversion.
18 18
19 19
20 20 LC_COLLATE
21 21 Collation order.
22 22
23 23
24 24 LC_TIME
25 25 Date and time formats.
26 26
27 27
28 28 LC_NUMERIC
29 29 Numeric formatting.
30 30
31 31
32 32 LC_MONETARY
33 33 Monetary formatting.
34 34
35 35
36 36 LC_MESSAGES
37 37 Formats of informative and diagnostic messages and
38 38 interactive responses.
39 39
40 40
41 41
42 42 The standard utilities base their behavior on the current locale, as
43 43 defined in the ENVIRONMENT VARIABLES section for each utility. The
44 44 behavior of some of the C-language functions will also be modified
45 45 based on the current locale, as defined by the last call to
46 46 setlocale(3C).
47 47
48 48
49 49 Locales other than those supplied by the implementation can be created
50 50 by the application via the localedef(1) utility. The value that is used
51 51 to specify a locale when using environment variables will be the string
52 52 specified as the name operand to localedef when the locale was
53 53 created. The strings "C" and "POSIX" are reserved as identifiers for
54 54 the POSIX locale.
55 55
56 56
57 57 Applications can select the desired locale by invoking the setlocale()
58 58 function with the appropriate value. If the function is invoked with an
59 59 empty string, such as:
60 60
61 61 setlocale(LC_ALL, "");
62 62
63 63
64 64
65 65 the value of the corresponding environment variable is used. If the
66 66 environment variable is unset or is set to the empty string, the
67 67 setlocale() function sets the appropriate environment.
68 68
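The selection step described above can be sketched in C. The helper
select_locale() is hypothetical (it is not part of any library); it only
wraps setlocale(3C). Passing "" defers to the environment, while a name
such as "C" selects that locale explicitly.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>

/*
 * Hypothetical helper: select the locale for all categories.
 * An empty string asks setlocale() to consult the environment.
 * NULL is rejected here because it would only query the current
 * locale instead of changing it.
 */
const char *
select_locale(const char *name)
{
    assert(name != NULL);
    /* Returns the resulting locale name, or NULL on failure. */
    return setlocale(LC_ALL, name);
}
```

A caller would typically invoke select_locale("") once at program start
and check the result for NULL before relying on locale-dependent
behavior.
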
69 69 Locale Definition
70 70 Locales can be described with the file format accepted by the localedef
71 71 utility.
72 72
73 73
74 74 The locale definition file must contain one or more locale category
75 75 source definitions, and must not contain more than one definition for
76 76 the same locale category.
77 77
78 78
79 79 A category source definition consists of a category header, a category
80 80 body and a category trailer. A category header consists of the
81 81 character string naming of the category, beginning with the characters
82 82 LC_. The category trailer consists of the string END, followed by one
83 83 or more blank characters and the string used in the corresponding
84 84 category header.
85 85
86 86
87 87 The category body consists of one or more lines of text. Each line
88 88 contains an identifier, optionally followed by one or more operands.
89 89 Identifiers are either keywords, identifying a particular locale
90 90 element, or collating elements. Each keyword within a locale must have
91 91 a unique name (that is, two categories cannot have a commonly-named
92 92 keyword). No keyword can start with the characters LC_. Identifiers
93 93 must be separated from the operands by one or more blank characters.
94 94
95 95
96 96 Operands must be characters, collating elements, or strings of
97 97 characters. Strings must be enclosed in double-quotes ("). Literal
98 98 double-quotes within strings must be preceded by the <escape
99 99 character>, as described below. When a keyword is followed by more than
100 100 one operand, the operands must be separated by semicolons (;). Blank
101 101 characters are allowed both before and after a semicolon.
102 102
103 103
104 104 The first category header in the file can be preceded by a line
105 105 modifying the comment character. It has the following format, starting
106 106 in column 1:
107 107
108 108 "comment_char %c\n",<comment character>
109 109
110 110
111 111
112 112 The comment character defaults to the number sign (#). Blank lines and
113 113 lines containing the <comment character> in the first position are
114 114 ignored.
115 115
116 116
117 117 The first category header in the file can be preceded by a line
118 118 modifying the escape character to be used in the file. It has the
119 119 following format, starting in column 1:
120 120
121 121 "escape_char %c\n",<escape character>
122 122
123 123
124 124
125 125
126 126 The escape character defaults to backslash.
127 127
128 128
129 129 A line can be continued by placing an escape character as the last
130 130 character on the line; this continuation character will be discarded
131 131 from the input. Although the implementation need not accept any one
132 132 portion of a continued line with a length exceeding {LINE_MAX} bytes,
133 133 it places no limits on the accumulated length of the continued line.
134 134 Comment lines cannot be continued on a subsequent line using an escaped
135 135 newline character.
136 136
137 137
138 138 Individual characters, characters in strings, and collating elements
139 139 must be represented using symbolic names, as defined below. In
140 140 addition, characters can be represented using the characters themselves
141 141 or as octal, hexadecimal or decimal constants. When non-symbolic
142 142 notation is used, the resultant locale definitions will in many cases
143 143 not be portable between systems. The left angle bracket (<) is a
144 144 reserved symbol, denoting the start of a symbolic name; when used to
145 145 represent itself it must be preceded by the escape character. The
146 146 following rules apply to character representation:
147 147
148 148 1. A character can be represented via a symbolic name, enclosed
149 149 within angle brackets < and >. The symbolic name, including
150 150 the angle brackets, must exactly match a symbolic name
151 151 defined in the charmap file specified via the localedef -f
152 152 option, and will be replaced by a character value determined
153 153 from the value associated with the symbolic name in the
154 154 charmap file. The use of a symbolic name not found in the
155 155 charmap file constitutes an error, unless the category is
156 156 LC_CTYPE or LC_COLLATE, in which case it constitutes a
157 157 warning condition (see localedef(1) for a description of
158 158 action resulting from errors and warnings). The
159 159 specification of a symbolic name in a collating-element or
160 160 collating-symbol section that duplicates a symbolic name in
161 161 the charmap file (if present) is an error. Use of the
162 162 escape character or a right angle bracket within a symbolic
163 163 name is invalid unless the character is preceded by the
164 164 escape character.
165 165
166 166 Example:
167 167
168 168 <C>;<c-cedilla> "<M><a><y>"
169 169
170 170
171 171
172 172 2. A character can be represented by the character itself, in
173 173 which case the value of the character is implementation-
174 174 dependent. Within a string, the double-quote character, the
175 175 escape character and the right angle bracket character must
176 176 be escaped (preceded by the escape character) to be
177 177 interpreted as the character itself. Outside strings, the
178 178 characters
179 179
180 180 , ; < > escape_char
181 181
182 182
183 183 must be escaped to be interpreted as the character itself.
184 184
185 185 Example:
186 186
187 187 c "May"
188 188
189 189
190 190
191 191 3. A character can be represented as an octal constant. An
192 192 octal constant is specified as the escape character followed
193 193 by two or more octal digits. Each constant represents a byte
194 194 value. Multi-byte values can be represented by concatenated
195 195 constants specified in byte order with the last constant
196 196 specifying the least significant byte of the character.
197 197
198 198 Example:
199 199
200 200 \143;\347;\143\150 "\115\141\171"
201 201
202 202
203 203
204 204 4. A character can be represented as a hexadecimal constant. A
205 205 hexadecimal constant is specified as the escape character
206 206 followed by an x followed by two or more hexadecimal digits.
207 207 Each constant represents a byte value. Multi-byte values
208 208 can be represented by concatenated constants specified in
209 209 byte order with the last constant specifying the least
210 210 significant byte of the character.
211 211
212 212 Example:
213 213
214 214 \x63;\xe7;\x63\x68 "\x4d\x61\x79"
215 215
216 216
217 217
218 218 5. A character can be represented as a decimal constant. A
219 219 decimal constant is specified as the escape character
220 220 followed by a d followed by two or more decimal digits. Each
221 221 constant represents a byte value. Multi-byte values can be
222 222 represented by concatenated constants specified in byte
223 223 order with the last constant specifying the least
224 224 significant byte of the character.
225 225
226 226 Example:
227 227
228 228 \d99;\d231;\d99\d104 "\d77\d97\d121"
229 229
230 230
231 231 Only characters existing in the character set for which the
232 232 locale definition is created can be specified, whether using
233 233 symbolic names, the characters themselves, or octal, decimal
234 234 or hexadecimal constants. If a charmap file is present, only
235 235 characters defined in the charmap can be specified using
236 236 octal, decimal or hexadecimal constants. Symbolic names not
237 237 present in the charmap file can be specified and will be
238 238 ignored, as specified under item 1 above.
239 239
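The octal, hexadecimal and decimal forms in items 3 to 5 can be
illustrated with a small decoder. This is only a sketch: the names
digit_value() and decode_constant() are hypothetical, the escape
character is assumed to be the default backslash, and an ASCII-compatible
encoding is assumed for the digit and letter ranges.

```c
#include <assert.h>
#include <stddef.h>

/* Value of digit c in the given base, or -1 if c is not such a digit. */
static int
digit_value(char c, int base)
{
    int v;

    if (c >= '0' && c <= '9')
        v = c - '0';
    else if (c >= 'a' && c <= 'f')
        v = c - 'a' + 10;
    else if (c >= 'A' && c <= 'F')
        v = c - 'A' + 10;
    else
        return -1;
    return (v < base) ? v : -1;
}

/*
 * Hypothetical helper: decode one byte constant of the form \NNN
 * (octal), \xHH (hexadecimal) or \dNNN (decimal). Returns the byte
 * value, or -1 if the text is not such a constant, has fewer than
 * two digits, or does not fit in one byte. On success, *end points
 * just past the constant.
 */
int
decode_constant(const char *s, const char **end)
{
    int base = 8, ndigits = 0, value = 0, d;

    assert(s != NULL && end != NULL);
    if (*s++ != '\\')
        return -1;
    if (*s == 'x') {
        base = 16;
        s++;
    } else if (*s == 'd') {
        base = 10;
        s++;
    }
    while ((d = digit_value(*s, base)) >= 0) {
        value = value * base + d;
        if (value > 255)
            return -1;
        ndigits++;
        s++;
    }
    if (ndigits < 2)
        return -1;
    *end = s;
    return value;
}
```

Multi-byte values such as \143\150 would be decoded by calling the helper
repeatedly, starting each call at the position returned in *end.
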
240 240 LC_CTYPE
241 241 The LC_CTYPE category defines character classification, case
242 242 conversion and other character attributes. In addition, a series of
243 243 characters can be represented by three adjacent periods representing an
244 244 ellipsis symbol (...). The ellipsis specification is interpreted as
245 245 meaning that all values between the values preceding and following it
246 246 represent valid characters. The ellipsis specification is valid only
247 247 within a single encoded character set, that is, within a group of
248 248 characters of the same size. An ellipsis is interpreted as including in
249 249 the list all characters with an encoded value higher than the encoded
250 250 value of the character preceding the ellipsis and lower than the
251 251 encoded value of the character following the ellipsis.
252 252
253 253
254 254 Example:
255 255
256 256 \x30;...;\x39;
257 257
258 258
259 259
260 260
261 261 includes in the character class all characters with encoded values
262 262 between the endpoints.
263 263
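As a rough illustration of the ellipsis rule for single-byte characters,
the hypothetical helper expand_ellipsis() below lists the values named by
a specification of the form lo;...;hi. The ellipsis itself contributes
only the values strictly between the endpoints; the endpoints are
included because they are listed explicitly on either side of it.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: enumerate the byte values covered by
 * "lo;...;hi". Stores at most cap values in out and returns the
 * total number of values in the range.
 */
size_t
expand_ellipsis(unsigned char lo, unsigned char hi,
    unsigned char *out, size_t cap)
{
    size_t n = 0;
    unsigned int v;

    assert(out != NULL && lo <= hi);
    for (v = lo; v <= hi; v++) {
        if (n < cap)
            out[n] = (unsigned char)v;
        n++;
    }
    return n;
}
```

For the example above, expand_ellipsis(0x30, 0x39, buf, sizeof (buf))
yields the ten values 0x30 through 0x39.
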
264 264
265 265 The following keywords are recognized. In the descriptions, the term
266 266 ``automatically included'' means that it is not an error either to
267 267 include or omit any of the referenced characters.
268 268
269 269
270 270 The character classes digit, xdigit, lower, upper, and space have a set
271 271 of automatically included characters. These only need to be specified
272 272 if the character values (that is, encoding) differ from the
273 273 implementation default values.
274 274
275 275 upper
276 276 Define characters to be classified as upper-case
277 277 letters.
278 278
279 279 In the POSIX locale, the 26 upper-case letters are
280 280 included:
281 281
282 282 A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
283 283
284 284
285 285 In a locale definition file, no character specified
286 286 for the keywords cntrl, digit, punct, or space can be
287 287 specified. The upper-case letters A to Z are
288 288 automatically included in this class.
289 289
290 290
291 291 lower
292 292 Define characters to be classified as lower-case
293 293 letters. In the POSIX locale, the 26 lower-case
294 294 letters are included:
295 295
296 296 a b c d e f g h i j k l m n o p q r s t u v w x y z
297 297
298 298
299 299 In a locale definition file, no character specified
300 300 for the keywords cntrl, digit, punct, or space can be
301 301 specified. The lower-case letters a to z of the
302 302 portable character set are automatically included in
303 303 this class.
304 304
305 305
306 306 alpha
307 307 Define characters to be classified as letters.
308 308
309 309 In the POSIX locale, all characters in the classes
310 310 upper and lower are included.
311 311
312 312 In a locale definition file, no character specified
313 313 for the keywords cntrl, digit, punct, or space can be
314 314 specified. Characters classified as either upper or
315 315 lower are automatically included in this class.
316 316
317 317
318 318 digit
319 319 Define the characters to be classified as numeric
320 320 digits.
321 321
322 322 In the POSIX locale, only
323 323
324 324 0 1 2 3 4 5 6 7 8 9
325 325
326 326
327 327 are included.
328 328
329 329 In a locale definition file, only the digits 0, 1, 2,
330 330 3, 4, 5, 6, 7, 8, and 9 can be specified, and in
331 331 contiguous ascending sequence by numerical value. The
332 332 digits 0 to 9 of the portable character set are
333 333 automatically included in this class.
334 334
335 335 The definition of character class digit requires that
336 336 only ten characters, the ones defining digits, can be
337 337 specified; alternative digits (for example, Hindi or
338 338 Kanji) cannot be specified here.
339 339
340 340
341 341 alnum
342 342 Define characters to be classified as letters and
343 343 numeric digits. Only the characters specified for the
344 344 alpha and digit keywords are specified. Characters
345 345 specified for the keywords alpha and digit are
346 346 automatically included in this class.
347 347
348 348
349 349 space
350 350 Define characters to be classified as white-space
351 351 characters.
352 352
353 353 In the POSIX locale, at a minimum, the characters
354 354 SPACE, FORMFEED, NEWLINE, CARRIAGE RETURN, TAB, and
355 355 VERTICAL TAB are included.
356 356
357 357 In a locale definition file, no character specified
358 358 for the keywords upper, lower, alpha, digit, graph,
359 359 or xdigit can be specified. The characters SPACE,
360 360 FORMFEED, NEWLINE, CARRIAGE RETURN, TAB, and
361 361 VERTICAL TAB of the portable character set, and any
362 362 characters included in the class blank are
363 363 automatically included in this class.
364 364
365 365
366 366 cntrl
367 367 Define characters to be classified as control
368 368 characters.
369 369
370 370 In the POSIX locale, no characters in classes alpha
371 371 or print are included.
372 372
373 373 In a locale definition file, no character specified
374 374 for the keywords upper, lower, alpha, digit, punct,
375 375 graph, print, or xdigit can be specified.
376 376
377 377
378 378 punct
379 379 Define characters to be classified as punctuation
380 380 characters.
381 381
382 382 In the POSIX locale, neither the space character nor
383 383 any characters in classes alpha, digit, or cntrl are
384 384 included.
385 385
386 386 In a locale definition file, no character specified
387 387 for the keywords upper, lower, alpha, digit, cntrl,
388 388 xdigit or as the space character can be specified.
389 389
390 390
391 391 graph
392 392 Define characters to be classified as printable
393 393 characters, not including the space character.
394 394
395 395 In the POSIX locale, all characters in classes alpha,
396 396 digit, and punct are included; no characters in class
397 397 cntrl are included.
398 398
399 399 In a locale definition file, characters specified for
400 400 the keywords upper, lower, alpha, digit, xdigit, and
401 401 punct are automatically included in this class. No
402 402 character specified for the keyword cntrl can be
403 403 specified.
404 404
405 405
406 406 print
407 407 Define characters to be classified as printable
408 408 characters, including the space character.
409 409
410 410 In the POSIX locale, all characters in class graph
411 411 are included; no characters in class cntrl are
412 412 included.
413 413
414 414 In a locale definition file, characters specified for
415 415 the keywords upper, lower, alpha, digit, xdigit,
416 416 punct, and the space character are automatically
417 417 included in this class. No character specified for
418 418 the keyword cntrl can be specified.
419 419
420 420
421 421 xdigit
422 422 Define the characters to be classified as hexadecimal
423 423 digits.
424 424
425 425 In the POSIX locale, only:
426 426
427 427 0 1 2 3 4 5 6 7 8 9 A B C D E F a b c d e f
428 428
429 429
430 430 are included.
431 431
432 432 In a locale definition file, only the characters
433 433 defined for the class digit can be specified, in
434 434 contiguous ascending sequence by numerical value,
435 435 followed by one or more sets of six characters
436 436 representing the hexadecimal digits 10 to 15
437 437 inclusive, with each set in ascending order (for
438 438 example A, B, C, D, E, F, a, b, c, d, e, f). The
439 439 digits 0 to 9, the upper-case letters A to F and the
440 440 lower-case letters a to f of the portable character
441 441 set are automatically included in this class.
442 442
443 443 The definition of character class xdigit requires
444 444 that the characters included in character class digit
445 445 be included here also.
446 446
447 447
448 448 blank
449 449 Define characters to be classified as blank
450 450 characters.
451 451
452 452 In the POSIX locale, only the space and tab
453 453 characters are included.
454 454
455 455 In a locale definition file, the characters space and
456 456 tab are automatically included in this class.
457 457
458 458
459 459 charclass
460 460 Define one or more locale-specific character class
461 461 names as strings separated by semi-colons. Each named
462 462 character class can then be defined subsequently in
463 463 the LC_CTYPE definition. A character class name
464 464 consists of at least one and at most
465 465 {CHARCLASS_NAME_MAX} bytes of alphanumeric characters
466 466 from the portable filename character set. The first
467 467 character of a character class name cannot be a
468 468 digit. The name cannot match any of the LC_CTYPE
469 469 keywords defined in this document.
470 470
471 471
472 472 charclass-name
473 473 Define characters to be classified as belonging to
474 474 the named locale-specific character class. In the
475 475 POSIX locale, the locale-specific named character
476 476 classes need not exist. If a class name is defined by
477 477 a charclass keyword, but no characters are
478 478 subsequently assigned to it, this is not an error; it
479 479 represents a class without any characters belonging
480 480 to it. The charclass-name can be used as the property
481 481 argument to the wctype(3C) function, in regular
482 482 expression and shell pattern-matching bracket
483 483 expressions, and by the tr(1) command.
484 484
485 485
486 486 toupper
487 487 Define the mapping of lower-case letters to upper-
488 488 case letters.
489 489
490 490 In the POSIX locale, at a minimum, the 26 lower-case
491 491 characters:
492 492
493 493 a b c d e f g h i j k l m n o p q r s t u v w x y z
494 494
495 495
496 496 are mapped to the corresponding 26 upper-case
497 497 characters:
498 498
499 499 A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
500 500
501 501
502 502 In a locale definition file, the operand consists of
503 503 character pairs, separated by semicolons. The
504 504 characters in each character pair are separated by a
505 505 comma and the pair enclosed by parentheses. The first
506 506 character in each pair is the lower-case letter, the
507 507 second the corresponding upper-case letter. Only
508 508 characters specified for the keywords lower and upper
509 509 can be specified. The lower-case letters a to z, and
510 510 their corresponding upper-case letters A to Z, of the
511 511 portable character set are automatically included in
512 512 this mapping, but only when the toupper keyword is
513 513 omitted from the locale definition.
514 514
515 515
516 516 tolower
517 517 Define the mapping of upper-case letters to lower-
518 518 case letters.
519 519
520 520 In the POSIX locale, at a minimum, the 26 upper-case
521 521 characters:
522 522
523 523 A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
524 524
525 525
526 526 are mapped to the corresponding 26 lower-case
527 527 characters:
528 528
529 529 a b c d e f g h i j k l m n o p q r s t u v w x y z
530 530
531 531
532 532 In a locale definition file, the operand consists of
533 533 character pairs, separated by semicolons. The
534 534 characters in each character pair are separated by a
535 535 comma and the pair enclosed by parentheses. The first
536 536 character in each pair is the upper-case letter, the
537 537 second the corresponding lower-case letter. Only
538 538 characters specified for the keywords lower and upper
539 539 can be specified. If the tolower keyword is omitted
540 540 from the locale definition, the mapping will be the
541 541 reverse mapping of the one specified for toupper.
542 542
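The default for an omitted tolower keyword (the reverse of the toupper
mapping) can be sketched for single-byte characters as follows. The
function reverse_mapping() and the 256-entry table layout are
assumptions made for illustration, not how localedef stores its data.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: derive a tolower table from a toupper table.
 * Both tables are indexed by byte value; an entry equal to its own
 * index means "no mapping".
 */
void
reverse_mapping(const unsigned char toupper_map[256],
    unsigned char tolower_map[256])
{
    int c;

    assert(toupper_map != NULL && tolower_map != NULL);
    /* Start from the identity mapping. */
    for (c = 0; c < 256; c++)
        tolower_map[c] = (unsigned char)c;
    /* Each (lower,upper) pair yields the reverse (upper,lower) pair. */
    for (c = 0; c < 256; c++) {
        if (toupper_map[c] != c)
            tolower_map[toupper_map[c]] = (unsigned char)c;
    }
}
```
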
543 543
544 544 LC_COLLATE
545 545 The LC_COLLATE category provides a collation sequence definition for
546 546 numerous utilities (such as sort(1), uniq(1), and so forth), regular
547 547 expression matching (see regex(5)), and the strcoll(3C), strxfrm(3C),
548 548 wcscoll(3C), and wcsxfrm(3C) functions.
549 549
550 550
551 551 A collation sequence definition defines the relative order between
552 552 collating elements (characters and multi-character collating elements)
553 553 in the locale. This order is expressed in terms of collation values,
554 554 that is, by assigning each element one or more collation values (also
555 555 known as collation weights). The following capabilities are provided:
556 556
557 557 1. Multi-character collating elements. Specification of multi-
558 558 character collating elements (that is, sequences of two or
559 559 more characters to be collated as an entity).
560 560
561 561 2. User-defined ordering of collating elements. Each collating
562 562 element is assigned a collation value defining its order in
563 563 the character (or basic) collation sequence. This ordering
564 564 is used by regular expressions and pattern matching and,
565 - unless collation weights are explicity specified, also as
565 + unless collation weights are explicitly specified, also as
566 566 the collation weight to be used in sorting.
567 567
568 568 3. Multiple weights and equivalence classes. Collating elements
569 569 can be assigned one or more (up to the limit
570 570 {COLL_WEIGHTS_MAX} ) collating weights for use in sorting.
571 571 The first weight is hereafter referred to as the primary
572 572 weight.
573 573
574 574 4. One-to-Many mapping. A single character is mapped into a
575 575 string of collating elements.
576 576
577 577 5. Equivalence class definition. Two or more collating elements
578 578 have the same collation value (primary weight).
579 579
580 580 6. Ordering by weights. When two strings are compared to
581 581 determine their relative order, the two strings are first
582 582 broken up into a series of collating elements. The elements
583 583 in each successive pair of elements are then compared
584 584 according to the relative primary weights for the elements.
585 585 If equal, and more than one weight has been assigned, the
586 586 pairs of collating elements are recompared according to the
587 587 relative subsequent weights, until either a pair of
588 588 collating elements compare unequal or the weights are
589 589 exhausted.
590 590
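The ordering by weights in item 6 can be sketched with a two-level
comparison. Everything here is illustrative: the weight table
weight_of() is hypothetical (level 0 ignores case, level 1 ranks
lower-case before upper-case), every character is treated as a
single-byte collating element, and NLEVELS stands in for a value bounded
by {COLL_WEIGHTS_MAX}.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

#define NLEVELS 2   /* illustrative number of weight levels */

/* Hypothetical weight table for one single-byte collating element. */
static int
weight_of(unsigned char c, int level)
{
    if (level == 0)
        return tolower(c);          /* primary: case-insensitive */
    return isupper(c) ? 1 : 0;      /* secondary: lower-case first */
}

/*
 * Compare two strings level by level. Later levels are consulted
 * only when every earlier level compares equal.
 */
int
compare_by_weights(const char *a, const char *b)
{
    int level;
    size_t i;

    assert(a != NULL && b != NULL);
    for (level = 0; level < NLEVELS; level++) {
        for (i = 0; a[i] != '\0' && b[i] != '\0'; i++) {
            int wa = weight_of((unsigned char)a[i], level);
            int wb = weight_of((unsigned char)b[i], level);

            if (wa != wb)
                return (wa < wb) ? -1 : 1;
        }
        /* A string that is a proper prefix of the other sorts first. */
        if (a[i] != b[i])
            return (a[i] == '\0') ? -1 : 1;
    }
    return 0;
}
```

With this table, "abc" and "Abc" tie on primary weights and are
separated only by the secondary weight, while "abc" and "ABD" are
already decided at the primary level.
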
591 591
592 592 The following keywords are recognized in a collation sequence
593 593 definition. They are described in detail in the following sections.
594 594
595 595 copy
596 596 Specify the name of an existing locale which is
597 597 used as the definition of this category. If this
598 598 keyword is specified, no other keyword is
599 599 specified.
600 600
601 601
602 602 collating-element
603 603 Define a collating-element symbol representing a
604 604 multi-character collating element. This keyword is
605 605 optional.
606 606
607 607
608 608 collating-symbol
609 609 Define a collating symbol for use in collation
610 610 order statements. This keyword is optional.
611 611
612 612
613 613 order_start
614 614 Define collation rules. This statement is followed
615 615 by one or more collation order statements,
616 616 assigning character collation values and collation
617 617 weights to collating elements.
618 618
619 619
620 620 order_end
621 621 Specify the end of the collation-order statements.
622 622
623 623
624 624 collating-element keyword
625 625 In addition to the collating elements in the character set, the
626 626 collating-element keyword is used to define multi-character collating
627 627 elements. The syntax is:
628 628
629 629 "collating-element %s from \"%s\"\n",<collating-symbol>,<string>
630 630
631 631
632 632
633 633 The <collating-symbol> operand is a symbolic name, enclosed between
634 634 angle brackets (< and >), and must not duplicate any symbolic name in
635 635 the current charmap file (if any), or any other symbolic name defined
636 636 in this collation definition. The string operand is a string of two or
637 637 more characters that collates as an entity. A <collating-element>
638 638 defined via this keyword is only recognized with the LC_COLLATE
639 639 category.
640 640
641 641
642 642 Example:
643 643 collating-element <ch> from "<c><h>"
644 644 collating-element <e-acute> from "<acute><e>"
645 645 collating-element <ll> from "ll"
646 646
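To show what "collates as an entity" means in practice, the sketch below
splits a string into collating elements, treating "ch" and "ll" as
single units. The table multi_elements and both function names are
hypothetical, and characters are assumed to be single bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table of multi-character collating elements. */
static const char *const multi_elements[] = { "ch", "ll" };

/*
 * Length in bytes of the collating element at the start of s:
 * the length of a matching multi-character element if there is
 * one, otherwise 1. Returns 0 at the end of the string.
 */
size_t
element_length(const char *s)
{
    size_t i, len;

    assert(s != NULL);
    if (*s == '\0')
        return 0;
    for (i = 0; i < sizeof (multi_elements) / sizeof (multi_elements[0]);
        i++) {
        len = strlen(multi_elements[i]);
        if (strncmp(s, multi_elements[i], len) == 0)
            return len;
    }
    return 1;
}

/* Count the collating elements in s. */
size_t
count_elements(const char *s)
{
    size_t n = 0, len;

    assert(s != NULL);
    while ((len = element_length(s)) != 0) {
        s += len;
        n++;
    }
    return n;
}
```

For example, "chall" splits into the three elements "ch", "a" and "ll".
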
647 647 collating-symbol keyword
648 648 This keyword will be used to define symbols for use in collation
649 649 sequence statements; that is, between the order_start and the order_end
650 650 keywords. The syntax is:
651 651
652 652 "collating-symbol %s\n",<collating-symbol>
653 653
654 654
655 655
656 656 The <collating-symbol> is a symbolic name, enclosed between angle
657 657 brackets (< and >), and must not duplicate any symbolic name in the
658 658 current charmap file (if any), or any other symbolic name defined in
659 659 this collation definition.
660 660
661 661
662 662 A collating-symbol defined via this keyword is only recognized with the
663 663 LC_COLLATE category.
664 664
665 665
666 666 Example:
667 667 collating-symbol <UPPER_CASE>
668 668 collating-symbol <HIGH>
669 669
670 670
671 671 The collating-symbol keyword defines a symbolic name that can be
672 672 associated with a relative position in the character order sequence.
673 673 While such a symbolic name does not represent any collating element, it
674 674 can be used as a weight.
675 675
676 676 order_start keyword
677 677 The order_start keyword must precede collation order entries and also
678 678 defines the number of weights for this collation sequence definition
679 679 and other collation rules.
680 680
681 681
682 682 The syntax of the order_start keyword is:
683 683
684 684 "order_start %s;%s;...;%s\n",<sort-rules>,<sort-rules>
685 685
686 686
687 687
688 688 The operands to the order_start keyword are optional. If present, the
689 689 operands define rules to be applied when strings are compared. The
690 690 number of operands defines how many weights each element is assigned. If
691 691 no operands are present, one forward operand is assumed. If present,
692 692 the first operand defines rules to be applied when comparing strings
693 693 using the first (primary) weight; the second when comparing strings
694 694 using the second weight, and so on. Operands are separated by
695 695 semicolons (;). Each operand consists of one or more collation
696 696 directives, separated by commas (,). If the number of operands exceeds
697 697 the {COLL_WEIGHTS_MAX} limit, the utility will issue a warning message.
698 698 The following directives will be supported:
699 699
700 700 forward
701 701 Specifies that comparison operations for the weight level
702 702 proceed from start of string towards the end of string.
703 703
704 704
705 705 backward
706 706 Specifies that comparison operations for the weight level
707 707 proceed from end of string towards the beginning of string.
708 708
709 709
710 710 position
711 711 Specifies that comparison operations for the weight level
712 712 will consider the relative position of elements in the
713 713 strings not subject to IGNORE. The string containing an
714 714 element not subject to IGNORE after the fewest collating
715 715 elements subject to IGNORE from the start of the compare
716 716 will collate first. If both strings contain a character not
717 717 subject to IGNORE in the same relative position, the
718 718 collating values assigned to the elements will determine
719 719 the ordering. In case of equality, subsequent characters
720 720 not subject to IGNORE are considered in the same manner.
721 721
722 722
723 723
724 724 The directives forward and backward are mutually exclusive.
725 725
726 726
727 727 Example:
728 728
729 729 order_start forward;backward
730 730
731 731
732 732
733 733
734 734 If no operands are specified, a single forward operand is assumed.
735 735
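The difference between the forward and backward directives can be
sketched for a single weight level. The function compare_level() is
hypothetical; it takes the weights already assigned to each string's
collating elements at that level and scans them from the start or from
the end. The position directive and IGNORE are not modeled.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: compare two weight sequences for one level.
 * If backward is nonzero the scan runs from the end of each
 * sequence towards its beginning, otherwise from the start.
 * If the scanned parts tie, the shorter sequence sorts first.
 */
int
compare_level(const int *wa, size_t na, const int *wb, size_t nb,
    int backward)
{
    size_t i, n = (na < nb) ? na : nb;

    assert(wa != NULL && wb != NULL);
    for (i = 0; i < n; i++) {
        int x = backward ? wa[na - 1 - i] : wa[i];
        int y = backward ? wb[nb - 1 - i] : wb[i];

        if (x != y)
            return (x < y) ? -1 : 1;
    }
    if (na != nb)
        return (na < nb) ? -1 : 1;
    return 0;
}
```

With order_start forward;backward, a comparison would use backward == 0
for the primary weights and backward != 0 for the secondary weights.
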
736 736 Collation Order
737 737 The order_start keyword is followed by collating identifier entries.
738 738 The syntax for the collating element entries is:
739 739
740 740 "%s %s;%s;...;%s\n",<collating-identifier>,<weight>,<weight>,...
741 741
742 742
743 743
744 744 Each collating-identifier consists of either a character described in
745 745 Locale Definition above, a <collating-element>, a <collating-symbol>,
746 746 an ellipsis, or the special symbol UNDEFINED. The order in which
747 747 collating elements are specified determines the character order
748 748 sequence, such that each collating element compares less than the
749 749 elements following it. The NUL character compares lower than any other
750 750 character.
751 751
752 752
753 753 A <collating-element> is used to specify multi-character collating
754 754 elements, and indicates that the character sequence specified via the
755 755 <collating-element> is to be collated as a unit and in the relative
756 756 order specified by its place.
757 757
758 758
759 759 A <collating-symbol> is used to define a position in the relative order
760 760 for use in weights. No weights are specified with a <collating-symbol>.
761 761
762 762
763 763 The ellipsis symbol specifies that a sequence of characters will
764 764 collate according to their encoded character values. It is interpreted
765 765 as indicating that all characters with a coded character set value
766 766 higher than the value of the character in the preceding line, and lower
767 767 than the coded character set value for the character in the following
768 768 line, in the current coded character set, will be placed in the
769 769 character collation order between the previous and the following
770 770 character in ascending order according to their coded character set
771 771 values. An initial ellipsis is interpreted as if the preceding line
772 772 specified the NUL character, and a trailing ellipsis as if the
773 773 following line specified the highest coded character set value in the
774 774 current coded character set. An ellipsis is treated as invalid if the
775 775 preceding or following lines do not specify characters in the current
776 776 coded character set. The use of the ellipsis symbol ties the definition
777 777 to a specific coded character set and may preclude the definition from
778 778 being portable between implementations.
779 779
780 780
781 781 The symbol UNDEFINED is interpreted as including all coded character
782 782 set values not specified explicitly or via the ellipsis symbol. Such
783 783 characters are inserted in the character collation order at the point
784 784 indicated by the symbol, and in ascending order according to their
785 785 coded character set values. If no UNDEFINED symbol is specified, and
786 786 the current coded character set contains characters not specified in
787 787 this section, the utility will issue a warning message and place such
788 788 characters at the end of the character collation order.
789 789
790 790
791 791 The optional operands for each collation-element are used to define the
792 792 primary, secondary, or subsequent weights for the collating element.
793 793 The first operand specifies the relative primary weight, the second the
794 794 relative secondary weight, and so on. Two or more collation-elements
795 795 can be assigned the same weight; they belong to the same equivalence
796 796 class if they have the same primary weight. Collation behaves as if,
797 797 for each weight level, elements subject to IGNORE are removed, unless
798 798 the position collation directive is specified for the corresponding
799 799 level with the order_start keyword. Then each successive pair of
800 800 elements is compared according to the relative weights for the
801 801 elements. If the two strings compare equal, the process is repeated for
802 802 the next weight level, up to the limit {COLL_WEIGHTS_MAX}.
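
A minimal sketch of the level-by-level comparison described above follows. The weight table, the IGNORE stand-in, and the function names are assumptions made for illustration; the sketch compares every level in the forward direction and does not model the position or backward directives.

```python
IGNORE = None  # stand-in for the IGNORE keyword


def level_weights(s, table, level):
    """Weights of s at one level, with elements subject to IGNORE removed."""
    return [table[ch][level] for ch in s if table[ch][level] is not IGNORE]


def collate_compare(a, b, table, levels):
    """Return -1, 0 or 1, moving to the next level only on a tie."""
    for level in range(levels):
        wa = level_weights(a, table, level)
        wb = level_weights(b, table, level)
        if wa != wb:
            return -1 if wa < wb else 1
    return 0


# Hypothetical two-level table: (primary weight, secondary weight).
TABLE = {
    "a": (1, 1),
    "A": (1, 2),
    "b": (2, 3),
    "-": (IGNORE, 4),
}

print(collate_compare("a", "A", TABLE, 2))    # tie at primary, decided at secondary
print(collate_compare("a-b", "ab", TABLE, 2)) # "-" is ignored at the primary level
```

The levels argument plays the role of the {COLL_WEIGHTS_MAX} limit: comparison stops once that many levels have been tried.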
803 803
804 804
805 805 Weights are expressed as characters described in Locale Definition
806 806 above, <collating-symbol>s, <collating-element>s, an ellipsis, or the
807 807 special symbol IGNORE. A single character, a <collating-symbol> or a
808 808 <collating-element> represents the relative position in the character
809 809 collating sequence of the character or symbol, rather than the
810 810 character or characters themselves. Thus, rather than assigning
811 811 absolute values to weights, a particular weight is expressed using the
812 812 relative order value assigned to a collating element based on its order
813 813 in the character collation sequence.
814 814
815 815
816 816 One-to-many mapping is indicated by specifying two or more concatenated
817 817 characters or symbolic names. For example, if the character <eszet> is
818 818 given the string "<s><s>" as a weight, comparisons are performed as if
819 819 all occurrences of the character <eszet> are replaced by <s><s>
820 820 (assuming that <s> has the collating weight <s>). If it is necessary to
821 821 define <eszet> and <s><s> as an equivalence class, then a collating
822 822 element must be defined for the string ss.
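
The one-to-many case can be pictured as a key function in which a single character contributes more than one weight. The table below is hypothetical and covers the primary level only.

```python
# Hypothetical primary weights; "\u00df" (<eszet>) is weighted as <s><s>.
PRIMARY = {
    "a": [1],
    "s": [2],
    "t": [3],
    "\u00df": [2, 2],
}


def primary_key(s):
    """Concatenate the primary weights of every character in s."""
    key = []
    for ch in s:
        key.extend(PRIMARY[ch])
    return key


words = ["ast", "a\u00df", "as"]
print(sorted(words, key=primary_key))
```

With this table the strings "a\u00df" and "ass" produce the same primary key, so they compare as if <eszet> had been replaced by <s><s>.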
823 823
824 824
825 825 All characters specified via an ellipsis will by default be assigned
826 826 unique weights, equal to the relative order of characters. Characters
827 827 specified via an explicit or implicit UNDEFINED special symbol will by
828 828 default be assigned the same primary weight (that is, belong to the
829 829 same equivalence class). An ellipsis symbol as a weight is interpreted
830 830 to mean that each character in the sequence has unique weights, equal
831 831 to the relative order of their character in the character collation
832 832 sequence. The use of the ellipsis as a weight is treated as an error if
833 833 the collating element is neither an ellipsis nor the special symbol
834 834 UNDEFINED.
835 835
836 836
837 837 The special keyword IGNORE as a weight indicates that when strings are
838 838 compared using the weights at the level where IGNORE is specified, the
839 839 collating element is ignored; that is, as if the string did not contain
840 840 the collating element. In regular expressions and pattern matching, all
841 841 characters that are subject to IGNORE in their primary weight form an
842 842 equivalence class.
843 843
844 844
845 845 An empty operand is interpreted as the collating element itself.
846 846
847 847
848 848 For example, the order statement:
849 849
850 850 <a> <a>;<a>
851 851
852 852
853 853
854 854
855 855 is equal to:
856 856
857 857 <a>
858 858
859 859
860 860
861 861
862 862 An ellipsis can be used as an operand if the collating element was an
863 863 ellipsis, and is interpreted as the value of each character defined by
864 864 the ellipsis.
865 865
866 866
867 867 The collation order as defined in this section defines the
868 868 interpretation of bracket expressions in regular expressions.
869 869
870 870
871 871 Example:
872 872
873 873
874 874
875 875
876 876 order_start forward;backward
877 877 UNDEFINED IGNORE;IGNORE
878 878 <LOW>
879 879 <space> <LOW>;<space>
880 880 ... <LOW>;...
881 881 <a> <a>;<a>
882 882 <a-acute> <a>;<a-acute>
883 883 <a-grave> <a>;<a-grave>
884 884 <A> <a>;<A>
885 885 <A-acute> <a>;<A-acute>
886 886 <A-grave> <a>;<A-grave>
887 887 <ch> <ch>;<ch>
888 888 <Ch> <ch>;<Ch>
889 889 <s> <s>;<s>
890 890 <eszet> "<s><s>";"<eszet><eszet>"
891 891 order_end
892 892
893 893
894 894
895 895 This example is interpreted as follows:
896 896
897 897 1. The UNDEFINED means that all characters not specified in
898 898 this definition (explicitly or via the ellipsis) are ignored
899 899 for collation purposes; for regular expression purposes they
900 900 are ordered first.
901 901
902 902 2. All characters between <space> and <a> have the same primary
903 903 equivalence class and individual secondary weights based on
904 904 their ordinal encoded values.
905 905
906 906 3. All characters based on the upper- or lower-case character a
907 907 belong to the same primary equivalence class.
908 908
909 909 4. The multi-character collating element <ch> is represented by
910 910 the collating symbol <ch> and belongs to the same primary
911 911 equivalence class as the multi-character collating element
912 912 <Ch>.
913 913
914 914 order_end keyword
915 915 The collating order entries must be terminated with an order_end
916 916 keyword.
917 917
918 918 LC_MONETARY
919 919 The LC_MONETARY category defines the rules and symbols that are used
920 920 to format monetary numeric information. This information is available
921 921 through the localeconv(3C) function.
922 922
923 923
924 924 The following items are defined in this category of the locale. The
925 925 item names are the keywords recognized by the localedef(1) utility when
926 926 defining a locale. They are also similar to the member names of the
927 927 lconv structure defined in <locale.h>. The localeconv function returns
928 928 {CHAR_MAX} for unspecified integer items and the empty string ("") for
929 929 unspecified or size zero string items.
930 930
931 931
932 932 In a locale definition file the operands are strings. For some
933 933 keywords, the strings can contain only integers. Keywords that are not
934 934 provided, string values set to the empty string (""), or integer
935 935 keywords set to -1, are used to indicate that the value is not
936 936 available in the locale.
937 937
938 938 int_curr_symbol
939 939 The international currency symbol. The operand is
940 940 a four-character string, with the first three
941 941 characters containing the alphabetic
942 942 international currency symbol in accordance with
943 943 those specified in the ISO 4217 standard. The
944 944 fourth character is the character used to
945 945 separate the international currency symbol from
946 946 the monetary quantity.
947 947
948 948
949 949 currency_symbol
950 950 The string used as the local currency symbol.
951 951
952 952
953 953 mon_decimal_point
954 954 The operand is a string containing the symbol
955 955 that is used as the decimal delimiter (radix
956 956 character) in monetary formatted quantities.
957 957
958 958
959 959 mon_thousands_sep
960 960 The operand is a string containing the symbol
961 961 that is used as a separator for groups of digits
962 962 to the left of the decimal delimiter in formatted
963 963 monetary quantities.
964 964
965 965
966 966 mon_grouping
967 967 Define the size of each group of digits in
968 968 formatted monetary quantities. The operand is a
969 969 sequence of integers separated by semicolons.
970 970 Each integer specifies the number of digits in
971 971 each group, with the initial integer defining the
972 972 size of the group immediately preceding the
973 973 decimal delimiter, and the following integers
974 974 defining the preceding groups. If the last
975 975 integer is not -1, then the size of the previous
976 976 group (if any) will be repeatedly used for the
977 977 remainder of the digits. If the last integer is
978 978 -1, then no further grouping will be performed.
979 979
980 980 The following is an example of the interpretation
981 981 of the mon_grouping keyword. Assuming that the
982 982 value to be formatted is 123456789 and the
983 983 mon_thousands_sep is ', then the following table
984 984 shows the result. The third column shows the
985 985 equivalent string in the ISO C standard that
986 986 would be used by the localeconv function to
987 987 accommodate this grouping.
988 988
989 989 mon_grouping    Formatted Value    ISO C String
990 990
991 991 3;-1            123456'789         "\3\177"
992 992 3               123'456'789        "\3"
993 993 3;2;-1          1234'56'789        "\3\2\177"
994 994 3;2             12'34'56'789       "\3\2"
995 995 -1              123456789          "\177"
996 996
997 997
998 998 In these examples, the octal value of {CHAR_MAX}
999 999 is 177.
1000 1000
1001 1001
1002 1002 positive_sign
1003 1003 A string used to indicate a non-negative-valued
1004 1004 formatted monetary quantity.
1005 1005
1006 1006
1007 1007 negative_sign
1008 1008 A string used to indicate a negative-valued
1009 1009 formatted monetary quantity.
1010 1010
1011 1011
1012 1012 int_frac_digits
1013 1013 An integer representing the number of fractional
1014 1014 digits (those to the right of the decimal
1015 1015 delimiter) to be written in a formatted monetary
1016 1016 quantity using int_curr_symbol.
1017 1017
1018 1018
1019 1019 frac_digits
1020 1020 An integer representing the number of fractional
1021 1021 digits (those to the right of the decimal
1022 1022 delimiter) to be written in a formatted monetary
1023 1023 quantity using currency_symbol.
1024 1024
1025 1025
1026 1026 p_cs_precedes
1027 1027 In an application conforming to the SUSv3
1028 1028 standard, an integer set to 1 if the
1029 1029 currency_symbol precedes the value for a monetary
1030 1030 quantity with a non-negative value, and set to 0
1031 1031 if the symbol succeeds the value.
1032 1032
1033 1033 In an application not conforming to the SUSv3
1034 1034 standard, an integer set to 1 if the
1035 1035 currency_symbol or int_curr_symbol precedes
1036 1036 the value for a monetary quantity with a non-
1037 1037 negative value, and set to 0 if the symbol
1038 1038 succeeds the value.
1039 1039
1040 1040
1041 1041 p_sep_by_space
1042 1042 In an application conforming to the SUSv3
1043 1043 standard, an integer set to 0 if no space
1044 1044 separates the currency_symbol from the value for
1045 1045 a monetary quantity with a non-negative value,
1046 1046 set to 1 if a space separates the symbol from the
1047 1047 value, and set to 2 if a space separates the
1048 1048 symbol and the sign string, if adjacent.
1049 1049
1050 1050 In an application not conforming to the SUSv3
1051 1051 standard, an integer set to 0 if no space
1052 1052 separates the currency_symbol or int_curr_symbol
1053 1053 from the value for a monetary quantity with a
1054 1054 non-negative value, set to 1 if a space separates
1055 1055 the symbol from the value, and set to 2 if a
1056 1056 space separates the symbol and the sign string,
1057 1057 if adjacent.
1058 1058
1059 1059
1060 1060 n_cs_precedes
1061 1061 In an application conforming to the SUSv3
1062 1062 standard, an integer set to 1 if the
1063 1063 currency_symbol precedes the value for a monetary
1064 1064 quantity with a negative value, and set to 0 if
1065 1065 the symbol succeeds the value.
1066 1066
1067 1067 In an application not conforming to the SUSv3
1068 1068 standard, an integer set to 1 if the
1069 1069 currency_symbol or int_curr_symbol precedes
1070 1070 the value for a monetary quantity with a negative
1071 1071 value, and set to 0 if the symbol succeeds the
1072 1072 value.
1073 1073
1074 1074
1075 1075 n_sep_by_space
1076 1076 In an application conforming to the SUSv3
1077 1077 standard, an integer set to 0 if no space
1078 1078 separates the currency_symbol from the value for
1079 1079 a monetary quantity with a negative value, set to
1080 1080 1 if a space separates the symbol from the value,
1081 1081 and set to 2 if a space separates the symbol and
1082 1082 the sign string, if adjacent.
1083 1083
1084 1084 In an application not conforming to the SUSv3
1085 1085 standard, an integer set to 0 if no space
1086 1086 separates the currency_symbol or int_curr_symbol
1087 1087 from the value for a monetary quantity with a
1088 1088 negative value, set to 1 if a space separates the
1089 1089 symbol from the value, and set to 2 if a space
1090 1090 separates the symbol and the sign string, if
1091 1091 adjacent.
1092 1092
1093 1093
1094 1094 p_sign_posn
1095 1095 An integer set to a value indicating the
1096 1096 positioning of the positive_sign for a monetary
1097 1097 quantity with a non-negative value. The following
1098 1098 integer values are recognized for both
1099 1099 p_sign_posn and n_sign_posn:
1100 1100
1101 1101 In an application conforming to the SUSv3
1102 1102 standard:
1103 1103
1104 1104 0
1105 1105 Parentheses enclose the quantity and the
1106 1106 currency_symbol.
1107 1107
1108 1108
1109 1109 1
1110 1110 The sign string precedes the quantity and
1111 1111 the currency_symbol.
1112 1112
1113 1113
1114 1114 2
1115 1115 The sign string succeeds the quantity and
1116 1116 the currency_symbol.
1117 1117
1118 1118
1119 1119 3
1120 1120 The sign string precedes the
1121 1121 currency_symbol.
1122 1122
1123 1123
1124 1124 4
1125 1125 The sign string succeeds the
1126 1126 currency_symbol.
1127 1127
1128 1128 In an application not conforming to the SUSv3
1129 1129 standard:
1130 1130
1131 1131 0
1132 1132 Parentheses enclose the quantity and the
1133 1133 currency_symbol or int_curr_symbol.
1134 1134
1135 1135
1136 1136 1
1137 1137 The sign string precedes the quantity and
1138 1138 the currency_symbol or int_curr_symbol.
1139 1139
1140 1140
1141 1141 2
1142 1142 The sign string succeeds the quantity and
1143 1143 the currency_symbol or int_curr_symbol.
1144 1144
1145 1145
1146 1146 3
1147 1147 The sign string precedes the currency_symbol
1148 1148 or int_curr_symbol.
1149 1149
1150 1150
1151 1151 4
1152 1152 The sign string succeeds the currency_symbol
1153 1153 or int_curr_symbol.
1154 1154
1155 1155
1156 1156
1157 1157 n_sign_posn
1158 1158 An integer set to a value indicating the
1159 1159 positioning of the negative_sign for a negative
1160 1160 formatted monetary quantity.
1161 1161
1162 1162
1163 1163 int_p_cs_precedes
1164 1164 An integer set to 1 if the int_curr_symbol
1165 1165 precedes the value for a monetary quantity with a
1166 1166 non-negative value, and set to 0 if the symbol
1167 1167 succeeds the value.
1168 1168
1169 1169
1170 1170 int_n_cs_precedes
1171 1171 An integer set to 1 if the int_curr_symbol
1172 1172 precedes the value for a monetary quantity with a
1173 1173 negative value, and set to 0 if the symbol
1174 1174 succeeds the value.
1175 1175
1176 1176
1177 1177 int_p_sep_by_space
1178 1178 An integer set to 0 if no space separates the
1179 1179 int_curr_symbol from the value for a monetary
1180 1180 quantity with a non-negative value, set to 1 if a
1181 1181 space separates the symbol from the value, and
1182 1182 set to 2 if a space separates the symbol and the
1183 1183 sign string, if adjacent.
1184 1184
1185 1185
1186 1186 int_n_sep_by_space
1187 1187 An integer set to 0 if no space separates the
1188 1188 int_curr_symbol from the value for a monetary
1189 1189 quantity with a negative value, set to 1 if a
1190 1190 space separates the symbol from the value, and
1191 1191 set to 2 if a space separates the symbol and the
1192 1192 sign string, if adjacent.
1193 1193
1194 1194
1195 1195 int_p_sign_posn
1196 1196 An integer set to a value indicating the
1197 1197 positioning of the positive_sign for a positive
1198 1198 monetary quantity formatted with the
1199 1199 international format. The following integer
1200 1200 values are recognized for int_p_sign_posn and
1201 1201 int_n_sign_posn:
1202 1202
1203 1203 0
1204 1204 Parentheses enclose the quantity and the
1205 1205 int_curr_symbol.
1206 1206
1207 1207
1208 1208 1
1209 1209 The sign string precedes the quantity and
1210 1210 the int_curr_symbol.
1211 1211
1212 1212
1213 1213 2
1214 1214 The sign string succeeds the quantity and
1215 1215 the int_curr_symbol.
1216 1216
1217 1217
1218 1218 3
1219 1219 The sign string precedes the
1220 1220 int_curr_symbol.
1221 1221
1222 1222
1223 1223 4
1224 1224 The sign string succeeds the
1225 1225 int_curr_symbol.
1226 1226
1227 1227
1228 1228
1229 1229 int_n_sign_posn
1230 1230 An integer set to a value indicating the
1231 1231 positioning of the negative_sign for a negative
1232 1232 monetary quantity formatted with the
1233 1233 international format.
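
The grouping rule given for the mon_grouping keyword (and, identically, for the grouping keyword of LC_NUMERIC) can be sketched as follows. The function name group_digits is hypothetical; the operand is passed as a list of integers rather than as a semicolon-separated string.

```python
def group_digits(digits, grouping, sep):
    """Insert sep into a string of digits according to a grouping list.

    The first integer sizes the group nearest the decimal delimiter.
    A trailing -1 stops grouping; otherwise the last size is reused.
    """
    sizes = list(grouping)
    groups = []      # finished groups, leftmost first
    rest = digits    # digits not yet grouped
    size = None      # size of the most recent group
    while rest:
        if sizes:
            nxt = sizes.pop(0)
            if nxt == -1:
                break            # no further grouping
            size = nxt
        if size is None or size <= 0 or len(rest) <= size:
            break                # nothing left to split off
        groups.insert(0, rest[-size:])
        rest = rest[:-size]
    groups.insert(0, rest)
    return sep.join(groups)


for operand in ([3, -1], [3], [3, 2, -1], [3, 2], [-1]):
    print(operand, group_digits("123456789", operand, "'"))
```

Run against the value 123456789 with ' as the separator, the loop prints the formatted values for the five mon_grouping operands used in the example under that keyword.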
1234 1234
1235 1235
1236 1236
1237 1237 The following table shows the result of various combinations:
1238 1238
1239 1239
1240 1240
1241 1241
1242 1242                                    p_sep_by_space
1243 1243                                    2          1          0
1244 1244 p_cs_precedes= 1  p_sign_posn= 0   ($1.25)    ($ 1.25)   ($1.25)
1245 1245                   p_sign_posn= 1   + $1.25    +$ 1.25    +$1.25
1246 1246                   p_sign_posn= 2   $1.25 +    $ 1.25+    $1.25+
1247 1247                   p_sign_posn= 3   + $1.25    +$ 1.25    +$1.25
1248 1248                   p_sign_posn= 4   $ +1.25    $+ 1.25    $+1.25
1249 1249 p_cs_precedes= 0  p_sign_posn= 0   (1.25 $)   (1.25 $)   (1.25$)
1250 1250                   p_sign_posn= 1   +1.25 $    +1.25 $    +1.25$
1251 1251                   p_sign_posn= 2   1.25$ +    1.25 $+    1.25$+
1252 1252                   p_sign_posn= 3   1.25+ $    1.25 +$    1.25+$
1253 1253                   p_sign_posn= 4   1.25$ +    1.25 $+    1.25$+
1254 1254
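
The interaction of the cs_precedes, sep_by_space, and sign_posn keywords can be sketched as below. This is a simplified illustration, not the formatting code of any library: the function name format_monetary is hypothetical, and only sep_by_space values 0 and 1 are handled, with 1 taken to mean that a space separates the symbol (together with an adjacent sign string) from the value.

```python
def format_monetary(value, symbol, sign, cs_precedes, sep_by_space, sign_posn):
    """Lay out a value, a currency symbol and a sign string."""
    if sep_by_space not in (0, 1):
        raise ValueError("this sketch handles sep_by_space 0 and 1 only")
    if sign_posn not in (0, 1, 2, 3, 4):
        raise ValueError("sign_posn must be in the range 0 to 4")
    # Positions 3 and 4 attach the sign string directly to the symbol.
    if sign_posn == 3:
        unit = sign + symbol
    elif sign_posn == 4:
        unit = symbol + sign
    else:
        unit = symbol
    space = " " if sep_by_space == 1 else ""
    body = unit + space + value if cs_precedes else value + space + unit
    if sign_posn == 0:
        return "(" + body + ")"      # parentheses enclose quantity and symbol
    if sign_posn == 1:
        return sign + body           # sign precedes quantity and symbol
    if sign_posn == 2:
        return body + sign           # sign succeeds quantity and symbol
    return body


print(format_monetary("1.25", "$", "+", 0, 1, 3))  # 1.25 +$
print(format_monetary("1.25", "$", "+", 0, 0, 2))  # 1.25$+
print(format_monetary("1.25", "$", "+", 1, 1, 4))  # $+ 1.25
```

For p_cs_precedes set to 0, the results agree with the columns for p_sep_by_space values 1 and 0 in the table above.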
1255 1255
1256 1256
1257 1257 The monetary formatting definitions for the POSIX locale follow. The
1258 1258 code listing depicts the localedef(1) input, the table representing the
1259 1259 same information with the addition of localeconv(3C) and
1260 1260 nl_langinfo(3C) formats. All values are unspecified in the POSIX
1261 1261 locale.
1262 1262
1263 1263 LC_MONETARY
1264 1264 # This is the POSIX locale definition for
1265 1265 # the LC_MONETARY category.
1266 1266 #
1267 1267 int_curr_symbol ""
1268 1268 currency_symbol ""
1269 1269 mon_decimal_point ""
1270 1270 mon_thousands_sep ""
1271 1271 mon_grouping -1
1272 1272 positive_sign ""
1273 1273 negative_sign ""
1274 1274 int_frac_digits -1
1275 1275 frac_digits -1
1276 1276 p_cs_precedes -1
1277 1277 p_sep_by_space -1
1278 1278 n_cs_precedes -1
1279 1279 n_sep_by_space -1
1280 1280 p_sign_posn -1
1281 1281 n_sign_posn -1
1282 1282 int_p_cs_precedes -1
1283 1283 int_p_sep_by_space -1
1284 1284 int_n_cs_precedes -1
1285 1285 int_n_sep_by_space -1
1286 1286 int_p_sign_posn -1
1287 1287 int_n_sign_posn -1
1288 1288 #
1289 1289 END LC_MONETARY
1290 1290
1291 1291
1292 1292
1293 1293
1294 1294 The entry n/a indicates that the value is not available in the POSIX
1295 1295 locale.
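
As a quick check of the "unspecified" convention, the sketch below reads the monetary items of the POSIX locale through Python's locale module, which wraps the C localeconv() function. It assumes a platform where setting the "C" locale succeeds; the helper name posix_monetary_items is hypothetical.

```python
import locale


def posix_monetary_items():
    """Return the localeconv() items as seen in the "C" (POSIX) locale."""
    locale.setlocale(locale.LC_ALL, "C")
    return locale.localeconv()


conv = posix_monetary_items()
# Unspecified strings come back empty; unspecified integers as CHAR_MAX.
print(repr(conv["currency_symbol"]), repr(conv["int_curr_symbol"]))
print(conv["frac_digits"], conv["p_sign_posn"], locale.CHAR_MAX)
```

The integer items set to -1 in the locale definition are reported as {CHAR_MAX}, as described at the start of this category.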
1296 1296
1297 1297 LC_NUMERIC
1298 1298 The LC_NUMERIC category defines the rules and symbols that will be
1299 1299 used to format non-monetary numeric information. This information is
1300 1300 available through the localeconv(3C) function.
1301 1301
1302 1302
1303 1303 The following items are defined in this category of the locale. The
1304 1304 item names are the keywords recognized by the localedef utility when
1305 1305 defining a locale. They are also similar to the member names of the
1306 1306 lconv structure defined in <locale.h>. The localeconv() function
1307 1307 returns {CHAR_MAX} for unspecified integer items and the empty string
1308 1308 ("") for unspecified or size zero string items.
1309 1309
1310 1310
1311 1311 In a locale definition file the operands are strings. For some
1312 1312 keywords, the strings can contain only integers. Keywords that are not
1313 1313 provided, string values set to the empty string (""), or integer
1314 1314 keywords set to -1, will be used to indicate that the value is not
1315 1315 available in the locale. The following keywords are recognized:
1316 1316
1317 1317 decimal_point
1318 1318 The operand is a string containing the symbol that is
1319 1319 used as the decimal delimiter (radix character) in
1320 1320 numeric, non-monetary formatted quantities. This
1321 1321 keyword cannot be omitted and cannot be set to the
1322 1322 empty string. In contexts where standards limit the
1323 1323 decimal_point to a single byte, the result of
1324 1324 specifying a multi-byte operand is unspecified.
1325 1325
1326 1326
1327 1327 thousands_sep
1328 1328 The operand is a string containing the symbol that is
1329 1329 used as a separator for groups of digits to the left
1330 1330 of the decimal delimiter in numeric, non-monetary
1331 1331 formatted quantities. In contexts where
1332 1332 standards limit the thousands_sep to a single byte,
1333 1333 the result of specifying a multi-byte operand is
1334 1334 unspecified.
1335 1335
1336 1336
1337 1337 grouping
1338 1338 Define the size of each group of digits in formatted
1339 1339 non-monetary quantities. The operand is a sequence of
1340 1340 integers separated by semicolons. Each integer
1341 1341 specifies the number of digits in each group, with the
1342 1342 initial integer defining the size of the group
1343 1343 immediately preceding the decimal delimiter, and the
1344 1344 following integers defining the preceding groups. If
1345 1345 the last integer is not -1, then the size of the
1346 1346 previous group (if any) will be repeatedly used for
1347 1347 the remainder of the digits. If the last integer is
1348 1348 -1, then no further grouping will be performed.
1349 1349
1350 1350 The non-monetary numeric formatting definitions for the POSIX locale
1351 1351 follow. The code listing depicts the localedef input; the table
1352 1352 represents the same information, with the addition of localeconv
1353 1353 values and nl_langinfo constants.
1354 1354
1355 1355 LC_NUMERIC
1356 1356 # This is the POSIX locale definition for
1357 1357 # the LC_NUMERIC category.
1358 1358 #
1359 1359 decimal_point "<period>"
1360 1360 thousands_sep ""
1361 1361 grouping -1
1362 1362 #
1363 1363 END LC_NUMERIC
1364 1364
1365 1365
1366 1366
1367 1367
1368 1368
1369 1369
1370 1370
1371 1371                 POSIX locale   langinfo    localeconv()   localedef
1372 1372 Item            Value          Constant    Value          Value
1373 1373 --------------------------------------------------------------------
1374 1374 decimal_point   "."            RADIXCHAR   "."            .
1375 1375 thousands_sep   n/a            THOUSEP     ""             ""
1376 1376 grouping        n/a            -           ""             -1
1377 1377
1378 1378
1379 1379
1380 1380 The entry n/a indicates that the value is not available in the POSIX
1381 1381 locale.
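
The langinfo constants in the table can be queried as sketched below, here through Python's locale module, which exposes nl_langinfo() on UNIX-like systems. The helper name posix_numeric_items is hypothetical.

```python
import locale


def posix_numeric_items():
    """Return (radix character, thousands separator) for the "C" locale."""
    locale.setlocale(locale.LC_ALL, "C")
    radix = locale.nl_langinfo(locale.RADIXCHAR)
    thousep = locale.nl_langinfo(locale.THOUSEP)
    return radix, thousep


radix, thousep = posix_numeric_items()
print(repr(radix), repr(thousep))
```

In the POSIX locale the radix character is a period and the thousands separator is the empty string, matching the decimal_point and thousands_sep rows.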
1382 1382
1383 1383 LC_TIME
1384 1384 The LC_TIME category defines the interpretation of the field
1385 1385 descriptors supported by date(1) and affects the behavior of the
1386 1386 strftime(3C), wcsftime(3C), strptime(3C), and nl_langinfo(3C)
1387 1387 functions. Because the interfaces for C-language access and locale
1388 1388 definition differ significantly, they are described separately. For
1389 1389 locale definition, the following mandatory keywords are recognized:
1390 1390
1391 1391 abday
1392 1392 Define the abbreviated weekday names, corresponding to
1393 1393 the %a field descriptor (conversion specification in the
1394 1394 strftime(), wcsftime(), and strptime() functions). The
1395 1395 operand consists of seven semicolon-separated strings,
1396 1396 each surrounded by double-quotes. The first string is
1397 1397 the abbreviated name of the day corresponding to Sunday,
1398 1398 the second the abbreviated name of the day corresponding
1399 1399 to Monday, and so on.
1400 1400
1401 1401
1402 1402 day
1403 1403 Define the full weekday names, corresponding to the %A
1404 1404 field descriptor. The operand consists of seven
1405 1405 semicolon-separated strings, each surrounded by double-
1406 1406 quotes. The first string is the full name of the day
1407 1407 corresponding to Sunday, the second the full name of the
1408 1408 day corresponding to Monday, and so on.
1409 1409
1410 1410
1411 1411 abmon
1412 1412 Define the abbreviated month names, corresponding to the
1413 1413 %b field descriptor. The operand consists of twelve
1414 1414 semicolon-separated strings, each surrounded by double-
1415 1415 quotes. The first string is the abbreviated name of the
1416 1416 first month of the year (January), the second the
1417 1417 abbreviated name of the second month, and so on.
1418 1418
1419 1419
1420 1420 mon
1421 1421 Define the full month names, corresponding to the %B
1422 1422 field descriptor. The operand consists of twelve
1423 1423 semicolon-separated strings, each surrounded by double-
1424 1424 quotes. The first string is the full name of the first
1425 1425 month of the year (January), the second the full name of
1426 1426 the second month, and so on.
1427 1427
1428 1428
1429 1429 d_t_fmt
1430 1430 Define the appropriate date and time representation,
1431 1431 corresponding to the %c field descriptor. The operand
1432 1432 consists of a string, and can contain any combination of
1433 1433 characters and field descriptors. In addition, the
1434 1434 string can contain the escape sequences \\, \a, \b, \f,
1435 1435 \n, \r, \t, \v.
1436 1436
1437 1437
1438 1438 date_fmt
1439 1439 Define the appropriate date and time representation,
1440 1440 corresponding to the %C field descriptor. The operand
1441 1441 consists of a string, and can contain any combination of
1442 1442 characters and field descriptors. In addition, the
1443 1443 string can contain the escape sequences \\, \a, \b, \f,
1444 1444 \n, \r, \t, \v.
1445 1445
1446 1446
1447 1447 d_fmt
1448 1448 Define the appropriate date representation,
1449 1449 corresponding to the %x field descriptor. The operand
1450 1450 consists of a string, and can contain any combination of
1451 1451 characters and field descriptors. In addition, the
1452 1452 string can contain the escape sequences \\, \a, \b, \f,
1453 1453 \n, \r, \t, \v.
1454 1454
1455 1455
1456 1456 t_fmt
1457 1457 Define the appropriate time representation,
1458 1458 corresponding to the %X field descriptor. The operand
1459 1459 consists of a string, and can contain any combination of
1460 1460 characters and field descriptors. In addition, the
1461 1461 string can contain the escape sequences \\, \a, \b, \f,
1462 1462 \n, \r, \t, \v.
1463 1463
1464 1464
1465 1465 am_pm
1466 1466 Define the appropriate representation of the ante
1467 1467 meridiem and post meridiem strings, corresponding to the
1468 1468 %p field descriptor. The operand consists of two
1469 1469 strings, separated by a semicolon, each surrounded by
1470 1470 double-quotes. The first string represents the ante
1471 1471 meridiem designation, the last string the post meridiem
1472 1472 designation.
1473 1473
1474 1474
1475 1475 t_fmt_ampm
1476 1476 Define the appropriate time representation in the
1477 1477 12-hour clock format with am_pm, corresponding to the %r
1478 1478 field descriptor. The operand consists of a string and
1479 1479 can contain any combination of characters and field
1480 1480 descriptors. If the string is empty, the 12-hour format
1481 1481 is not supported in the locale.
1482 1482
1483 1483
1484 1484 era
1485 1485 Define how years are counted and displayed for each era
1486 1486 in a locale. The operand consists of semicolon-separated
1487 1487 strings. Each string is an era description segment with
1488 1488 the format:
1489 1489
1490 1490 direction:offset:start_date:end_date:era_name:era_format
1491 1491
1492 1492 according to the definitions below. There can be as
1493 1493 many era description segments as are necessary to
1494 1494 describe the different eras.
1495 1495
1496 1496 The start of an era might not be the earliest point in the era. For
1497 1497 example, the Christian era B.C. starts on the day before
1498 1498 January 1, A.D. 1, and increases with earlier time.
1499 1499
1500 1500 direction
1501 1501 Either a + or a - character. The +
1502 1502 character indicates that years closer to
1503 1503 the start_date have lower numbers than
1504 1504 those closer to the end_date. The -
1505 1505 character indicates that years closer to
1506 1506 the start_date have higher numbers than
1507 1507 those closer to the end_date.
1508 1508
1509 1509
1510 1510 offset
1511 1511 The number of the year closest to the
1512 1512 start_date in the era, corresponding to
1513 1513 the %Eg and %Ey field descriptors.
1514 1514
1515 1515
1516 1516 start_date
1517 1517 A date in the form yyyy/mm/dd, where yyyy,
1518 1518 mm, and dd are the year, month and day
1519 1519 numbers respectively of the start of the
1520 1520 era. Years prior to A.D. 1 are represented
1521 1521 as negative numbers.
1522 1522
1523 1523
1524 1524 end_date
1525 1525 The ending date of the era, in the same
1526 1526 format as the start_date, or one of the
1527 1527 two special values -* or +*. The value -*
1528 1528 indicates that the ending date is the
1529 1529 beginning of time. The value +* indicates
1530 1530 that the ending date is the end of time.
1531 1531
1532 1532
1533 1533 era_name
1534 1534 A string representing the name of the era,
1535 1535 corresponding to the %EC field descriptor.
1536 1536
1537 1537
1538 1538 era_format
1539 1539 A string for formatting the year in the
1540 1540 era, corresponding to the %EG and %EY
1541 1541 field descriptors.
1542 1542
1543 1543
1544 1544
1545 1545 era_d_fmt
1546 1546 Define the format of the date in alternative era
1547 1547 notation, corresponding to the %Ex field descriptor.
1548 1548
1549 1549
1550 1550 era_t_fmt
1551 1551 Define the locale's appropriate alternative time format,
1552 1552 corresponding to the %EX field descriptor.
1553 1553
1554 1554
1555 1555 era_d_t_fmt
1556 1556 Define the locale's appropriate alternative date and
1557 1557 time format, corresponding to the %Ec field descriptor.
1558 1558
1559 1559
1560 1560 alt_digits
1561 1561 Define alternative symbols for digits, corresponding to
1562 1562 the %O field descriptor modifier. The operand consists
1563 1563 of semicolon-separated strings, each surrounded by
1564 1564 double-quotes. The first string is the alternative
1565 1565 symbol corresponding with zero, the second string the
1566 1566 symbol corresponding with one, and so on. Up to 100
1567 1567 alternative symbol strings can be specified. The %O
1568 1568 modifier indicates that the string corresponding to the
1569 1569 value specified via the field descriptor will be used
1570 1570 instead of the value.
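
The era description segment format given under the era keyword can be split into its six fields as sketched below. The names Era, parse_era_date, and parse_era_segment are hypothetical, and the sample segment is an assumed illustration of the B.C. case rather than text taken from a shipped locale. The sketch assumes era names contain no colon.

```python
from collections import namedtuple

Era = namedtuple("Era", "direction offset start_date end_date name format")


def parse_era_date(text):
    """Return (year, month, day), or the special values "-*" and "+*" as is."""
    if text in ("-*", "+*"):
        return text
    year, month, day = text.split("/")
    return (int(year), int(month), int(day))


def parse_era_segment(segment):
    """Split direction:offset:start_date:end_date:era_name:era_format."""
    direction, offset, start, end, name, fmt = segment.split(":", 5)
    if direction not in ("+", "-"):
        raise ValueError("direction must be + or -")
    return Era(direction, int(offset), parse_era_date(start),
               parse_era_date(end), name, fmt)


# Assumed sample: years before A.D. 1, counting upward with earlier time.
era = parse_era_segment("+:1:-0001/12/31:-*:B.C.:%Ey %EC")
print(era)
```

Limiting the split to five separators keeps any colon inside era_format intact. A full era operand would first be split on semicolons into its segments.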
1571 1571
1572 1572
1573 1573 LC_TIME C-language Access
1574 1574 The following information can be accessed. These correspond to
1575 1575 constants defined in <langinfo.h> and used as arguments to the
1576 1576 nl_langinfo(3C) function.
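
A short sketch of this access path, using Python's locale module as a wrapper around nl_langinfo() on UNIX-like systems, is shown here. The helper name posix_names is hypothetical; the constants used are those described in the list that follows.

```python
import locale


def posix_names():
    """Return abbreviated weekday and full month names for the "C" locale."""
    locale.setlocale(locale.LC_ALL, "C")
    abdays = [locale.nl_langinfo(getattr(locale, "ABDAY_%d" % i))
              for i in range(1, 8)]
    months = [locale.nl_langinfo(getattr(locale, "MON_%d" % i))
              for i in range(1, 13)]
    return abdays, months


abdays, months = posix_names()
print(abdays)
print(months)
```

Index 1 corresponds to Sunday for the weekday constants and to January for the month constants.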
1577 1577
1578 1578 ABDAY_x
1579 1579 The abbreviated weekday names (for example Sun), where x
1580 1580 is a number from 1 to 7.
1581 1581
1582 1582
1583 1583 DAY_x
1584 1584 The full weekday names (for example Sunday), where x is
1585 1585 a number from 1 to 7.
1586 1586
1587 1587
1588 1588 ABMON_x
1589 1589 The abbreviated month names (for example Jan), where x
1590 1590 is a number from 1 to 12.
1591 1591
1592 1592
1593 1593 MON_x
1594 1594 The full month names (for example January), where x is a
1595 1595 number from 1 to 12.
1596 1596
1597 1597
1598 1598 D_T_FMT
1599 1599 The appropriate date and time representation.
1600 1600
1601 1601
1602 1602 D_FMT
1603 1603 The appropriate date representation.
1604 1604
1605 1605
1606 1606 T_FMT
1607 1607 The appropriate time representation.
1608 1608
1609 1609
1610 1610 AM_STR
1611 1611 The appropriate ante-meridiem affix.
1612 1612
1613 1613
1614 1614 PM_STR
1615 1615 The appropriate post-meridiem affix.
1616 1616
1617 1617
1618 1618 T_FMT_AMPM
1619 1619 The appropriate time representation in the 12-hour clock
1620 1620 format with AM_STR and PM_STR.
1621 1621
1622 1622
1623 1623 ERA
1624 1624 The era description segments, which describe how years
1625 1625 are counted and displayed for each era in a locale. Each
1626 1626 era description segment has the format:
1627 1627
1628 1628 direction:offset:start_date:end_date:era_name:era_format
1629 1629
1630 1630
1631 1631 according to the definitions below. There will be as
1632 1632 many era description segments as are necessary to
1633 1633 describe the different eras. Era description segments
1634 1634 are separated by semicolons.
1635 1635
1636 1636 The start of an era might not be the earliest point. For
1637 1637 example, the Christian era B.C. starts on the day before
1638 1638 January 1, A.D. 1, and increases with earlier time.
1639 1639
1640 1640 direction
1641 1641 Either a + or a - character. The +
1642 1642 character indicates that years closer to
1643 1643 the start_date have lower numbers than
1644 1644 those closer to the end_date. The -
1645 1645 character indicates that years closer to
1646 1646 the start_date have higher numbers than
1647 1647 those closer to the end_date.
1648 1648
1649 1649
1650 1650 offset
1651 1651 The number of the year closest to the
1652 1652 start_date in the era.
1653 1653
1654 1654
1655 1655 start_date
1656 1656 A date in the form yyyy/mm/dd, where yyyy,
1657 1657 mm, and dd are the year, month and day
1658 1658 numbers respectively of the start of the
1659 1659 era. Years prior to AD 1 are represented
1660 1660 as negative numbers.
1661 1661
1662 1662
1663 1663 end_date
1664 1664 The ending date of the era, in the same
1665 1665 format as the start_date, or one of the
1666 1666 two special values, -* or +*. The value -*
1667 1667 indicates that the ending date is the
1668 1668 beginning of time. The value +* indicates
1669 1669 that the ending date is the end of time.
1670 1670
1671 1671
1672 1672 era_name
1673 1673 The era, corresponding to the %EC
1674 1674 conversion specification.
1675 1675
1676 1676
1677 1677 era_format
1678 1678 The format of the year in the era,
1679 1679 corresponding to the %EY conversion
1680 1680 specification.
1681 1681
1682 1682
1683 1683
1684 1684 ERA_D_FMT
1685 1685 The era date format.
1686 1686
1687 1687
1688 1688 ERA_T_FMT
1689 1689 The locale's appropriate alternative time format,
1690 1690 corresponding to the %EX field descriptor.
1691 1691
1692 1692
1693 1693 ERA_D_T_FMT
1694 1694 The locale's appropriate alternative date and time
1695 1695 format, corresponding to the %Ec field descriptor.
1696 1696
1697 1697
1698 1698 ALT_DIGITS
1699 1699 The alternative symbols for digits, corresponding to the
1700 1700 %O conversion specification modifier. The value consists
1701 1701 of semicolon-separated symbols. The first is the
1702 1702 alternative symbol corresponding to zero, the second is
1703 1703 the symbol corresponding to one, and so on. Up to 100
1704 1704 alternative symbols may be specified. The following
1705 1705 table displays the correspondence between the items
1706 1706 described above and the conversion specifiers used by
1707 1707 date(1) and the strftime(3C), wcsftime(3C), and
1708 1708 strptime(3C) functions.
1709 1709
1710 1710
1711 1711
1712 1712
1713 1713
1714 1714 +------------+-------------+---------------+
1715 1715 | localedef | langinfo | Conversion |
1716 1716 | Keyword | Constant | Specifier |
1717 1717 +------------+-------------+---------------+
1718 1718 | abday | ABDAY_x | %a |
1719 1719 | day | DAY_x | %A |
1720 1720 | abmon | ABMON_x | %b |
1721 1721 | mon | MON_x | %B |
1722 1722 | d_t_fmt | D_T_FMT | %c |
1723 1723 | date_fmt | DATE_FMT | %C |
1724 1724 | d_fmt | D_FMT | %x |
1725 1725 | t_fmt | T_FMT | %X |
1726 1726 | am_pm | AM_STR | %p |
1727 1727 | am_pm | PM_STR | %p |
1728 1728 |t_fmt_ampm | T_FMT_AMPM | %r |
1729 1729 | era | ERA | %EC, %Eg, |
1730 1730 | | | %EG, %Ey, %EY |
1731 1731 | era_d_fmt | ERA_D_FMT | %Ex |
1732 1732 | era_t_fmt | ERA_T_FMT | %EX |
1733 1733 |era_d_t_fmt | ERA_D_T_FMT | %Ec |
1734 1734 |alt_digits | ALT_DIGITS | %O |
1735 1735 +------------+-------------+---------------+
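
The correspondence in the table above can be sketched in C. This is a minimal illustration, not part of the manual page: the helper names format_with_d_t_fmt() and abday_name() are hypothetical, and the sketch assumes only the standard nl_langinfo(3C) and strftime(3C) interfaces.

```c
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>
#include <time.h>

/*
 * Hypothetical helper: format *tm using the current locale's D_T_FMT
 * string, which is the format that the %c conversion specifier selects.
 * Returns the number of bytes written, or 0 if buf is too small.
 */
size_t
format_with_d_t_fmt(char *buf, size_t buflen, const struct tm *tm)
{
	const char *fmt = nl_langinfo(D_T_FMT);

	return (strftime(buf, buflen, fmt, tm));
}

/*
 * Hypothetical helper: return the abbreviated weekday name for wday
 * (0 = Sunday through 6 = Saturday, as in tm_wday), or NULL if wday is
 * out of range. A lookup table is used so that nothing is assumed
 * about the numeric values of the ABDAY_x constants.
 */
const char *
abday_name(int wday)
{
	static const nl_item items[7] = {
		ABDAY_1, ABDAY_2, ABDAY_3, ABDAY_4, ABDAY_5, ABDAY_6, ABDAY_7
	};

	if (wday < 0 || wday > 6)
		return (NULL);
	return (nl_langinfo(items[wday]));
}
```

A caller would normally invoke setlocale(LC_ALL, "") first so that the values reflect the user's environment rather than the POSIX locale.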
1736 1736
1737 1737 LC_TIME General Information
1738 1738 Although certain of the field descriptors in the POSIX locale (such as
1739 1739 the name of the month) are shown with initial capital letters, this
1740 1740 need not be the case in other locales. Programs using these fields may
1741 1741 need to adjust the capitalization if the output is going to be used at
1742 1742 the beginning of a sentence.
1743 1743
1744 1744
1745 1745 The LC_TIME descriptions of abday, day, mon, and abmon imply a
1746 1746 Gregorian style calendar (7-day weeks, 12-month years, leap years, and
1747 1747 so forth). Formatting time strings for other types of calendars is
1748 1748 outside the scope of this document set.
1749 1749
1750 1750
1751 1751 As specified under date in Locale Definition and strftime(3C), the
1752 1752 field descriptors corresponding to the optional keywords consist of a
1753 1753 modifier followed by a traditional field descriptor (for instance %Ex).
1754 1754 If the optional keywords are not supported by the implementation or are
1755 1755 unspecified for the current locale, these field descriptors are treated
1756 1756 as the traditional field descriptor. For instance, assume the following
1757 1757 keywords:
1758 1758
1759 1759 alt_digits "0th" ; "1st" ; "2nd" ; "3rd" ; "4th" ; "5th" ; \
1760 1760 "6th" ; "7th" ; "8th" ; "9th" ; "10th"
1761 1761 d_fmt "The %Od day of %B in %Y"
1762 1762
1763 1763
1764 1764
1765 1765
1766 1766 On 7/4/1776, the %x field descriptor would result in "The 4th day of
1767 1767 July in 1776" while 7/14/1789 would come out as "The 14 day of July in
1768 1768 1789". The above example is for illustrative purposes only. The %O
1769 1769 modifier is primarily intended to provide for Kanji or Hindi digits in
1770 1770 date formats.
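
The fallback behavior in this example can be modeled with a short C sketch. It is illustrative only: the alt_digits array is a hard-coded copy of the hypothetical keyword value above, and format_od() is an assumed helper name, not a library function. A real implementation obtains the symbols from the locale.

```c
#include <stdio.h>

/* Hard-coded copy of the illustrative alt_digits operand (0 through 10). */
static const char *const alt_digits[] = {
	"0th", "1st", "2nd", "3rd", "4th", "5th",
	"6th", "7th", "8th", "9th", "10th"
};

#define	N_ALT_DIGITS	(sizeof (alt_digits) / sizeof (alt_digits[0]))

/*
 * Hypothetical model of %Od: write the alternative symbol for value if
 * one is defined; otherwise fall back to the two-digit decimal form
 * that the traditional %d descriptor would produce.
 * Returns the snprintf() result.
 */
int
format_od(char *buf, size_t buflen, unsigned int value)
{
	if (value < N_ALT_DIGITS)
		return (snprintf(buf, buflen, "%s", alt_digits[value]));
	return (snprintf(buf, buflen, "%02u", value));
}
```

With these symbols, format_od() yields "4th" for 4 and "14" for 14, matching the two dates discussed above.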
1771 1771
1772 1772 LC_MESSAGES
1773 1773 The LC_MESSAGES category defines the format and values for affirmative
1774 1774 and negative responses.
1775 1775
1776 1776
1777 1777 The following keywords are recognized as part of the locale definition
1778 1778 file. The nl_langinfo(3C) function accepts upper-case versions of the
1779 1779 first four keywords.
1780 1780
1781 1781 yesexpr
1782 1782 The operand consists of an extended regular expression (see
1783 1783 regex(5)) that describes the acceptable affirmative response
1784 1784 to a question expecting an affirmative or negative response.
1785 1785
1786 1786
1787 1787 noexpr
1788 1788 The operand consists of an extended regular expression that
1789 1789 describes the acceptable negative response to a question
1790 1790 expecting an affirmative or negative response.
1791 1791
1792 1792
1793 1793 yesstr
1794 1794 The operand consists of a fixed string (not a regular
1795 1795 expression) that can be used by an application for
1796 1796 composition of a message that lists an acceptable
1797 1797 affirmative response, such as in a prompt.
1798 1798
1799 1799
1800 1800 nostr
1801 1801 The operand consists of a fixed string that can be used by
1802 1802 an application for composition of a message that lists an
1803 1803 acceptable negative response. The format and values for
1804 1804 affirmative and negative responses of the POSIX locale
1805 1805 follow. The code listing shows the localedef input; the
1806 1806 table presents the same information with the addition of
1807 1807 the nl_langinfo() constants.
1808 1808
1809 1809 LC_MESSAGES
1810 1810 # This is the POSIX locale definition for
1811 1811 # the LC_MESSAGES category.
1812 1812 #
1813 1813 yesexpr "<circumflex><left-square-bracket><y><Y>\
1814 1814 <right-square-bracket>"
1815 1815 #
1816 1816 noexpr "<circumflex><left-square-bracket><n><N>\
1817 1817 <right-square-bracket>"
1818 1818 #
1819 1819 yesstr "yes"
1820 1820 nostr "no"
1821 1821 END LC_MESSAGES
1822 1822
1823 1823
1824 1824
1825 1825
1826 1826
1827 1827
1828 1828
1829 1829 +------------------+-------------------+--------------------+
1830 1830 |localedef Keyword | langinfo Constant | POSIX Locale Value |
1831 1831 |yesexpr | YESEXPR | "^[yY]" |
1832 1832 |noexpr | NOEXPR | "^[nN]" |
1833 1833 |yesstr | YESSTR | "yes" |
1834 1834 |nostr | NOSTR | "no" |
1835 1835 +------------------+-------------------+--------------------+
1836 1836
1837 1837
1838 1838 In an application conforming to the SUSv3 standard, the information on
1839 1839 yesstr and nostr is not available.
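
An application might use the yesexpr and noexpr values along the following lines. This is a minimal sketch under stated assumptions: match_item() and classify_response() are hypothetical helper names, and the expressions are compiled with the standard regcomp() interface as extended regular expressions, as the keyword descriptions require.

```c
#include <langinfo.h>
#include <locale.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: compile the expression returned for item
 * (YESEXPR or NOEXPR) and test response against it.
 * Returns 1 on a match, 0 on no match, -1 if compilation fails.
 */
static int
match_item(nl_item item, const char *response)
{
	regex_t re;
	int rc;

	if (regcomp(&re, nl_langinfo(item), REG_EXTENDED | REG_NOSUB) != 0)
		return (-1);
	rc = regexec(&re, response, 0, NULL, 0);
	regfree(&re);
	return (rc == 0);
}

/*
 * Hypothetical helper: classify a user response in the current locale.
 * Returns 1 for affirmative, 0 for negative, -1 for anything else.
 */
int
classify_response(const char *response)
{
	if (match_item(YESEXPR, response) == 1)
		return (1);
	if (match_item(NOEXPR, response) == 1)
		return (0);
	return (-1);
}
```

In the POSIX locale, where the expressions are "^[yY]" and "^[nN]", any response beginning with y or Y is treated as affirmative and any response beginning with n or N as negative.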
1840 1840
1841 1841 SEE ALSO
1842 1842 date(1), locale(1), localedef(1), sort(1), tr(1), uniq(1),
1843 1843 localeconv(3C), nl_langinfo(3C), setlocale(3C), strcoll(3C),
1844 1844 strftime(3C), strptime(3C), strxfrm(3C), wcscoll(3C), wcsftime(3C),
1845 1845 wcsxfrm(3C), wctype(3C), attributes(5), charmap(5), extensions(5),
1846 1846 regex(5)
1847 1847
1848 1848
1849 1849
1850 - April 9, 2016 LOCALE(5)
1850 + May 16, 2020 LOCALE(5)