BYTEORDER(5)          Standards, Environments, and Macros         BYTEORDER(5)

NAME
     byteorder, endian - byte order and endianness

DESCRIPTION
     Integer values that occupy more than one byte in memory can be laid out
     in different ways on different platforms.  In particular, there is a
     major split between platforms that place the least significant byte of
     an integer at the lowest address and those that place the most
     significant byte there instead.  As this difference relates to which end
     of the integer is found in memory first, the term endian is used to
     refer to a particular byte order.

     A platform is referred to as using a big-endian byte order when it places
     the most significant byte at the lowest address, and little-endian when
     it places the least significant byte there.  Some platforms can also
     switch between big- and little-endian modes and run code compiled for
     either.
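
     The definition above can be turned into a run-time check.  The following
     is a minimal sketch rather than a system interface:
     host_is_little_endian() is a hypothetical helper name, and it simply
     inspects the byte stored at the lowest address of a known integer.

```c
#include <stdint.h>
#include <string.h>

/*
 * Returns 1 if the host stores the least significant byte at the lowest
 * address (little-endian) and 0 otherwise.  host_is_little_endian() is a
 * hypothetical helper, not part of any system header.
 */
int
host_is_little_endian(void)
{
	const uint16_t probe = 0x0001;
	unsigned char first;

	/* Copy the byte found at the lowest address of the integer. */
	memcpy(&first, &probe, 1);
	return (first == 0x01);
}
```

     A caller would typically branch on the result once, for example with
     if (host_is_little_endian()), before deciding whether bytes need to be
     reordered.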

     Historically, there have also been some systems that used middle-endian
     byte orders for integers larger than 2 bytes.  Such orderings are not in
     common use today.

     Endianness is of particular importance when dealing with values that are
     read into memory from an external source.  For example, network
     protocols such as IP conventionally define the fields in a packet as
     always being stored in big-endian byte order.  This means that a
     little-endian machine has to transform these fields in order to process
     them.
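
     One way to process such externally defined fields is to assemble the
     value from individual bytes with shifts.  The sketch below assumes a
     buffer that holds big-endian fields; read_be16() and read_be32() are
     hypothetical helper names.  Because shifts operate on values rather than
     on memory, the same code gives the same result on big- and little-endian
     hosts.

```c
#include <stdint.h>

/* Read a 16-bit big-endian field starting at buf[0]. */
uint16_t
read_be16(const unsigned char *buf)
{
	return ((uint16_t)(((unsigned int)buf[0] << 8) | buf[1]));
}

/* Read a 32-bit big-endian field starting at buf[0]. */
uint32_t
read_be32(const unsigned char *buf)
{
	return (((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
	    ((uint32_t)buf[2] << 8) | (uint32_t)buf[3]);
}
```

     For a packet whose first two bytes are 0x12 and 0x34, read_be16() yields
     0x1234 regardless of the host's byte order.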

   Examples
     To illustrate endianness in memory, consider the decimal integer
     2864434397, which is 0xAABBCCDD in hexadecimal.  This number fits in 32
     bits of storage (4 bytes).

     On a big-endian system, this integer would be written into memory as the
     bytes 0xAA, 0xBB, 0xCC, 0xDD, in order from lowest memory address to
     highest.

     On a little-endian system, it would be written instead as the bytes 0xDD,
     0xCC, 0xBB, 0xAA, in that order.

     If both the big- and little-endian systems were asked to store this
     integer at address 0x100, we would see the following in each system's
     memory:

                         Big-Endian

                +------+------+------+------+
                | 0xAA | 0xBB | 0xCC | 0xDD |
                +------+------+------+------+
                   ^^     ^^     ^^     ^^
                 0x100  0x101  0x102  0x103
                   vv     vv     vv     vv
                +------+------+------+------+
                | 0xDD | 0xCC | 0xBB | 0xAA |
                +------+------+------+------+

                        Little-Endian
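
     The layout shown above can be observed on a real machine by copying an
     integer's bytes out of memory.  This is an illustrative sketch;
     bytes_in_memory() and print_layout() are hypothetical names, and the
     output depends on the host that runs the code.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Copy the in-memory bytes of value into out[0..3], lowest address first. */
void
bytes_in_memory(uint32_t value, unsigned char out[4])
{
	memcpy(out, &value, sizeof (value));
}

/* Print the four bytes of value in the order this host stores them. */
void
print_layout(uint32_t value)
{
	unsigned char b[4];

	bytes_in_memory(value, b);
	(void) printf("0x%02X 0x%02X 0x%02X 0x%02X\n",
	    (unsigned int)b[0], (unsigned int)b[1],
	    (unsigned int)b[2], (unsigned int)b[3]);
}
```

     Calling print_layout(2864434397U) prints the bytes in the big-endian
     order from the diagram on a big-endian host and in the little-endian
     order on a little-endian host.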

     It is particularly important to note that even though the byte order
     differs between these two machines, the bit ordering within each byte
     is, by convention, still the same.

     For example, take the decimal integer 4660 (0x1234 in hexadecimal), which
     fits in 16 bits (2 bytes).

     On a big-endian system, this would be written into memory as 0x12, then
     0x34.

     On a little-endian system, it would be written as 0x34, then 0x12.  Note
     that this is not at all the same as seeing 0x43 then 0x21 in memory:
     only the bytes are reordered, not any bits (or nybbles) within them.

     As before, storing this at address 0x100:

                         Big-Endian

                       +------+------+
                       | 0x12 | 0x34 |
                       +------+------+
                          ^^     ^^
                        0x100  0x101
                          vv     vv
                       +------+------+
                       | 0x34 | 0x12 |
                       +------+------+

                        Little-Endian
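
     The difference between reordering bytes and reordering nybbles can be
     made concrete with a 16-bit byte swap.  The function below is a sketch
     with a hypothetical name, swap16(); it exchanges the two bytes and leaves
     the bits inside each byte untouched.

```c
#include <stdint.h>

/* Exchange the two bytes of a 16-bit value; bits within a byte are kept. */
uint16_t
swap16(uint16_t v)
{
	return ((uint16_t)((v << 8) | (v >> 8)));
}
```

     Applied to 0x1234, swap16() returns 0x3412, never 0x4321, and applying
     it twice returns the original value.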

     This example shows how an eight-byte number, 0xBADCAFFEDEADBEEF, is
     stored in both big- and little-endian byte orders:

                                Big-Endian

         +------+------+------+------+------+------+------+------+
         | 0xBA | 0xDC | 0xAF | 0xFE | 0xDE | 0xAD | 0xBE | 0xEF |
         +------+------+------+------+------+------+------+------+
            ^^     ^^     ^^     ^^     ^^     ^^     ^^     ^^
          0x100  0x101  0x102  0x103  0x104  0x105  0x106  0x107
            vv     vv     vv     vv     vv     vv     vv     vv
         +------+------+------+------+------+------+------+------+
         | 0xEF | 0xBE | 0xAD | 0xDE | 0xFE | 0xAF | 0xDC | 0xBA |
         +------+------+------+------+------+------+------+------+

                               Little-Endian

     The treatment of different endian values would not be complete without
     discussing PDP-endian, which is also known as middle-endian.  While the
     PDP-11 was a 16-bit little-endian system, it laid out 32-bit values
     differently from current little-endian systems.  First, it would divide
     a 32-bit number into two 16-bit words.  Each 16-bit word would be stored
     in little-endian order; however, the most significant 16-bit word would
     appear first in memory, followed by the least significant word.

     The following diagram illustrates PDP-endian and compares it against
     little-endian.  Here, we start with the value 0xAABBCCDD and show how
     its four bytes are laid out, starting at address 0x100.

                         PDP-Endian

                +------+------+------+------+
                | 0xBB | 0xAA | 0xDD | 0xCC |
                +------+------+------+------+
                   ^^     ^^     ^^     ^^
                 0x100  0x101  0x102  0x103
                   vv     vv     vv     vv
                +------+------+------+------+
                | 0xDD | 0xCC | 0xBB | 0xAA |
                +------+------+------+------+

                        Little-Endian
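
     The PDP-endian layout can be expressed as a short routine that splits
     the value into two 16-bit words and writes each word low byte first.
     This is a sketch for illustration only; store_pdp32() is a hypothetical
     name and not a system function.

```c
#include <stdint.h>

/* Write value into out[0..3] using the PDP-endian layout described above. */
void
store_pdp32(uint32_t value, unsigned char out[4])
{
	uint16_t high = (uint16_t)(value >> 16);	/* most significant word */
	uint16_t low = (uint16_t)(value & 0xFFFF);	/* least significant word */

	/* The most significant word comes first, stored little-endian. */
	out[0] = (unsigned char)(high & 0xFF);
	out[1] = (unsigned char)(high >> 8);
	/* The least significant word follows, also stored little-endian. */
	out[2] = (unsigned char)(low & 0xFF);
	out[3] = (unsigned char)(low >> 8);
}
```

     For 0xAABBCCDD this fills the buffer with 0xBB, 0xAA, 0xDD, 0xCC, which
     matches the PDP-endian row of the diagram.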

   Network Byte Order
     The term 'network byte order' refers to big-endian ordering, and
     originates from the IEEE.  Early disagreements over which byte ordering
     to use for network traffic prompted RFC 1700 to define that all
     IETF-specified network protocols use big-endian ordering unless
     explicitly noted otherwise.  The Internet protocol family (IP, and thus
     TCP, UDP, etc.) adheres to this convention in particular.

   Determining the System's Byte Order
     The operating system supports both big-endian and little-endian CPUs.  To
     make it easier for programs to determine the endianness of the platform
     they are being compiled for, functions and macro constants are provided
     in the system header files.

     The endianness of the system can be obtained by including the header
     <sys/types.h> and using the preprocessor macros _LITTLE_ENDIAN and
     _BIG_ENDIAN.  See types.h(3HEAD) for more information.

     Additionally, the header <endian.h> defines an alternative means for
     determining the endianness of the current system.  See endian.h(3HEAD)
     for more information.

     illumos runs on both big- and little-endian systems.  When writing
     software for which the endianness is important, one must always check the
     byte order and convert data appropriately.
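
     A compile-time check with the <sys/types.h> macros might look like the
     sketch below.  It assumes that exactly one of _LITTLE_ENDIAN and
     _BIG_ENDIAN is defined on this system; other systems may define neither
     or both, so the example falls back to a run-time probe in that case.
     build_byte_order() is a hypothetical name.

```c
#include <sys/types.h>
#include <stdint.h>
#include <string.h>

/* Return a string naming the byte order the program was built for. */
const char *
build_byte_order(void)
{
#if defined(_LITTLE_ENDIAN) && !defined(_BIG_ENDIAN)
	return ("little-endian");
#elif defined(_BIG_ENDIAN) && !defined(_LITTLE_ENDIAN)
	return ("big-endian");
#else
	/*
	 * Assumption: the macros are not usable here (neither or both are
	 * defined), so inspect the lowest-addressed byte at run time.
	 */
	const uint16_t probe = 1;
	unsigned char first;

	memcpy(&first, &probe, 1);
	return (first == 1 ? "little-endian" : "big-endian");
#endif
}
```

     The same #if structure can guard code paths that only need to exist for
     one byte order, such as a byte-swapping routine.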

   Converting Between Byte Orders
     The system provides two different sets of functions to convert values
     between big-endian and little-endian.  They are defined in byteorder(3C)
     and endian(3C).

     The byteorder(3C) family of functions converts data between the host's
     native byte order and network byte order, which is big-endian.  The
     functions operate on 16-bit, 32-bit, or 64-bit values.  Functions that
     convert from network byte order to the host's byte order start with the
     string ntoh, while functions that convert from the host's byte order to
     network byte order begin with hton.  For example, to convert a 32-bit
     value (historically a long, hence the l suffix) from network byte order
     to the host's, one would use the function ntohl(3C).

     These functions have been standardized by POSIX.  However, the 64-bit
     variants, ntohll(3C) and htonll(3C), are not standardized and may not be
     found on other systems.  For more information on these functions, see
     byteorder(3C).
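
     As an illustration of the hton and ntoh functions, the sketch below
     writes a 32-bit value into a buffer in network byte order and reads it
     back.  The helper names put_net32() and get_net32() are hypothetical,
     and the POSIX header <arpa/inet.h> is assumed to declare htonl() and
     ntohl(); see byteorder(3C) for the headers documented on this system.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Store value at out[0..3] in network (big-endian) byte order. */
void
put_net32(uint32_t value, unsigned char out[4])
{
	uint32_t wire = htonl(value);

	memcpy(out, &wire, sizeof (wire));
}

/* Load a 32-bit network byte order value from in[0..3]. */
uint32_t
get_net32(const unsigned char in[4])
{
	uint32_t wire;

	memcpy(&wire, in, sizeof (wire));
	return (ntohl(wire));
}
```

     After put_net32(0xAABBCCDD, buf), the buffer holds 0xAA, 0xBB, 0xCC,
     0xDD on any host, and get_net32(buf) returns the original value.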

     The second family of functions, endian(3C), provides a means to convert
     between the host's byte order and big-endian or little-endian
     specifically.  While these functions are similar to those in
     byteorder(3C), they more explicitly cover different data conversions.
     Like them, these functions operate on 16-bit, 32-bit, or 64-bit values.
     When converting from big-endian to the host's endianness, the functions
     begin with betoh.  When converting from the host's native endianness to
     big-endian, the functions begin with htobe.  When working with
     little-endian data, the prefixes letoh and htole convert little-endian
     data to the host's endianness and from the host's to little-endian,
     respectively.

     These functions are not standardized and the header they appear in varies
     between the BSDs and GNU/Linux.  Applications that wish to be portable
     should instead use the byteorder(3C) functions.
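
     A brief sketch of the endian(3C) style of conversion follows.  It
     assumes that <endian.h> declares htole32() and be16toh(); the be16toh
     spelling is used here because it appears to be the more widely available
     form, and where the betoh16 spelling described above is provided it
     should behave the same way.  The helper names put_le32() and get_be16()
     are hypothetical.

```c
/* Assumption: some C libraries only expose these functions with this set. */
#ifndef _DEFAULT_SOURCE
#define	_DEFAULT_SOURCE
#endif
#include <endian.h>
#include <stdint.h>
#include <string.h>

/* Store value at out[0..3] in little-endian byte order. */
void
put_le32(uint32_t value, unsigned char out[4])
{
	uint32_t stored = htole32(value);

	memcpy(out, &stored, sizeof (stored));
}

/* Load a 16-bit big-endian value from in[0..1]. */
uint16_t
get_be16(const unsigned char in[2])
{
	uint16_t raw;

	memcpy(&raw, in, sizeof (raw));
	return (be16toh(raw));
}
```

     Unlike the byteorder(3C) functions, this family can target little-endian
     data directly, which is useful for file formats that are defined as
     little-endian.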

     All of the functions in both families simply return their input when
     the host's native byte order is the same as the desired order.  For
     example, when calling htonl(3C) on a big-endian system, the original data
     is returned with no conversion or modification.
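
     The behaviour described above can be sketched as follows.  This is not
     the system's implementation of htonl(3C); sketch_htonl() is a
     hypothetical stand-in that shows the two cases, returning the input
     unchanged on a big-endian host and reversing the bytes on a
     little-endian one.

```c
#include <stdint.h>
#include <string.h>

/* Illustrative stand-in for a host-to-network 32-bit conversion. */
uint32_t
sketch_htonl(uint32_t host)
{
	const uint16_t probe = 1;
	unsigned char first;

	memcpy(&first, &probe, 1);
	if (first == 0) {
		/* Big-endian host: the value is already in network order. */
		return (host);
	}

	/* Little-endian host: reverse the four bytes. */
	return ((host >> 24) | ((host >> 8) & 0x0000FF00U) |
	    ((host << 8) & 0x00FF0000U) | (host << 24));
}
```

     On either kind of host, the bytes of sketch_htonl(0x01020304) appear in
     memory as 0x01, 0x02, 0x03, 0x04.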

SEE ALSO
     byteorder(3C), endian(3C), endian.h(3HEAD), inet(3HEAD)

illumos                         August 2, 2018                         illumos