BYTEORDER(5)         Standards, Environments, and Macros         BYTEORDER(5)

NAME
     byteorder, endian - byte order and endianness

DESCRIPTION
     Integer values which occupy more than 1 byte in memory can be laid out in
     different ways on different platforms.  In particular, there is a major
     split between those which place the least significant byte of an integer
     at the lowest address, and those which place the most significant byte
     there instead.  As this difference relates to which end of the integer is
     found in memory first, the term endian is used to refer to a particular
     byte order.

     A platform is referred to as using a big-endian byte order when it places
     the most significant byte at the lowest address, and little-endian when
     it places the least significant byte first.  Some platforms may also
     switch between big- and little-endian mode and run code compiled for
     either.

     Historically, there have also been some systems that utilized
     middle-endian byte orders for integers larger than 2 bytes.  Such
     orderings are not in common use today.

     Endianness is also of particular importance when dealing with values that
     are being read into memory from an external source.  For example, network
     protocols such as IP conventionally define the fields in a packet as
     being always stored in big-endian byte order.  This means that a
     little-endian machine will have to perform transformations on these
     fields in order to process them.

   Examples
     To illustrate endianness in memory, let us consider the decimal integer
     2864434397.  This number fits in 32 bits of storage (4 bytes).

     On a big-endian system, this integer would be written into memory as the
     bytes 0xAA, 0xBB, 0xCC, 0xDD, in order from lowest memory address to
     highest.

     On a little-endian system, it would be written instead as the bytes 0xDD,
     0xCC, 0xBB, 0xAA, in that order.
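
     The layout described above can be observed directly by copying an
     integer's in-memory representation into a byte array.  The following C
     fragment is a minimal sketch rather than part of any system interface;
     the names store_bytes and host_is_little_endian are hypothetical and
     chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy the in-memory representation of a 32-bit integer into out[0..3],
 * lowest address first.  store_bytes is a hypothetical helper name.
 */
static void
store_bytes(uint32_t value, unsigned char out[4])
{
    assert(out != NULL);
    memcpy(out, &value, sizeof (value));
}

/*
 * Return 1 when the host places the least significant byte of
 * 2864434397 (0xAABBCCDD) at the lowest address, and 0 otherwise.
 * This only distinguishes little-endian from everything else.
 */
static int
host_is_little_endian(void)
{
    unsigned char b[4];

    store_bytes(UINT32_C(2864434397), b);
    return (b[0] == 0xDD);
}
```

     On a little-endian host, store_bytes fills the array with 0xDD, 0xCC,
     0xBB, 0xAA; on a big-endian host it fills it with 0xAA, 0xBB, 0xCC, 0xDD.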

     If both the big- and little-endian systems were asked to store this
     integer at address 0x100, we would see the following in each system's
     memory:


                    Big-Endian

          +------+------+------+------+
          | 0xAA | 0xBB | 0xCC | 0xDD |
          +------+------+------+------+
             ^^     ^^     ^^     ^^
           0x100  0x101  0x102  0x103
             vv     vv     vv     vv
          +------+------+------+------+
          | 0xDD | 0xCC | 0xBB | 0xAA |
          +------+------+------+------+

                   Little-Endian

     It is particularly important to note that even though the byte order is
     different between these two machines, the bit ordering within each byte,
     by convention, is still the same.

     For example, take the decimal integer 4660, which occupies 16 bits (2
     bytes).

     On a big-endian system, this would be written into memory as 0x12, then
     0x34.

     On a little-endian system, it would be written as 0x34, then 0x12.  Note
     that this is not at all the same as seeing 0x43 then 0x21 in memory --
     only the bytes are re-ordered, not any bits (or nybbles) within them.

     As before, storing this at address 0x100:

             Big-Endian

          +------+------+
          | 0x12 | 0x34 |
          +------+------+
             ^^     ^^
           0x100  0x101
             vv     vv
          +------+------+
          | 0x34 | 0x12 |
          +------+------+

            Little-Endian

     This example shows how an eight-byte number, 0xBADCAFFEDEADBEEF, is
     stored in both big- and little-endian:

                                  Big-Endian

          +------+------+------+------+------+------+------+------+
          | 0xBA | 0xDC | 0xAF | 0xFE | 0xDE | 0xAD | 0xBE | 0xEF |
          +------+------+------+------+------+------+------+------+
             ^^     ^^     ^^     ^^     ^^     ^^     ^^     ^^
           0x100  0x101  0x102  0x103  0x104  0x105  0x106  0x107
             vv     vv     vv     vv     vv     vv     vv     vv
          +------+------+------+------+------+------+------+------+
          | 0xEF | 0xBE | 0xAD | 0xDE | 0xFE | 0xAF | 0xDC | 0xBA |
          +------+------+------+------+------+------+------+------+

                                 Little-Endian


     The treatment of different endian values would not be complete without
     discussing PDP-endian, which is also known as middle-endian.  While the
     PDP-11 was a 16-bit little-endian system, it laid out 32-bit values in a
     different way from current little-endian systems.  First, it would divide
     a 32-bit number into two 16-bit numbers.  Each 16-bit number would be
     stored in little-endian; however, the two 16-bit words would be stored
     with the more significant 16-bit word appearing first in memory, followed
     by the less significant one.

     The following diagram illustrates PDP-endian and compares it against
     little-endian values.  Here, we'll start with the value 0xAABBCCDD and
     show how the four bytes for it will be laid out, starting at 0x100.

                    PDP-Endian

          +------+------+------+------+
          | 0xBB | 0xAA | 0xDD | 0xCC |
          +------+------+------+------+
             ^^     ^^     ^^     ^^
           0x100  0x101  0x102  0x103
             vv     vv     vv     vv
          +------+------+------+------+
          | 0xDD | 0xCC | 0xBB | 0xAA |
          +------+------+------+------+

                   Little-Endian


   Network Byte Order
     The term 'network byte order' refers to big-endian ordering, and
     originates from the IEEE.  Early disagreements over which byte ordering
     to use for network traffic prompted RFC 1700 to define that all
     IETF-specified network protocols use big-endian ordering unless noted
     explicitly otherwise.  The Internet protocol family (IP, and thus TCP and
     UDP, etc.) in particular adheres to this convention.

   Determining the System's Byte Order
     The operating system supports both big-endian and little-endian CPUs.  To
     make it easier for programs to determine the endianness of the platform
     they are being compiled for, functions and macro constants are provided
     in the system header files.

     The endianness of the system can be obtained by including the header
     <sys/types.h> and using the pre-processor macros _LITTLE_ENDIAN and
     _BIG_ENDIAN.  See types.h(3HEAD) for more information.

     Additionally, the header <endian.h> defines an alternative means for
     determining the endianness of the current system.  See endian.h(3HEAD)
     for more information.

     illumos runs on both big- and little-endian systems.  When writing
     software for which the endianness is important, one must always check the
     byte order and convert values appropriately.

   Converting Between Byte Orders
     The system provides two different sets of functions to convert values
     between big-endian and little-endian.  They are defined in byteorder(3C)
     and endian(3C).
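
     As a minimal sketch of the first of these families, the following C
     fragment uses htonl and ntohl to move a 32-bit value to and from a
     buffer of big-endian bytes.  The helper names put_be32 and get_be32 are
     hypothetical, and the <arpa/inet.h> header used here is the one named by
     POSIX; byteorder(3C) lists the headers to include on this system.

```c
#include <arpa/inet.h>  /* htonl() and ntohl(), as specified by POSIX */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Write a host-order 32-bit value into buf as four bytes in network
 * (big-endian) order.  put_be32 is a hypothetical helper name.
 */
static void
put_be32(unsigned char buf[4], uint32_t host_value)
{
    uint32_t wire = htonl(host_value);

    assert(buf != NULL);
    memcpy(buf, &wire, sizeof (wire));
}

/*
 * Read four bytes in network (big-endian) order from buf and return
 * the value in host order.  get_be32 is a hypothetical helper name.
 */
static uint32_t
get_be32(const unsigned char buf[4])
{
    uint32_t wire;

    assert(buf != NULL);
    memcpy(&wire, buf, sizeof (wire));
    return (ntohl(wire));
}
```

     Because the conversion is expressed in terms of the host's byte order,
     the same source produces the bytes 0xAA, 0xBB, 0xCC, 0xDD for the value
     0xAABBCCDD on both big- and little-endian hosts.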

     The byteorder(3C) family of functions converts data between the host's
     native byte order and network byte order (big-endian).  The functions
     operate on either 16-bit, 32-bit, or 64-bit values.  Functions that
     convert from network byte order to the host's byte order start with the
     string ntoh, while functions which convert from the host's byte order to
     network byte order begin with hton.  For example, to convert a 32-bit
     value (historically a long, hence the l suffix) from network byte order
     to the host's, one would use the function ntohl(3C).

     These functions have been standardized by POSIX.  However, the 64-bit
     variants, ntohll(3C) and htonll(3C), are not standardized and may not be
     found on other systems.  For more information on these functions, see
     byteorder(3C).

     The second family of functions, endian(3C), provides a means to convert
     between the host's byte order and big-endian and little-endian
     specifically.  While these functions are similar to those in
     byteorder(3C), they more explicitly cover different data conversions.
     Like them, these functions operate on either 16-bit, 32-bit, or 64-bit
     values.  When converting from big-endian to the host's endianness, the
     functions begin with betoh.  If instead one is converting data from the
     host's native endianness to big-endian, the functions begin with htobe.
     When working with little-endian data, the prefixes letoh and htole
     convert little-endian data to the host's endianness and from the host's
     to little-endian respectively.

     These functions are not standardized and the header they appear in varies
     between the BSDs and GNU/Linux.  Applications that wish to be portable
     should instead use the byteorder(3C) functions.

     All of these functions in both families simply return their input when
     the host's native byte order is the same as the desired order.
     For example, when calling htonl(3C) on a big-endian system, the original
     data is returned with no conversion or modification.

SEE ALSO
     byteorder(3C), endian(3C), endian.h(3HEAD), inet(3HEAD)

illumos                         August 2, 2018                         illumos