MHD(7I)                         Ioctl Requests                         MHD(7I)

NAME
     mhd - multihost disk control operations

SYNOPSIS
     #include <sys/mhd.h>

DESCRIPTION
     The mhd ioctl(2) requests control the access rights of a multihost disk,
     using disk reservations on the disk device.

     The stability level of this interface (see attributes(5)) is evolving.
     As a result, the interface is subject to change and you should limit your
     use of it.

     The mhd ioctls fall into two major categories: (1) ioctls for non-shared
     multihost disks and (2) ioctls for shared multihost disks.

     One ioctl, MHIOCENFAILFAST, is applicable to both non-shared and shared
     multihost disks.  It is described after the first two categories.

     All the ioctls require root privilege.

     For all of the ioctls, the caller should obtain the file descriptor for
     the device by calling open(2) with the O_NDELAY flag; without the
     O_NDELAY flag, the open may fail because another host already holds a
     conflicting reservation on the device.  Some of the ioctls below permit
     the caller to forcibly clear a conflicting reservation held by another
     host; however, to call such an ioctl, the caller must first obtain the
     open file descriptor.

   Non-shared multihost disks
     The non-shared multihost disk ioctls are MHIOCTKOWN, MHIOCRELEASE,
     MHIOCSTATUS, and MHIOCQRESERVE.  These ioctl requests control the access
     rights of non-shared multihost disks.  A non-shared multihost disk is one
     that supports serialized, mutually exclusive I/O mastery by the connected
     hosts.  This is in contrast to the shared-disk model, in which concurrent
     access is allowed from more than one host (see below).

     A non-shared multihost disk can be in one of two states:

     o   Exclusive access state, where only one connected host has I/O
         access

     o   Non-exclusive access state, where all connected hosts have I/O
         access.
         An external hardware reset can cause the disk to enter the
         non-exclusive access state.

     Each multihost disk driver views the machine on which it's running as the
     "local host"; each views all other machines as "remote hosts".  For each
     I/O or ioctl request, the requesting host is the local host.

     Note that the non-shared ioctls are designed to work with SCSI-2 disks.
     The SCSI-2 RESERVE/RELEASE command set is the underlying hardware
     facility in the device that supports the non-shared ioctls.

     The function prototypes for the non-shared ioctls are:

           ioctl(fd, MHIOCTKOWN);
           ioctl(fd, MHIOCRELEASE);
           ioctl(fd, MHIOCSTATUS);
           ioctl(fd, MHIOCQRESERVE);

     MHIOCTKOWN     Forcefully acquires exclusive access rights to the
                    multihost disk for the local host.  Revokes all access
                    rights to the multihost disk from remote hosts.  Causes
                    the disk to enter the exclusive access state.

                    Implementation Note: Reservations (exclusive access
                    rights) broken via random resets should be reinstated by
                    the driver upon their detection, for example, in the
                    automatic probe function described below.

     MHIOCRELEASE   Relinquishes exclusive access rights to the multihost disk
                    for the local host.  On success, causes the disk to enter
                    the non-exclusive access state.

     MHIOCSTATUS    Probes a multihost disk to determine whether the local
                    host has access rights to the disk.  Returns 0 if the
                    local host has access to the disk, 1 if it doesn't, and -1
                    with errno set to EIO if the probe failed for some other
                    reason.

     MHIOCQRESERVE  Issues a SCSI-2 Reserve command and nothing else.  If the
                    attempt to reserve fails due to the SCSI error Reservation
                    Conflict (which implies that some other host has the
                    device reserved), the ioctl returns -1 with errno set to
                    EACCES.  The MHIOCQRESERVE ioctl does NOT issue a bus
                    device reset or bus reset prior to attempting the SCSI-2
                    Reserve command.
                    It also does not take care of reinstating reservations
                    that disappear due to bus resets or bus device resets; if
                    that behavior is desired, the caller can call MHIOCTKOWN
                    after MHIOCQRESERVE has returned success.  If the device
                    does not support the SCSI-2 Reserve command, the ioctl
                    returns -1 with errno set to ENOTSUP.  The MHIOCQRESERVE
                    ioctl is intended to be used by high-availability or
                    clustering software for a "quorum" disk; hence the "Q" in
                    the name of the ioctl.

   Shared Multihost Disks
     The shared multihost disk ioctls control access to shared multihost
     disks.  The ioctls are merely a veneer on the SCSI-3 Persistent
     Reservation facility.  Therefore, the underlying semantic model is not
     described in detail here; see the SCSI-3 standard instead.  SCSI-3
     Persistent Reservations support the concept of a group of hosts all
     sharing access to a disk.

     The function prototypes and descriptions for the shared multihost ioctls
     are as follows:

     ioctl(fd, MHIOCGRP_INKEYS, (mhioc_inkeys_t *)k)

         Issues the SCSI-3 command Persistent Reserve In Read Keys to the
         device.  On input, the caller should initialize the field k->li so
         that k->li.listsize gives the number of elements in the array the
         caller has allocated for k->li.list, and k->li.listlen is 0.  On
         return, k->li.listlen is updated to the number of reservation keys
         the device currently has; a value larger than k->li.listsize
         indicates that the caller should have passed a bigger k->li.list
         array with a bigger k->li.listsize.  The number of array elements
         actually written by the callee into k->li.list is the minimum of
         k->li.listlen and k->li.listsize.  The field k->generation is
         updated with the generation information returned by the SCSI-3 Read
         Keys query.
         If the device does not support SCSI-3 Persistent Reservations, this
         ioctl returns -1 with errno set to ENOTSUP.

     ioctl(fd, MHIOCGRP_INRESV, (mhioc_inresvs_t *)r)

         Issues the SCSI-3 command Persistent Reserve In Read Reservations to
         the device.  Remarks similar to those for MHIOCGRP_INKEYS apply to
         the array manipulation.  If the device does not support SCSI-3
         Persistent Reservations, this ioctl returns -1 with errno set to
         ENOTSUP.

     ioctl(fd, MHIOCGRP_REGISTER, (mhioc_register_t *)r)

         Issues the SCSI-3 command Persistent Reserve Out Register.  The
         fields of structure r are all inputs; none of the fields are modified
         by the ioctl.  The field r->aptpl should be set to true to specify
         that registrations and reservations should persist across device
         power failures, or to false to specify that registrations and
         reservations should be cleared upon device power failure; true is
         the recommended setting.  The field r->oldkey is the key that the
         caller believes the device may already have for this host initiator;
         if the caller believes that this host initiator is not already
         registered with this device, it should pass the special key of all
         zeros.  To unregister with the device, the caller should pass its
         current key in the r->oldkey field and the special key of all zeros
         in the r->newkey field.  If the device returns the SCSI error code
         Reservation Conflict, this ioctl returns -1 with errno set to EACCES.

     ioctl(fd, MHIOCGRP_RESERVE, (mhioc_resv_desc_t *)r)

         Issues the SCSI-3 command Persistent Reserve Out Reserve.  The fields
         of structure r are all inputs; none of the fields are modified by the
         ioctl.  If the device returns the SCSI error code Reservation
         Conflict, this ioctl returns -1 with errno set to EACCES.

     ioctl(fd, MHIOCGRP_PREEMPTANDABORT, (mhioc_preemptandabort_t *)r)

         Issues the SCSI-3 command Persistent Reserve Out Preempt-And-Abort.
         The fields of structure r are all inputs; none of the fields are
         modified by the ioctl.  The key of the victim host is specified by
         the field r->victim_key.  The field r->resvdesc supplies the
         preempter's key and the reservation that it is requesting as part of
         the SCSI-3 Preempt-And-Abort command.  If the device returns the SCSI
         error code Reservation Conflict, this ioctl returns -1 with errno set
         to EACCES.

     ioctl(fd, MHIOCGRP_PREEMPT, (mhioc_preemptandabort_t *)r)

         Similar to MHIOCGRP_PREEMPTANDABORT, but instead issues the SCSI-3
         command Persistent Reserve Out Preempt.  (Note: This command is not
         implemented).

     ioctl(fd, MHIOCGRP_CLEAR, (mhioc_resv_key_t *)r)

         Issues the SCSI-3 command Persistent Reserve Out Clear.  The input
         parameter r is the reservation key of the caller, which should
         already have been registered with the device by an earlier call to
         MHIOCGRP_REGISTER.

     For each device, the non-shared ioctls should not be mixed with the
     Persistent Reserve Out shared ioctls, or vice versa; otherwise, the
     underlying device is likely to return errors, because SCSI does not
     permit SCSI-2 reservations to be mixed with SCSI-3 reservations on a
     single device.  It is, however, legitimate to call the Persistent Reserve
     In ioctls, because these are query only.  Issuing the MHIOCGRP_INKEYS
     ioctl is the recommended way for a caller to determine whether the device
     supports SCSI-3 Persistent Reservations (the ioctl returns -1 with errno
     set to ENOTSUP if the device does not).

   MHIOCENFAILFAST Ioctl
     The MHIOCENFAILFAST ioctl is applicable to both non-shared and shared
     disks, and may be used with either the non-shared or shared ioctls.

     ioctl(fd, MHIOCENFAILFAST, (unsigned int *)millisecs)

         Enables or disables the failfast option in the multihost disk driver
         and enables or disables automatic probing of a multihost disk,
         described below.  The argument is an unsigned integer specifying the
         number of milliseconds to wait between executions of the automatic
         probe function.  An argument of zero disables the failfast option and
         disables automatic probing.  If the MHIOCENFAILFAST ioctl is never
         called, both the failfast option and automatic probing are disabled.

   Automatic Probing
     The MHIOCENFAILFAST ioctl sets up a timeout in the driver to periodically
     schedule automatic probes of the disk.  The automatic probe function
     works in this manner: The driver is scheduled to probe the multihost disk
     every n milliseconds (the value supplied to MHIOCENFAILFAST), rounded up
     to the next integral multiple of the system clock's resolution.  If

     1.   the local host no longer has access rights to the multihost
          disk, and

     2.   access rights were expected to be held by the local host,

     the driver immediately panics the machine to comply with the failfast
     model.

     If the driver makes this discovery outside the timeout function,
     especially during a read or write operation, it is imperative that it
     panic the system then as well.

RETURN VALUES
     Each request returns -1 on failure and sets errno to indicate the error.

     EPERM              Caller is not root.

     EACCES             Access rights were denied.

     EIO                The multihost disk or controller was unable to
                        successfully complete the requested operation.

     ENOTSUP            The multihost disk does not support the operation.
                        For example, it does not support the SCSI-2
                        Reserve/Release command set, or the SCSI-3 Persistent
                        Reservation command set.
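
EXAMPLES
     The following is a minimal sketch, not part of the mhd interface, of the
     arithmetic described above for MHIOCGRP_INKEYS and for automatic
     probing.  The helper names (mhd_keys_written, mhd_list_truncated,
     mhd_round_up_to_tick) and the tick_ms parameter are hypothetical and are
     not defined by <sys/mhd.h>.  The rounding helper assumes that an interval
     that is already an exact multiple of the clock resolution is left
     unchanged; the driver's actual rounding may differ.

```c
#include <limits.h>

/*
 * Hypothetical helper: number of entries in the key array that hold
 * valid data after MHIOCGRP_INKEYS returns, that is, the minimum of
 * listlen (keys the device has) and listsize (entries allocated).
 */
unsigned int
mhd_keys_written(unsigned int listlen, unsigned int listsize)
{
	return (listlen < listsize ? listlen : listsize);
}

/*
 * Hypothetical helper: nonzero when the device holds more keys than
 * the caller's array could receive, so the caller may want to repeat
 * the request with a larger array.
 */
int
mhd_list_truncated(unsigned int listlen, unsigned int listsize)
{
	return (listlen > listsize);
}

/*
 * Hypothetical helper: round a probe interval up to a multiple of the
 * clock resolution tick_ms (both in milliseconds).  ASSUMPTION: an
 * exact multiple, including 0, is returned unchanged.  A tick_ms of 0
 * is treated as "no rounding".  The result saturates at UINT_MAX
 * instead of wrapping around.
 */
unsigned int
mhd_round_up_to_tick(unsigned int millisecs, unsigned int tick_ms)
{
	unsigned int rem, pad;

	if (tick_ms == 0)
		return (millisecs);
	rem = millisecs % tick_ms;
	if (rem == 0)
		return (millisecs);
	pad = tick_ms - rem;
	if (millisecs > UINT_MAX - pad)
		return (UINT_MAX);
	return (millisecs + pad);
}
```

     A caller might apply mhd_list_truncated() to the listlen and listsize
     values after MHIOCGRP_INKEYS returns, and read only the first
     mhd_keys_written() entries of the key array.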

STABILITY
     Uncommitted

SEE ALSO
     ioctl(2), open(2), attributes(5)

illumos                        October 23, 2017                        illumos