'\" te
.\" Copyright (c) 2005 Sun Microsystems, Inc. All Rights Reserved.
.\" The contents of this file are subject to the terms of the Common Development and Distribution License (the "License"). You may not use this file except in compliance with the License.
.\" You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE or http://www.opensolaris.org/os/licensing. See the License for the specific language governing permissions and limitations under the License.
.\" When distributing Covered Code, include this CDDL HEADER in each file and include the License file at usr/src/OPENSOLARIS.LICENSE. If applicable, add the following below this CDDL HEADER, with the fields enclosed by brackets "[]" replaced with your own identifying information: Portions Copyright [yyyy] [name of copyright owner]
.TH MHD 7I "Mar 18, 2011"
.SH NAME
mhd \- multihost disk control operations
.SH SYNOPSIS
.LP
.nf
\fB#include\fR \fB<sys/mhd.h>\fR
.fi

.SH DESCRIPTION
.sp
.LP
The \fBmhd\fR \fBioctl\fR(2) requests control access rights to a multihost
disk, using disk reservations on the disk device.
.sp
.LP
The stability level of this interface (see \fBattributes\fR(5)) is evolving. As
a result, the interface is subject to change and you should limit your use of
it.
.sp
.LP
The mhd ioctls fall into two major categories: (1) ioctls for non-shared
multihost disks and (2) ioctls for shared multihost disks.
.sp
.LP
One ioctl, \fBMHIOCENFAILFAST\fR, is applicable to both non-shared and shared
multihost disks. It is described after the first two categories.
.sp
.LP
All the ioctls require root privilege.
.sp
.LP
For all of the ioctls, the caller should obtain the file descriptor for the
device by calling \fBopen\fR(2) with the \fBO_NDELAY\fR flag; without the
\fBO_NDELAY\fR flag, the open may fail due to another host already having a
conflicting reservation on the device.
Some of the ioctls below permit the
caller to forcibly clear a conflicting reservation held by another host;
however, in order to call such an ioctl, the caller must first obtain the open
file descriptor.
.SS "Non-shared Multihost Disks"
.sp
.LP
Non-shared multihost disk ioctls consist of \fBMHIOCTKOWN\fR,
\fBMHIOCRELEASE\fR, \fBMHIOCSTATUS\fR, and \fBMHIOCQRESERVE\fR. These ioctl
requests control the access rights of non-shared multihost disks. A non-shared
multihost disk is one that supports serialized, mutually exclusive I/O mastery
by the connected hosts. This is in contrast to the shared-disk model, in which
concurrent access is allowed from more than one host (see below).
.sp
.LP
A non-shared multihost disk can be in one of two states:
.RS +4
.TP
.ie t \(bu
.el o
Exclusive access state, where only one connected host has I/O access.
.RE
.RS +4
.TP
.ie t \(bu
.el o
Non-exclusive access state, where all connected hosts have I/O access. An
external hardware reset can cause the disk to enter the non-exclusive access
state.
.RE
.sp
.LP
Each multihost disk driver views the machine on which it is running as the
"local host"; each views all other machines as "remote hosts". For each I/O or
ioctl request, the requesting host is the local host.
.sp
.LP
Note that the non-shared ioctls are designed to work with SCSI-2 disks. The
SCSI-2 RESERVE/RELEASE command set is the underlying hardware facility in the
device that supports the non-shared ioctls.
.sp
.LP
The function prototypes for the non-shared ioctls are:
.sp
.in +2
.nf
ioctl(fd, MHIOCTKOWN);
ioctl(fd, MHIOCRELEASE);
ioctl(fd, MHIOCSTATUS);
ioctl(fd, MHIOCQRESERVE);
.fi
.in -2

.sp
.ne 2
.na
\fB\fBMHIOCTKOWN\fR \fR
.ad
.RS 18n
Forcefully acquires exclusive access rights to the multihost disk for the local
host.
Revokes all access rights to the multihost disk from remote hosts.
Causes the disk to enter the exclusive access state.
.sp
Implementation Note: Reservations (exclusive access rights) broken via random
resets should be reinstated by the driver upon their detection, for example, in
the automatic probe function described below.
.RE

.sp
.ne 2
.na
\fB\fBMHIOCRELEASE\fR \fR
.ad
.RS 18n
Relinquishes exclusive access rights to the multihost disk for the local host.
On success, causes the disk to enter the non-exclusive access state.
.RE

.sp
.ne 2
.na
\fB\fBMHIOCSTATUS\fR \fR
.ad
.RS 18n
Probes a multihost disk to determine whether the local host has access rights
to the disk. Returns \fB0\fR if the local host has access to the disk,
\fB1\fR if it does not, and \fB-1\fR with \fBerrno\fR set to \fBEIO\fR if the
probe failed for some other reason.
.RE

.sp
.ne 2
.na
\fB\fBMHIOCQRESERVE\fR \fR
.ad
.RS 18n
Issues, simply and only, a SCSI-2 Reserve command. If the attempt to reserve
fails due to the SCSI error Reservation Conflict (which implies that some other
host has the device reserved), then the ioctl returns \fB-1\fR with
\fBerrno\fR set to \fBEACCES\fR. The \fBMHIOCQRESERVE\fR ioctl does NOT issue a
bus device reset or bus reset prior to attempting the SCSI-2 Reserve command.
It also does not take care of reinstating reservations that disappear due to
bus resets or bus device resets; if that behavior is desired, then the caller
can call \fBMHIOCTKOWN\fR after the \fBMHIOCQRESERVE\fR has returned success.
If the device does not support the SCSI-2 Reserve command, then the ioctl
returns \fB-1\fR with \fBerrno\fR set to \fBENOTSUP\fR. The
\fBMHIOCQRESERVE\fR ioctl is intended to be used by high-availability or
clustering software for a "quorum" disk; hence the "Q" in the name of the
ioctl.
.RE

.SS "Shared Multihost Disks"
.sp
.LP
Shared multihost disk ioctls control access to shared multihost disks. The
ioctls are merely a veneer on the SCSI-3 Persistent Reservation facility.
Therefore, the underlying semantic model is not described in detail here; see
the SCSI-3 standard instead. The SCSI-3 Persistent Reservations support the
concept of a group of hosts all sharing access to a disk.
.sp
.LP
The function prototypes and descriptions for the shared multihost ioctls are as
follows:
.sp
.ne 2
.na
\fB\fBioctl\fR(\fBfd\fR, \fBMHIOCGRP_INKEYS\fR, (\fBmhioc_inkeys_t\fR \fI*\fR)
\fIk\fR\fB);\fR\fR
.ad
.sp .6
.RS 4n
Issues the SCSI-3 command Persistent Reserve In Read Keys to the device. On
input, the field \fBk->li\fR should be initialized by the caller with
\fBk->li.listsize\fR reflecting the size of the array the caller has allocated
for the \fBk->li.list\fR field and with \fBk->li.listlen\fR \fB==\fR \fB0\fR.
On return, the field \fBk->li.listlen\fR is updated to indicate the number of
reservation keys the device currently has. If this value is larger than
\fBk->li.listsize\fR, the caller should have passed a bigger
\fBk->li.list\fR array with a bigger \fBk->li.listsize\fR. The number of
array elements actually written by the callee into \fBk->li.list\fR is the
minimum of \fBk->li.listlen\fR and \fBk->li.listsize\fR. The field
\fBk->generation\fR is updated with the generation information returned by the
SCSI-3 Read Keys query. If the device does not support SCSI-3 Persistent
Reservations, then this ioctl returns \fB-1\fR with \fBerrno\fR set to
\fBENOTSUP\fR.
.RE

.sp
.ne 2
.na
\fB\fBioctl\fR(\fBfd\fR, \fBMHIOCGRP_INRESV\fR, (\fBmhioc_inresvs_t\fR \fI*\fR)
\fIr\fR\fB);\fR\fR
.ad
.sp .6
.RS 4n
Issues the SCSI-3 command Persistent Reserve In Read Reservations to the
device. Remarks similar to \fBMHIOCGRP_INKEYS\fR apply to the array
manipulation. If the device does not support SCSI-3 Persistent Reservations,
then this ioctl returns \fB-1\fR with \fBerrno\fR set to \fBENOTSUP\fR.
.RE

.sp
.ne 2
.na
\fB\fBioctl\fR(\fBfd\fR, \fBMHIOCGRP_REGISTER\fR, (\fBmhioc_register_t\fR \fI*\fR)
\fIr\fR\fB);\fR\fR
.ad
.sp .6
.RS 4n
Issues the SCSI-3 command Persistent Reserve Out Register. The fields of
structure \fIr\fR are all inputs; none of the fields are modified by the ioctl.
The field \fBr->aptpl\fR should be set to true to specify that registrations
and reservations should persist across device power failures, or to false to
specify that registrations and reservations should be cleared upon device power
failure; true is the recommended setting. The field \fBr->oldkey\fR is the key
that the caller believes the device may already have for this host initiator;
if the caller believes that this host initiator is not already registered
with this device, it should pass the special key of all zeros. To achieve the
effect of unregistering with the device, the caller should pass its current key
for the \fBr->oldkey\fR field and an \fBr->newkey\fR field containing the
special key of all zeros. If the device returns the SCSI error code
Reservation Conflict, this ioctl returns \fB-1\fR with \fBerrno\fR set to
\fBEACCES\fR.
.RE

.sp
.ne 2
.na
\fB\fBioctl\fR(\fBfd\fR, \fBMHIOCGRP_RESERVE\fR, (\fBmhioc_resv_desc_t\fR \fI*\fR)
\fIr\fR\fB);\fR\fR
.ad
.sp .6
.RS 4n
Issues the SCSI-3 command Persistent Reserve Out Reserve.
The fields of
structure \fIr\fR are all inputs; none of the fields are modified by the ioctl.
If the device returns the SCSI error code Reservation Conflict, this ioctl
returns \fB-1\fR with \fBerrno\fR set to \fBEACCES\fR.
.RE

.sp
.ne 2
.na
\fB\fBioctl\fR(\fBfd\fR, \fBMHIOCGRP_PREEMPTANDABORT\fR,
(\fBmhioc_preemptandabort_t\fR \fI*\fR) \fIr\fR\fB);\fR\fR
.ad
.sp .6
.RS 4n
Issues the SCSI-3 command Persistent Reserve Out Preempt-And-Abort. The fields
of structure \fIr\fR are all inputs; none of the fields are modified by the
ioctl. The key of the victim host is specified by the field
\fBr->victim_key\fR. The field \fBr->resvdesc\fR supplies the preempter's key
and the reservation that it is requesting as part of the SCSI-3
Preempt-And-Abort command. If the device returns the SCSI error code
Reservation Conflict, this ioctl returns \fB-1\fR with \fBerrno\fR set to
\fBEACCES\fR.
.RE

.sp
.ne 2
.na
\fB\fBioctl\fR(\fBfd\fR, \fBMHIOCGRP_PREEMPT\fR,
(\fBmhioc_preemptandabort_t\fR \fI*\fR) \fIr\fR\fB);\fR\fR
.ad
.sp .6
.RS 4n
Similar to \fBMHIOCGRP_PREEMPTANDABORT\fR, but instead issues the SCSI-3
command Persistent Reserve Out Preempt. (Note: This command is not
implemented.)
.RE

.sp
.ne 2
.na
\fB\fBioctl\fR(\fBfd\fR, \fBMHIOCGRP_CLEAR\fR, (\fBmhioc_resv_key_t\fR \fI*\fR)
\fIr\fR\fB);\fR\fR
.ad
.sp .6
.RS 4n
Issues the SCSI-3 command Persistent Reserve Out Clear. The input parameter
\fIr\fR is the reservation key of the caller, which should have been already
registered with the device, by an earlier call to \fBMHIOCGRP_REGISTER\fR.
.RE

.sp
.LP
For each device, the non-shared ioctls should not be mixed with the Persistent
Reserve Out shared ioctls, and vice versa; otherwise, the underlying device is
likely to return errors, because SCSI does not permit SCSI-2 reservations to be
mixed with SCSI-3 reservations on a single device. It is, however, legitimate
to call the Persistent Reserve In ioctls, because these are query only.
Issuing the \fBMHIOCGRP_INKEYS\fR ioctl is the recommended way for a caller to
determine if the device supports SCSI-3 Persistent Reservations (the ioctl
will return \fB-1\fR with \fBerrno\fR set to \fBENOTSUP\fR if the device does
not).
.SS "MHIOCENFAILFAST Ioctl"
.sp
.LP
The \fBMHIOCENFAILFAST\fR ioctl is applicable for both non-shared and shared
disks, and may be used with either the non-shared or shared ioctls.
.sp
.ne 2
.na
\fB\fBioctl\fR(\fBfd\fR, \fBMHIOCENFAILFAST\fR, (unsigned int \fI*\fR)
\fImillisecs\fR\fB);\fR\fR
.ad
.sp .6
.RS 4n
Enables or disables the failfast option in the multihost disk driver and
enables or disables automatic probing of a multihost disk, described below.
The argument is an unsigned integer specifying the number of milliseconds to
wait between executions of the automatic probe function. An argument of zero
disables the failfast option and disables automatic probing. If the
\fBMHIOCENFAILFAST\fR ioctl is never called, the effect is defined to be that
both the failfast option and automatic probing are disabled.
.RE

.SS "Automatic Probing"
.sp
.LP
The \fBMHIOCENFAILFAST\fR ioctl sets up a timeout in the driver to periodically
schedule automatic probes of the disk. The automatic probe function works in
this manner: The driver is scheduled to probe the multihost disk every n
milliseconds, rounded up to the next integral multiple of the system clock's
resolution.
If
.RS +4
.TP
1.
the local host no longer has access rights to the multihost disk, and
.RE
.RS +4
.TP
2.
access rights were expected to be held by the local host,
.RE
.sp
.LP
the driver immediately panics the machine to comply with the failfast model.
.sp
.LP
If the driver makes this discovery outside the timeout function, especially
during a read or write operation, it is imperative that it panic the system
then as well.
.SH RETURN VALUES
.sp
.LP
Each request returns \fB-1\fR on failure and sets \fBerrno\fR to indicate the
error.
.sp
.ne 2
.na
\fB\fBEPERM\fR \fR
.ad
.RS 14n
Caller is not root.
.RE

.sp
.ne 2
.na
\fB\fBEACCES\fR \fR
.ad
.RS 14n
Access rights were denied.
.RE

.sp
.ne 2
.na
\fB\fBEIO\fR\fR
.ad
.RS 14n
The multihost disk or controller was unable to successfully complete the
requested operation.
.RE

.sp
.ne 2
.na
\fB\fBENOTSUP\fR \fR
.ad
.RS 14n
The multihost disk does not support the operation. For example, it does not
support the SCSI-2 Reserve/Release command set, or the SCSI-3 Persistent
Reservation command set.
.RE

.SH ATTRIBUTES
.sp
.LP
See \fBattributes\fR(5) for a description of the following attributes:
.sp

.sp
.TS
box;
c | c
l | l .
ATTRIBUTE TYPE	ATTRIBUTE VALUE
_
Stability	Evolving
.TE

.SH SEE ALSO
.sp
.LP
\fBioctl\fR(2), \fBopen\fR(2), \fBattributes\fR(5)
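.SH EXAMPLES
.sp
.LP
The arithmetic described for \fBMHIOCGRP_INKEYS\fR and for automatic probing
can be sketched in portable C. This is a minimal illustration only, not driver
code: the helper names \fBmhd_keys_written\fR, \fBmhd_keys_truncated\fR, and
\fBmhd_probe_interval\fR, and the \fBtick_ms\fR parameter, are hypothetical
and are not part of \fB<sys/mhd.h>\fR. The sketch also assumes that a probe
interval that is already a multiple of the clock resolution is left unchanged.
.sp
.in +2
.nf
```c
#include <assert.h>

/*
 * Hypothetical helpers for illustration; none of these names exist in
 * <sys/mhd.h>. They only restate arithmetic described in this page.
 */

/*
 * Number of entries actually written into the caller's key array by
 * MHIOCGRP_INKEYS: the minimum of listlen and listsize.
 */
unsigned int
mhd_keys_written(unsigned int listsize, unsigned int listlen)
{
	return (listlen < listsize ? listlen : listsize);
}

/*
 * Nonzero when the device reported more keys than the array could hold,
 * in which case the caller should retry with listsize >= listlen.
 */
int
mhd_keys_truncated(unsigned int listsize, unsigned int listlen)
{
	return (listlen > listsize);
}

/*
 * Round a requested probe interval up to an integral multiple of the
 * clock resolution tick_ms (assumed to be given in milliseconds).
 * A request of zero stays zero, which disables probing.
 */
unsigned int
mhd_probe_interval(unsigned int millisecs, unsigned int tick_ms)
{
	unsigned int ticks;

	assert(tick_ms > 0);
	ticks = millisecs / tick_ms;
	if (millisecs % tick_ms != 0)
		ticks++;
	return (ticks * tick_ms);
}
```
.fi
.in -2
.sp
.LP
For instance, with a \fBlistsize\fR of 4 and a returned \fBlistlen\fR of 10,
only the first 4 entries of the array are valid, and the caller would retry
with an array of at least 10 entries.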