We've seen a situation where the NFSv4 client tries endlessly to return a
delegation. This happened after a server reboot: the server responds to the
DELEGRETURN with NFS4ERR_STALE_STATEID. nfs4_do_delegreturn does not handle
this case directly. Instead, it triggers a recovery of the clientid, which
in turn triggers a reclaim of all open files for this server.
To find the open files, rtable4 is enumerated for each mount of the server.
The reclaim is supposed to reopen the file, so that the next DELEGRETURN
will succeed.
In this case, however, the rnode is not in the rtable4 hash, so it never
gets recovered. This leads to an endless DELEGRETURN loop, iterated once per
second.
This fix tests for NFS4ERR_STALE_STATEID in combination with the rnode not
being in the hash (R4HASHED not set in r_flags). In this case, it treats the
error as fatal and just discards the delegation, without starting recovery.
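
The decision described above can be sketched in isolation. This is a
minimal user-space model, not the kernel code: the sketch_* names, the enum
values, and SK_R4HASHED are hypothetical stand-ins for the real NFS4ERR_*
codes and the R4HASHED flag, and the nfs4_recov_marks_dead() check from the
actual condition is left out for brevity.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the NFSv4 status codes used in the diff. */
enum sketch_stat {
	SK_OK,
	SK_BADHANDLE,		/* models NFS4ERR_BADHANDLE */
	SK_STALE,		/* models NFS4ERR_STALE */
	SK_STALE_STATEID,	/* models NFS4ERR_STALE_STATEID */
	SK_OTHER
};

/* Hypothetical flag bit modelling R4HASHED in rp->r_flags. */
#define	SK_R4HASHED	0x1u

struct sketch_rnode {
	unsigned int r_flags;
};

/*
 * Returns true when the DELEGRETURN result should be ignored outright,
 * i.e. when the real code would set needrecov = FALSE without asking
 * nfs4_needs_recovery().
 */
static bool
sketch_skip_recovery(int error, enum sketch_stat stat,
    const struct sketch_rnode *rp)
{
	if (error != 0)
		return (false);

	if (stat == SK_BADHANDLE || stat == SK_STALE)
		return (true);

	/*
	 * The new case: a stale stateid on an rnode that is not hashed.
	 * The reclaim pass would never find this rnode, so recovery
	 * could not make the next DELEGRETURN succeed.
	 */
	return (stat == SK_STALE_STATEID && !(rp->r_flags & SK_R4HASHED));
}
```

With this model, a stale stateid on a hashed rnode still goes through the
normal recovery decision, while the same status on an unhashed rnode is
ignored, which is what breaks the once-per-second loop.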

          --- old/usr/src/uts/common/fs/nfs/nfs4_callback.c
          +++ new/usr/src/uts/common/fs/nfs/nfs4_callback.c
↓ open down ↓ 1499 lines elided ↑ open up ↑
1500 1500                  }
1501 1501  
1502 1502                  nfs4delegreturn_otw(rp, cr, &e);
1503 1503  
1504 1504                  /*
1505 1505                   * Ignore some errors on delegreturn; no point in marking
1506 1506                   * the file dead on a state destroying operation.
1507 1507                   */
1508 1508                  if (e.error == 0 && (nfs4_recov_marks_dead(e.stat) ||
1509 1509                      e.stat == NFS4ERR_BADHANDLE ||
1510      -                    e.stat == NFS4ERR_STALE))
     1510 +                    e.stat == NFS4ERR_STALE ||
     1511 +                    (e.stat == NFS4ERR_STALE_STATEID &&
     1512 +                     !(rp->r_flags & R4HASHED))))
1511 1513                          needrecov = FALSE;
1512 1514                  else
1513 1515                          needrecov = nfs4_needs_recovery(&e, TRUE, vp->v_vfsp);
1514 1516  
1515 1517                  if (needrecov) {
1516 1518                          nfs4delegreturn_save_lost_rqst(e.error, &lost_rqst,
1517 1519                              cr, vp);
1518 1520                          (void) nfs4_start_recovery(&e, mi, vp,
1519 1521                              NULL, &rp->r_deleg_stateid,
1520 1522                              lost_rqst.lr_op == OP_DELEGRETURN ?
↓ open down ↓ 1004 lines elided ↑ open up ↑