We've seen a situation where the NFSv4 client tries endlessly to return a
delegation. This happened after a server reboot. The server responds with
NFS4ERR_STALE_STATEID to the DELEGRETURN. This case is not directly handled by
nfs4_do_delegreturn. Instead, it triggers a recovery of the clientid, which
in turn triggers a reclaim of all open files for this server.
To find the open files, rtable4 is enumerated for each mount of the server.
The reclaim is supposed to reopen the file, so that the next DELEGRETURN
will succeed.
In this case, the rnode is not in the rtable4 hash, so it never gets
recovered. This leads to an endless DELEGRETURN loop that repeats once per
second.
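The failure mode can be illustrated with a toy model. This is only a sketch:
toy_rnode, toy_reclaim_opens and toy_delegreturn_attempts are hypothetical
names, not kernel code. It assumes that recovery reopens exactly the rnodes
reachable from the hash, and that DELEGRETURN succeeds only once the file has
been reopened.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an rnode4. */
struct toy_rnode {
	bool reopened;			/* set once recovery has reclaimed the open */
	struct toy_rnode *next;		/* hash chain link */
};

/* Recovery only visits rnodes that are linked into the hash chain. */
static void
toy_reclaim_opens(struct toy_rnode *chain)
{
	for (struct toy_rnode *rp = chain; rp != NULL; rp = rp->next)
		rp->reopened = true;
}

/*
 * Retry DELEGRETURN until it succeeds or max_tries is reached.
 * Returns the number of attempts made.
 */
static int
toy_delegreturn_attempts(struct toy_rnode *chain, struct toy_rnode *rp,
    int max_tries)
{
	int tries = 0;

	while (tries < max_tries) {
		tries++;
		if (rp->reopened)
			return (tries);		/* server accepts the stateid */
		/* Stale stateid: run recovery, then try again. */
		toy_reclaim_opens(chain);
	}
	return (tries);
}
```

In this model an rnode that is on the chain succeeds on its second attempt,
while an rnode that is not on the chain uses up every attempt, because
recovery never reaches it.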
This fix checks for NFS4ERR_STALE_STATEID in combination with the rnode not
being in the hash (R4HASHED not set in r_flags). In this case, it treats the
error as fatal and just discards the delegation.
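The decision can be written as a standalone predicate. This is a minimal
sketch, not the patched function: the toy_ names and enum values are
hypothetical stand-ins for the kernel's NFS4ERR_* codes and R4HASHED flag, and
the nfs4_recov_marks_dead() term of the real condition is left out for brevity.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the NFSv4 status codes used in the check. */
enum toy_stat {
	TOY_OK,
	TOY_ERR_STALE,
	TOY_ERR_BADHANDLE,
	TOY_ERR_STALE_STATEID,
	TOY_ERR_OTHER
};

#define	TOY_R4HASHED	0x1u	/* stand-in for R4HASHED in r_flags */

/*
 * Returns true when the delegreturn error should be ignored, so that no
 * recovery is started and the delegation is simply discarded.
 */
static bool
toy_skip_recovery(int error, enum toy_stat stat, unsigned int r_flags)
{
	if (error != 0)
		return (false);

	return (stat == TOY_ERR_BADHANDLE ||
	    stat == TOY_ERR_STALE ||
	    (stat == TOY_ERR_STALE_STATEID && !(r_flags & TOY_R4HASHED)));
}
```

A stale stateid on a hashed rnode still returns false here, so the normal
recovery path is left to handle it. Only the unhashed case, which recovery
cannot reach, is short-circuited.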

*** 1505,1515 ****
  	 * Ignore some errors on delegreturn; no point in marking
  	 * the file dead on a state destroying operation.
  	 */
  	if (e.error == 0 && (nfs4_recov_marks_dead(e.stat) ||
  	    e.stat == NFS4ERR_BADHANDLE ||
! 	    e.stat == NFS4ERR_STALE))
  		needrecov = FALSE;
  	else
  		needrecov = nfs4_needs_recovery(&e, TRUE, vp->v_vfsp);
  
  	if (needrecov) {
--- 1505,1517 ----
  	 * Ignore some errors on delegreturn; no point in marking
  	 * the file dead on a state destroying operation.
  	 */
  	if (e.error == 0 && (nfs4_recov_marks_dead(e.stat) ||
  	    e.stat == NFS4ERR_BADHANDLE ||
! 	    e.stat == NFS4ERR_STALE ||
! 	    (e.stat == NFS4ERR_STALE_STATEID &&
! 	    !(rp->r_flags & R4HASHED))))
  		needrecov = FALSE;
  	else
  		needrecov = nfs4_needs_recovery(&e, TRUE, vp->v_vfsp);
  
  	if (needrecov) {