From d822447c272ed4590f73c4b1c847fc7922950d22 Mon Sep 17 00:00:00 2001 From: Andrew Deason Date: Sun, 10 May 2020 17:02:21 -0500 Subject: [PATCH] FBSD: Avoid extra vcache puts in afs_root Our two 'goto tryagain;' cases in afs_root() have a couple of refcounting problems: - If vget() returns an error, but any of the cases in the first if() statement are true (such as, afs_globalVp does not have CStatd set), we'll vput() and afs_PutVCache(), but vget() didn't grab a reference, since it returned an error. So we'll put references we don't actually have. - If we enter the first if() block when vget() returns a success, we vput() the reference we got from vget(), but we also put an additional reference by calling afs_PutVCache(). If afs_globalVp still points to this same vcache, this can cause afs_globalVp to point to a vcache without a ref held for it. Because of this, if afs_globalVp loses CStatd while we're waiting for the vnode lock, this can cause the afs_globalVp vcache to not have any refs held for it, which causes all sorts of other possible problems, where the usecount for afs_globalVp can drop to 0 when we don't expect it to. To fix these issues, remove the extra afs_PutVCache(), and check for an error from vget() before doing the other afs_globalVp-related checks. The relevant code path involved here can be stressed by frequently causing lookups via /afs/..., while at the same time causing frequent callback breaks on the root vnode. This can be achieved by using an RW volume as the root volume (with non-dynroot), and having another client constantly modify the root directory in that volume while a FreeBSD client constantly does /afs/... lookups in separate pids. Thanks to tcreech@tcreech.com for reporting and helping investigate the relevant issue. Change-Id: I812d063b3d60ac6eb841863cc7fba3e152393910 Reviewed-on: https://gerrit.openafs.org/14206 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Andrew Deason --- src/afs/FBSD/osi_vfsops.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/afs/FBSD/osi_vfsops.c b/src/afs/FBSD/osi_vfsops.c index 9fb3f0f235..0f5628148d 100644 --- a/src/afs/FBSD/osi_vfsops.c +++ b/src/afs/FBSD/osi_vfsops.c @@ -233,15 +233,15 @@ tryagain: AFS_GUNLOCK(); error = vget(vp, LK_EXCLUSIVE | LK_RETRY, td); AFS_GLOCK(); + if (error != 0) { + goto tryagain; + } /* we dropped the glock, so re-check everything it had serialized */ if (!afs_globalVp || !(afs_globalVp->f.states & CStatd) || tvp != afs_globalVp) { vput(vp); - afs_PutVCache(tvp); goto tryagain; } - if (error != 0) - goto tryagain; /* * I'm uncomfortable about this. Shouldn't this happen at a * higher level, and shouldn't we busy the top-level directory