mirror of
https://git.openafs.org/openafs.git
synced 2025-01-23 17:27:34 +00:00
e022864c92
Currently, if UV_ReleaseVolume() in 'vos release' fails to release the volume to some sites, we update the vlentry for the volume by calling VLDB_ReplaceEntry with LOCKREL_TIMESTAMP. This clears the lock timestamp in the vlentry, but not the lock flags (VLOP_RELEASE). We then fully unlock the vlentry later during cleanup after the 'rfail' label by calling ubik_VL_ReleaseLock(). The timestamp-only unlock via VLDB_ReplaceEntry is unusual; all other code paths in OpenAFS clear a vlentry lock by clearing all lock-related fields at the same time (LOCKREL_OPCODE | LOCKREL_AFSID | LOCKREL_TIMESTAMP). If only the lock timestamp is cleared and specifically the VLOP_RELEASE flag is set, future calls to VL_SetLock() cause the vlserver to return a different error code: VL_RERELEASE instead of VL_ENTRYLOCKED. If UV_ReleaseVolume() sees VL_RERELEASE, we act as if the vlentry was locked successfully; the intention was probably to allow a subsequent 'vos release' to fix the partially-failed release. This scheme has a few problems: If a volume is partially locked, multiple 'vos release' processes could see the VL_RERELEASE error for the same volume, and all of them would assume they have locked the vlentry. This would bypass the exclusion that vlentry locks normally provide, and allows multiple processes to go through the 'vos release' process for the same volume at the same time, causing various problems. Similarly, a second 'vos release' process could see the VL_RERELEASE error while a first 'vos release' process is still running its cleanup. The first process would then fully unlock the vlentry while the second process is running the release, causing the release to run with a fully unlocked volume. This would allow any 'vos' operation against the volume to run at the same time as the release, potentially causing various problems. Furthermore, it's rare for a vlentry to be left in this partially-locked state. Since UV_ReleaseVolume() fully unlocks the vlentry during cleanup, the 'vos release' process would need to die before running the last ubik_VL_ReleaseLock(), or the last ubik_VL_ReleaseLock() would need to fail. Even if the VL_RERELEASE scheme worked well, it almost always is not used, since the vlentry is usually left completely unlocked after 'vos release' runs, even after a failure. It's possible that the details of this scheme used to work differently, perhaps avoiding the full unlock after a partial release failure. However, the current behavior has existed for quite a long time (since before OpenAFS 1.0), so it seems unlikely that anyone still expects whatever the original behavior was. To avoid all of this nonsense, just don't pass LOCKREL_TIMESTAMP during the failure-related VLDB_ReplaceEntry() call. Instead, the vlentry remains locked until we fully unlock it during the normal cleanup processing. Do the same for the equivalent code path in libadmin's UV_ReleaseVolume(). After this commit, all callers that pass any LOCKREL_* flags pass all of them at the same time, so partial unlocks should no longer be possible. For now, don't remove the special handling of the VL_RERELEASE error code after trying to lock a vlentry. Maybe this can be removed in the future, but for now, keep this to retain existing behavior if there are existing partially-locked entries in the VLDB. Thanks for mvitale@sinenomine.net for related work drawing attention to this. Change-Id: I6b053f88c923684bd95df3e2022e9cf125e45d04 Reviewed-on: https://gerrit.openafs.org/15867 Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> |
||
---|---|---|
build-tools | ||
doc | ||
src | ||
tests | ||
.gitignore | ||
.gitreview | ||
.mailmap | ||
.splintrc | ||
acinclude.m4 | ||
CODING | ||
configure-libafs.ac | ||
configure.ac | ||
CONTRIBUTING | ||
INSTALL | ||
libafsdep | ||
LICENSE | ||
Makefile-libafs.in | ||
Makefile.in | ||
NEWS | ||
NTMakefile | ||
README | ||
README-WINDOWS | ||
regen.sh |
AFS is a distributed file system that enables users to share and access all of the files stored in a network of computers as easily as they access the files stored on their local machines. The file system is called distributed for this exact reason: files can reside on many different machines, but are available to users on every machine. OpenAFS 1.0 was originally released by IBM under the terms of the IBM Public License 1.0 (IPL10). For details on IPL10 see the LICENSE file in this directory. The current OpenAFS distribution is licensed under a combination of the IPL10 and many other licenses as granted by the relevant copyright holders. The LICENSE file in this directory contains more details, thought it is not a comprehensive statement. See INSTALL for information about building and installing OpenAFS on various platforms. See CODING for developer information and guidelines. See NEWS for recent changes to OpenAFS.