afs: afs_Analyze, don't retry if fatal sig pending

The afs_Analyze() function can lead to excessive retries in cases where
a recoverable error cannot be resolved, or when a process is signaled to
terminate (e.g., with SIGKILL).  This can cause unnecessary floods of
kernel messages or RPC requests.

afs_Analyze() analyzes RPC results, indicating if a retry is appropriate
for "recoverable" errors. Some recoverable errors may persist for an
extended period of time (e.g., a busy volume that may require time to
recover or an unavailable service). A user may want to cancel the
request in these cases. Normally there is a sleep between retries of an
operation (using VSleep() or similar functions).  On Linux systems,
when a SIGKILL is pending, the sleep returns immediately, so the
operation is retried without any delay.

For most recoverable errors, there is a limit on the number of times the
request is retried; for instance, VBUSY errors are retried 100 times
before giving up.  But for network errors when hardmount is enabled,
there is no limit on the number of retries, so the request can be
retried back-to-back indefinitely, possibly making the machine slow or
even unusable until the error goes away or hardmount is disabled.

In afs_Analyze() add a call to afs_kill_pending() to check if the
process is being terminated.  If the process is pending termination,
return a status indicating that the RPC request should not be retried.
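
The effect of the check can be illustrated with a small user-space
simulation.  This is only a sketch under stated assumptions: every
sim_* name below is hypothetical and is not an OpenAFS function, the
"RPC" always fails with a recoverable error, and max_attempts exists
only so that the simulation terminates.

```c
#include <assert.h>

/* Hypothetical stand-ins for illustration; not the OpenAFS code. */
static int sim_kill_flag;   /* nonzero once a fatal signal is "pending" */
static int sim_sleeps;      /* number of sleeps that really delayed */

static int sim_kill_pending(void)
{
    return sim_kill_flag;
}

/* Models a sleep that returns immediately when a fatal signal is pending. */
static void sim_sleep(void)
{
    if (!sim_kill_flag)
        sim_sleeps++;
}

/* Returns 1 if the caller should retry the request, 0 otherwise. */
static int sim_analyze(int rpc_failed, int check_kill)
{
    int shouldRetry = 0;

    if (check_kill && sim_kill_pending()) {
        /* The process is terminating, so do not retry. */
        shouldRetry = 0;
        goto out;
    }
    if (rpc_failed) {
        sim_sleep();        /* delay before the retry, if not interrupted */
        shouldRetry = 1;
    }
 out:
    return shouldRetry;
}

/*
 * Issues a request that always fails and returns the number of attempts.
 * The fatal signal becomes pending during attempt number kill_after.
 */
static int sim_run(int check_kill, int kill_after, int max_attempts)
{
    int attempts = 0;

    assert(max_attempts > 0);
    sim_kill_flag = 0;
    sim_sleeps = 0;
    do {
        attempts++;         /* the simulated RPC */
        if (attempts == kill_after)
            sim_kill_flag = 1;
    } while (attempts < max_attempts && sim_analyze(1, check_kill));
    return attempts;
}
```

Under these assumptions, sim_run(1, 3, 1000) stops after 3 attempts,
while sim_run(0, 3, 1000) keeps retrying until the artificial cap of
1000 is reached, with only the first 2 retries actually delayed.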

Change-Id: I972931790bf680a181f1ebc45dfe7d355f7641cd
Reviewed-on: https://gerrit.openafs.org/15747
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Cheyenne Wills 2024-05-14 19:44:02 -06:00 committed by Andrew Deason
parent 8a983426fb
commit e7b2a5063b

@@ -458,6 +458,12 @@ afs_Analyze(struct afs_conn *aconn, struct rx_connection *rxconn,
    afs_FinalizeReq(areq);
    if (afs_kill_pending()) {
        /* If the current process is terminating, don't attempt to retry */
        shouldRetry = 0;
        goto out;
    }
    if (AFS_IS_DISCONNECTED && !AFS_IN_SYNC) {
        /* On reconnection, act as connected. XXX: for now.... */
        /* SXW - This may get very tired after a while. We should try and