LINUX: Block non-fatal signals when sleeping

Currently, when sleeping on LINUX, we either block all signals
(afs_osi_Sleep), or do not block signals at all (afs_osi_SleepSig).
The only caller of afs_osi_SleepSig() is afs_read(), with the
intention that a user can interrupt the process while it's waiting for
data from an unresponsive server.

This can cause problems when afs_read() is called during a paging
request. In this case, if we are interrupted by a signal, we'll return
an error (EINTR) from afs_linux_readpage(), which causes a SIGBUS to
be sent to the process if it's not already dying.

If we're interrupted by a fatal signal, the process is already dying
so this is fine. But if we're interrupted by a non-fatal signal with
an installed signal handler, the SIGBUS almost always causes the
process to die immediately (and maybe dump core), where it otherwise
would have been fine.

This can be very confusing to a user when it happens, since it's not
immediately obvious that AFS was involved at all; the dumped core
often just shows the SIGBUS generated during a mundane memory load or
store. This situation is most easily seen when running golang out of
/afs, but has also been seen with git and other programs. Anything
that makes heavy use of signals while data is being fetched from /afs
is likely to trigger the behavior.

This problem in general may not be specific to Linux, since the
relevant code path is in cross-platform code (afs_read ->
afs_osi_SleepSig). But notably, this does not happen on Solaris[1];
other platforms may have their own quirks that prevent this from
becoming a problem.

To avoid this for Linux, block all signals except SIGKILL when we're
sleeping via afs_osi_SleepSig(). This allows the process to be killed
if needed, but prevents interruption from any non-fatal signal.

Ideally we would put our process in TASK_KILLABLE state, using
functions like wait_event_killable() or wait_event_state() instead of
blocking signals. But these functions have evolved over time, making
this approach complex or even impossible for various Linux versions in
our current design. Future commits may improve this; for now, do the
simpler fix and just block signals.

We could theoretically still allow non-fatal signals to interrupt
sleeps when called from a syscall like read(). But this is difficult
with the current structure of our Linux integration (syscall i/o is
implemented on top of the paging system, like most Linux filesystems),
and other filesystems tend to not do this.

Also, interrupting a stalled afs_read() in general doesn't currently
work very well anyway. The only callers of afs_osi_SleepSig() look
like this, to wait for a background fetch (BOP_FETCH) to complete:

    while (!code && tdc->mflags & DFFetchReq) {
	/* other locks, etc */
	ReleaseReadLock(&tdc->lock);
	code = afs_osi_SleepSig(&tdc->validPos);
	ObtainReadLock(&tdc->lock);
    }

A signal will cause afs_osi_SleepSig() to return, but then we must
wait to get tdc->lock. The background BOP_FETCH operation will keep
tdc->lock write-locked for the entire fetch from the fileserver
(afs_CacheFetchProc()), so we won't be able to continue until the
fetch completes.

Future commits may improve this, but for now just avoid the
unnecessary SIGBUS errors.

[1] On Solaris, the equivalent code path (afs_GetOnePage()) does not
go through afs_read() with noLock==0, but instead calls
afs_GetDCache() to handle fetching data. It does call afs_ustrategy()
to fill a page, which does technically call afs_read(), but with
noLock==1, and so avoids the case where we submit a BOP_FETCH and
wait for it. Fetching data from the network happens in the
afs_GetDCache() call, and so does not use afs_osi_SleepSig().

Change-Id: Ic9c0c35bd1a2a8fd1901d91dae10cb7719194d25
Reviewed-on: https://gerrit.openafs.org/15637
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
This commit is contained in:
Andrew Deason 2023-11-06 13:49:59 -06:00 committed by Benjamin Kaduk
parent 4dfae3fe35
commit addee42959

View File

@ -132,7 +132,7 @@ afs_addevent(char *event)
#define relevent(evp) ((evp)->refcount--) #define relevent(evp) ((evp)->refcount--)
static int static int
afs_linux_sleep(void *event, int aintok) afs_linux_sleep(void *event, int killable)
{ {
struct afs_event *evp; struct afs_event *evp;
int seq; int seq;
@ -151,22 +151,22 @@ afs_linux_sleep(void *event, int aintok)
AFS_GUNLOCK(); AFS_GUNLOCK();
if (!aintok) { SIG_LOCK(current);
SIG_LOCK(current); saved_set = current->blocked;
saved_set = current->blocked; if (killable) {
siginitsetinv(&current->blocked, sigmask(SIGKILL));
} else {
sigfillset(&current->blocked); sigfillset(&current->blocked);
RECALC_SIGPENDING(current);
SIG_UNLOCK(current);
} }
RECALC_SIGPENDING(current);
SIG_UNLOCK(current);
code = wait_event_freezable(evp->cond, seq != evp->seq); code = wait_event_freezable(evp->cond, seq != evp->seq);
if (!aintok) { SIG_LOCK(current);
SIG_LOCK(current); current->blocked = saved_set;
current->blocked = saved_set; RECALC_SIGPENDING(current);
RECALC_SIGPENDING(current); SIG_UNLOCK(current);
SIG_UNLOCK(current);
}
AFS_GLOCK(); AFS_GLOCK();