From 9bc06a059121207b354fdf97f65029d8c2b3df30 Mon Sep 17 00:00:00 2001 From: Andrew Deason Date: Tue, 7 Feb 2023 22:48:23 -0600 Subject: [PATCH] vol: Re-evaluate conditons for cond vars Most users of cond vars follow this general pattern when waiting for a condition: while (!condition) { CV_WAIT(cv, mutex); } But a few places in src/vol do this: if (!condition) { CV_WAIT(cv, mutex); } It is important to always re-check for the relevant condition after waiting for a CV, even if it seems like we only need to wait exactly once, because pthread_cond_wait() is allowed to wake up its caller spuriously even the CV hasn't been signalled. On Solaris, this can actually happen if the calling thread is interrupted by a signal. In VInitPreAttachVolumes() for DAFS, currently this can cause a segfault if CV_WAIT returns while 'vq' is empty. We will try to queue_Remove() the head of the queue itself, resulting in vq.head.next being set to NULL, which will segfault when we try to pull the next item off of the queue. We generally cannot be interrupted by a signal when using opr's softsig, because signals are only delivered to the softsig thread and blocked in all other threads. It is technically possible to trigger this situation on Solaris by sending the (unblockable) SIGCANCEL signal, though this would be very unusual. To make sure issues like this cannot happen and to avoid weird corner cases, adjust all of our CV waiters to wait for a CV using a while() loop or similar pattern. Spurious wakeups may be impossible with LWP, but just try to make all code use a similar structure to be safe. Thanks for mvitale@sinenomine.net for finding and investigating the relevant issue. Change-Id: Ib188708b909661c28fa1a709306a02b01d3cd6d3 Reviewed-on: https://gerrit.openafs.org/15327 Reviewed-by: Cheyenne Wills Reviewed-by: Mark Vitale Reviewed-by: Marcio Brito Barbosa Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk Tested-by: BuildBot --- src/vol/volume.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/src/vol/volume.c b/src/vol/volume.c index d9a86f2ed2..6c167caed2 100644 --- a/src/vol/volume.c +++ b/src/vol/volume.c @@ -1025,7 +1025,7 @@ VInitPreAttachVolumes(int nthreads, struct volume_init_queue *vq) while (nthreads) { /* dequeue next volume */ opr_mutex_enter(&vq->mutex); - if (queue_IsEmpty(vq)) { + while (queue_IsEmpty(vq)) { opr_cv_wait(&vq->cv, &vq->mutex); } vb = queue_First(vq, volume_init_batch); @@ -1195,7 +1195,9 @@ VShutdown_r(void) if (VInit < 2) { Log("VShutdown: aborting attach volumes\n"); vinit_attach_abort = 1; - VOL_CV_WAIT(&vol_init_attach_cond); + while (VInit < 2) { + VOL_CV_WAIT(&vol_init_attach_cond); + } } for (params.n_parts=0, diskP = DiskPartitionList; @@ -1320,11 +1322,13 @@ VShutdown_r(void) if (VInit < 2) { Log("VShutdown: aborting attach volumes\n"); vinit_attach_abort = 1; + while (VInit < 2) { #ifdef AFS_PTHREAD_ENV - VOL_CV_WAIT(&vol_init_attach_cond); + VOL_CV_WAIT(&vol_init_attach_cond); #else - LWP_WaitProcess(VInitAttachVolumes); + LWP_WaitProcess(VInitAttachVolumes); #endif /* AFS_PTHREAD_ENV */ + } } Log("VShutdown: shutting down on-line volumes...\n");