bozo: avoid canceling the sigkill timer for hung processes

SIGKILL is sent to a server process of the fs/dafs bnode when that
process fails to exit within the shutdown timeout (currently
30 minutes for the fileserver, 1 minute for the other server
processes).

If the bnode goal is set to run before this timeout expires, the
timer is incorrectly stopped, and a wedged process is never killed.
Fix this by not canceling the timer when a fs/dafs process has been
signaled to shut down, regardless of the current goal.

Change-Id: I2eca8bcb4bac690f3ef671ca4cf375164ff34d5e
Reviewed-on: http://gerrit.openafs.org/7920
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Michael Meffie 2012-08-01 11:42:34 -04:00 committed by Derrick Brashear
parent 1d8f374266
commit 09f5a1e605

@@ -771,7 +771,12 @@ SetNeedsClock(struct fsbnode *ab)
 {
     afs_int32 timeout = POLLTIME;
 
-    if (ab->b.goal == 1 && ab->fileRunning && ab->volRunning
+    if ((ab->fileSDW && !ab->fileKillSent) || (ab->volSDW && !ab->volKillSent)
+        || (ab->scanSDW && !ab->scanKillSent) || (ab->salSDW && !ab->salKillSent)
+        || (ab->salsrvSDW && !ab->salsrvKillSent)) {
+        /* SIGQUIT sent, will send SIGKILL if process does not exit */
+        ab->needsClock = 1;
+    } else if (ab->b.goal == 1 && ab->fileRunning && ab->volRunning
         && (!ab->scancmd || ab->scanRunning)
         && (!ab->salsrvcmd || ab->salsrvRunning)) {
         if (ab->b.errorStopCount) {
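
The effect of the reordered condition can be illustrated with a minimal sketch, assuming a single server process and a simplified old code path in which "goal is run and process is running" turns the timer off, as the commit message describes. The names `struct mini_bnode`, `needs_clock_old`, `needs_clock_new`, and `self_check` are hypothetical and not part of bozo. The real SetNeedsClock tracks several processes and also computes a timeout.

```c
#include <assert.h>

/* Hypothetical, trimmed-down stand-in for struct fsbnode (one process). */
struct mini_bnode {
    int goal;      /* 1 = bnode should run, 0 = bnode should be stopped */
    int running;   /* the process currently exists */
    int sdw;       /* shutdown signal sent, waiting for the process to exit */
    int killSent;  /* SIGKILL has already been sent */
};

/* Simplified old ordering (assumption): the goal is checked first, so a
 * restart request hides a shutdown that is still pending. */
int needs_clock_old(const struct mini_bnode *b)
{
    if (b->goal == 1 && b->running)
        return 0;               /* treated as healthy: timer stopped */
    return 1;
}

/* Simplified new ordering: a pending shutdown keeps the timer armed,
 * regardless of the current goal. */
int needs_clock_new(const struct mini_bnode *b)
{
    if (b->sdw && !b->killSent)
        return 1;               /* SIGKILL may still be needed */
    if (b->goal == 1 && b->running)
        return 0;
    return 1;
}

/* Compare both orderings on a wedged and on a healthy process. */
void self_check(void)
{
    struct mini_bnode wedged  = { 1, 1, 1, 0 };  /* goal reset to run mid-shutdown */
    struct mini_bnode healthy = { 1, 1, 0, 0 };

    assert(needs_clock_old(&wedged) == 0);   /* old: timer lost, never killed */
    assert(needs_clock_new(&wedged) == 1);   /* new: timer kept */
    assert(needs_clock_old(&healthy) == 0);
    assert(needs_clock_new(&healthy) == 0);  /* normal case is unchanged */
}
```

Calling `self_check()` from a test program exercises the one case where the two orderings differ: a process that was signaled to shut down, has not yet been sent SIGKILL, and whose bnode goal was set back to run.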