"The first is to change the gfp_mask passed to kmalloc(). Using GFP_KERNEL,
it is possible that the VM will call back to the filesystem to free up
memory to satisfy the kmalloc request. GFP_NOFS will prevent this possible
recursion. I believe GFP_NOFS first appeared in the 2.4.6 kernel.
The second change involves the call to schedule() when vmalloc() fails. This
can also cause a hang. The schedule() call could be replaced with:
    set_current_state(TASK_INTERRUPTIBLE);
    schedule_timeout(HZ);"
"This fixes a livelock condition introduced in my earlier
resource starvation patch; apparently I had erred too far
on the side of "wake up just in case". The livelock bug
is exhibited when running 10 fsstress processes at once;
if many processes are waiting for a new Rx call, they get
stuck in an uninterruptible kernel loop waking each other
up."
"This patch fixes a deadlock in the new dcache locking scheme.
The underlying bug apparently existed before, but due to the
absence of locking, it probably resulted in spuriously high
refcounts rather than deadlock.
The problem happens when there are zero-length dcache entries
associated with a file; this is demonstrated by fsx, which
hangs after running on AFS for a while. The writeback loop
never releases dcache entries unless they're stored back to
the server as part of a sequential byte range."
"The particular problem seems to be, when size
is computed to be zero, tsmall is not filled in with valid data,
and ProcessFS is called with a zeroed out OutStatus. This causes
the file to magically turn into a directory (VDIR), among other
things."
"The second part of the patch doesn't fix any bug that I've run into
thus far, but seemed like a good idea while I was reading the code
to find the former bug."
Currently it's possible to give StoreData negative Pos/Length/FileLength
arguments and thereby set the volume quota usage to arbitrary values.
This patch makes these values unsigned, since negative file positions
and lengths don't make sense anyway.
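
A minimal sketch of why signedness matters here. This is not the fileserver's accounting code: the function names, the block-based accounting, and BLOCK_SIZE are all hypothetical, chosen only to show how a negative length can lower recorded usage while the same bit pattern read as unsigned can only raise it.

```c
#include <assert.h>
#include <stdint.h>

#define BLOCK_SIZE 1024 /* assumed quota unit in bytes, for illustration only */

/* Hypothetical accounting with a signed length: a negative file_length
 * produces a negative block count and silently reduces recorded usage. */
static int64_t usage_after_store_signed(int64_t used_blocks, int32_t file_length)
{
    assert(used_blocks >= 0);
    return used_blocks + file_length / BLOCK_SIZE;
}

/* Same accounting with an unsigned length: the value that used to be
 * negative is now a very large positive length, so usage can only grow. */
static int64_t usage_after_store_unsigned(int64_t used_blocks, uint32_t file_length)
{
    assert(used_blocks >= 0);
    return used_blocks + (int64_t)(file_length / BLOCK_SIZE);
}

/* Hypothetical quota check applied to the resulting usage. */
static int within_quota(int64_t used_blocks, int64_t quota_blocks)
{
    return used_blocks <= quota_blocks;
}
```

With the signed variant a caller passing -51200 lowers a usage of 100 blocks to 50; with the unsigned variant the same argument yields a usage far above any plausible quota, so an ordinary quota check would reject the store.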
no reason server etcdir needs to be forced world readable; nothing need
default to those cellconfig files except in the localauth case and then
you need to be able to read the KeyFile anyway
This patch implements more fine-grained locking for dcache entries.
The main advantage is that multiple chunks of the same file can be
fetched at once. This means that an incorrectly-guessed prefetch
won't block other fetches, prefetches of multiple chunks can occur
in parallel, and multiple processes sharing the same file can read
from different parts of the file at once.
This patch fixes a resource starvation condition in Rx. The
problem arises, for instance, when more than 4 daemons try to
prefetch chunks of the same file at once. The fifth daemon is
stuck in MAKECALL_WAITING state, never getting a chance to run,
because the other 4 daemons never yield to the scheduler after
releasing the call, and just grab the call back again.
afs_RemoveCellEntry holds afs_xcell; setserverprefs modified the same
structure but did not hold the lock, which was problematic if something
changed out from under it
an ext3 journal in the vice cache (root of the partition) is allowable
we have no useful way to discern ext2 from ext3 without groveling in fstab
so just allow it
"My theory of what happened is roughly as follows:
Process tries to read data from AFS (as part of a page fault);
issues a new Rx call on an Rx connection to the fileserver.
The server transmits some data back to the client, but some packet
is lost.
Something tries to garbage-collect/destroy the connection; since
there is an active call, it can't do so, but issues an rx_AckAll
anyway, which acknowledges all packets transmitted by the server
as having been received. Server flushes its retransmit queue.
Client waits forever for the lost packet to arrive, but since the
server has already flushed the transmit queue, it cannot possibly
retransmit it.
All this is happening while the client has read-locked its address
space (since the read is part of a page fault). /proc accesses that
try to poke into that process's address space hang waiting for said
lock, causing the lossage we actually observed."
(as originally discovered by ted@mit.edu)
"This fix deals with the following lose case:
Client starts a call that, for some reason, takes a long time on the
server. While the client waits for the server to finish, client and
server usually send each other keep alive packets. If something
causes those packets to be delayed or dropped, then the client will
conclude that the call has failed or finished (usually failed), while
the server is still *busy* doing the call.
In this circumstance, the client will initiate another call and the
server will correctly respond that it is busy. Unfortunately, if the
callNumber of a received packet doesn't match the callNumber of the
outstanding call, then the client never sees that the server says it's
busy. Instead the server appears as a black hole to the client.
This fix ensures that the client sees the busy packets when its
callNumber is reasonably out of sync with the server."
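
The decision described above might be sketched as follows. This is not the Rx code: the enum, the function names, and MAX_SKEW are hypothetical, and reading "reasonably out of sync" as "the client is ahead by at most a small window" is an assumption made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_SKEW 16u /* assumed window for "reasonably out of sync" */

enum pkt_type { PKT_DATA, PKT_BUSY }; /* hypothetical packet kinds */

/* Behaviour as described before the fix: any callNumber mismatch
 * causes the packet to be ignored, including busy packets. */
static int deliver_strict(uint32_t pkt_call, uint32_t cur_call)
{
    return pkt_call == cur_call;
}

/* Sketch of the fixed behaviour: a busy packet is still delivered when
 * the client's callNumber is ahead of the packet's by at most MAX_SKEW. */
static int deliver_relaxed(enum pkt_type type, uint32_t pkt_call, uint32_t cur_call)
{
    assert(type == PKT_DATA || type == PKT_BUSY);
    if (pkt_call == cur_call)
        return 1;
    if (type != PKT_BUSY)
        return 0;
    /* Unsigned subtraction: if the packet is ahead of the client, the
     * result wraps to a huge value and fails the window test. */
    uint32_t skew = cur_call - pkt_call;
    return skew <= MAX_SKEW;
}
```

Under these assumptions a busy packet for call 7 reaches a client that has moved on to call 8, while a data packet with the same mismatch is still dropped.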
this caused a call to pdflush to happen at the wrong time; fixing it should
resolve the zero-filled files problem and the osi_assert(cred) problem, and
make the execsorwriters == 0 warnings go away
if you're not using ufs logging it's ok to replace solaris fsck with vfsck,
except sometimes it exits with 40 and that's not a failure to the solaris
scripts.
make it so for us also
This patch makes sure that in-kernel aliases to nonexistent names aren't
accidentally created due to case mismatch (e.g. "athena" being created as
a symlink to "athena.MIT.EDU", while "athena.mit.edu" is the real cell
that already exists). It also lowercases cell names in AFSDB lookups;
otherwise the same problem appears in userspace (e.g. "aklog athena" tries
to obtain tokens for cell "athena.MIT.EDU").
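
A minimal sketch of the normalization idea, assuming cell names are plain ASCII strings. The function names are hypothetical and are not taken from the patch; the point is only that lowercasing before comparison or lookup makes "athena.MIT.EDU" and "athena.mit.edu" refer to the same cell.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Copy src into dst with ASCII letters lowercased.
 * Returns 0 on success, -1 if dst is too small to hold the result. */
static int lowercase_cell_name(const char *src, char *dst, size_t dst_size)
{
    assert(src != NULL && dst != NULL);
    size_t len = strlen(src);
    if (len >= dst_size)
        return -1;
    for (size_t i = 0; i < len; i++)
        dst[i] = (char)tolower((unsigned char)src[i]);
    dst[len] = '\0';
    return 0;
}

/* Case-insensitive cell name comparison built on the helper above.
 * Returns 1 if the names match, 0 if they differ or do not fit. */
static int same_cell(const char *a, const char *b)
{
    char la[256], lb[256]; /* assumed upper bound on name length */
    if (lowercase_cell_name(a, la, sizeof la) != 0 ||
        lowercase_cell_name(b, lb, sizeof lb) != 0)
        return 0;
    return strcmp(la, lb) == 0;
}
```

Normalizing once, at the point where a name enters the system, avoids having to remember a case-insensitive comparison at every later lookup.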