don't try so hard to give up all callbacks. If the server doesn't
respond in 10 seconds, too bad!
clean up the server probe code a bit. reorganize the code so that we
can avoid unnecessary pointer evaluation. add a missing include file.
* Do not give back callbacks to down servers
* Output more cm_scache_t data in afsd_alloc.log
* call VolStatus_Service_Stopped after the service has stopped
This delta adds an interface to an optional volume status handler.
The handler (if provided) receives status updates when volumes
change state between online, offline, busy, and alldown.
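A minimal sketch of how an optional handler of this kind might be wired up. Every name here (vol_state, vol_status_handler, vol_status_register, vol_status_notify, example_handler) is hypothetical and chosen for illustration; this is not the interface added by the delta.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical states, taken from the four named in the text. */
typedef enum { VOL_ONLINE, VOL_OFFLINE, VOL_BUSY, VOL_ALLDOWN } vol_state;

/* Hypothetical handler signature: cell name, volume ID, new state. */
typedef void (*vol_status_handler)(const char *cell, unsigned long volid,
                                   vol_state state);

static vol_status_handler g_handler = NULL;  /* NULL: no handler provided */

/* Install the optional handler, or clear it by passing NULL. */
void vol_status_register(vol_status_handler h) { g_handler = h; }

/* Report a state change. Returns 1 if a handler was called, 0 otherwise. */
int vol_status_notify(const char *cell, unsigned long volid,
                      vol_state old_state, vol_state new_state)
{
    assert(cell != NULL);
    if (g_handler == NULL || old_state == new_state)
        return 0;                 /* nothing installed, or no change */
    g_handler(cell, volid, new_state);
    return 1;
}

/* Example handler that records the most recent update. */
static vol_state g_last_state = VOL_ONLINE;
static int g_calls = 0;

void example_handler(const char *cell, unsigned long volid, vol_state state)
{
    (void)cell;
    (void)volid;
    g_last_state = state;
    g_calls++;
}
```

Because the handler is optional, the notify path checks for NULL on every call rather than requiring registration at startup.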
enable afsdb records for get cellinfo lookup outside of afsd_service.exe
====================
This delta was composed from multiple commits as part of the CVS->Git migration.
The checkin message with each commit was inconsistent.
The following are the additional commit messages.
====================
do not perform afsdb lookup for Freelance.Local.Root cell
The UNIX client does not follow mount points or symlinks when evaluating
ioctl paths during commands such as "fs examine". The Windows client did,
which was annoying when you wanted to know the FID of a mount point that
was not properly being evaluated.
Since the library creates its own background thread, the library must
load its own reference to itself to prevent the library from being
unloaded behind its back.
remove the conditionalized code used to give up callbacks in response
to stat cache recycling due to performance impacts described in the
commit for DELTA windows-give-up-callbacks-20070627
This large patch adds support for giving up callbacks in response to three
events:
1. power management suspend
2. power management shutdown
3. stat cache object recycling
The third item is submitted as a conditional compilation if GIVE_UP_CALLBACKS
is defined. Properly handling callback give-ups and the associated race
conditions with revokes and fetch status requests requires a great deal of
overhead. The first attempt used one GiveUpCallBacks RPC for each callback
that was being dropped as the stat cache object was recycled. This resulted
in a 27% performance drop in the MIT stress test. The code that is being
committed maintains a callback give up list on each server object. The
callback is added to the list as the callbacks are dropped and then they
are sent to the server in bulk by the background daemon thread if the
server is known to be UP after a ping. Logic is added to the
EndCallbackRequest and CallbackRevoke operations to ensure that race
conditions are addressed. With all of this, there is a 17% performance drop
in the MIT stress test.
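The batching scheme described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the committed code: gu_fid, gu_server, gu_add, gu_cancel, gu_flush and the GU_MAX capacity are hypothetical, locking is omitted, and the bulk RPC is represented by a counter.

```c
#include <assert.h>
#include <stddef.h>

#define GU_MAX 64  /* assumed list capacity, chosen for illustration */

typedef struct { unsigned long volume, vnode, unique; } gu_fid;

typedef struct {
    int      is_up;            /* result of the most recent ping */
    gu_fid   pending[GU_MAX];  /* callbacks waiting to be given up */
    size_t   npending;
    unsigned rpcs_sent;        /* stand-in for bulk RPCs actually issued */
} gu_server;

/* Stat cache object recycled: queue the callback instead of sending. */
int gu_add(gu_server *sp, gu_fid fid)
{
    assert(sp != NULL);
    if (sp->npending == GU_MAX)
        return -1;             /* list full; caller decides what to do */
    sp->pending[sp->npending++] = fid;
    return 0;
}

/* Revoke (or new callback) arrived for fid: drop it from the list.
 * Returns 1 if an entry was removed, 0 if it was not queued. */
int gu_cancel(gu_server *sp, gu_fid fid)
{
    size_t i;
    assert(sp != NULL);
    for (i = 0; i < sp->npending; i++) {
        if (sp->pending[i].volume == fid.volume &&
            sp->pending[i].vnode == fid.vnode &&
            sp->pending[i].unique == fid.unique) {
            sp->pending[i] = sp->pending[--sp->npending];
            return 1;
        }
    }
    return 0;
}

/* Background daemon pass: one bulk request per server, only if UP.
 * Returns the number of callbacks given up. */
size_t gu_flush(gu_server *sp)
{
    size_t n;
    assert(sp != NULL);
    if (!sp->is_up || sp->npending == 0)
        return 0;
    n = sp->npending;
    sp->rpcs_sent++;           /* a real client would issue the RPC here */
    sp->npending = 0;
    return n;
}
```

The gu_cancel step stands in for the race handling the text mentions: a callback that is revoked or re-obtained before the daemon runs must not be given up afterwards.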
As a result, it is my conclusion that the client side costs associated with
optimizing the load on the server are simply too high. I am committing this
code to ensure that it is not lost. I will remove this support in the next
patch while leaving the support for giving up all callbacks in response
to suspend and shutdown events.
FIXES 63763
probe for something else for 2.4 and older
i'll spare you
return an error when the cm_fid_t * is NULL since we can't look up
the volume to obtain a server list without knowing which volume we
should be looking up
if the fidp is known to be NULL, don't call cm_GetServerList()
Add name and ID hash tables for cell lookups. cell lookups occur on
every request. sometimes multiple times. removing the walking of the
cell list when there are dozens of cells decreases cpu utilization and
increases throughput.
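A rough sketch of the idea, assuming chained hash tables keyed by cell name and by cell ID. The structure layout, table size, hash functions, and function names below are illustrative assumptions, not the cm_cell_t implementation, and locking is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CELL_HASH_SIZE 64  /* assumed table size */

typedef struct cell {
    const char  *name;
    long         id;
    struct cell *name_next;  /* chain within a name hash bucket */
    struct cell *id_next;    /* chain within an ID hash bucket */
} cell;

static cell *name_table[CELL_HASH_SIZE];
static cell *id_table[CELL_HASH_SIZE];

/* djb2-style string hash, reduced to a bucket index. */
static unsigned hash_name(const char *s)
{
    unsigned h = 5381;
    while (*s)
        h = h * 33u + (unsigned char)*s++;
    return h % CELL_HASH_SIZE;
}

static unsigned hash_id(long id)
{
    return (unsigned)((unsigned long)id % CELL_HASH_SIZE);
}

/* Link a cell into both tables. */
void cell_insert(cell *cp)
{
    unsigned n, i;
    assert(cp != NULL && cp->name != NULL);
    n = hash_name(cp->name);
    i = hash_id(cp->id);
    cp->name_next = name_table[n];
    name_table[n] = cp;
    cp->id_next = id_table[i];
    id_table[i] = cp;
}

/* Lookup by name touches one bucket instead of the whole cell list.
 * This sketch compares names exactly (case-sensitive). */
cell *cell_find_by_name(const char *name)
{
    cell *cp;
    assert(name != NULL);
    for (cp = name_table[hash_name(name)]; cp; cp = cp->name_next)
        if (strcmp(cp->name, name) == 0)
            return cp;
    return NULL;
}

/* Lookup by ID, same idea with the second table. */
cell *cell_find_by_id(long id)
{
    cell *cp;
    for (cp = id_table[hash_id(id)]; cp; cp = cp->id_next)
        if (cp->id == id)
            return cp;
    return NULL;
}
```

With dozens of cells and a lookup on every request, the per-request cost drops from a walk of the full list to a short bucket chain.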
there were two sets of registry values that could be used to configure
the daemon thread check intervals. keep the one that was documented
in the release notes and discard the other.
Add a registry value "daemonCheckOfflineVolInterval" to configure the
offline volume check interval.
Ensure that the cm_GetConn... functions initialize the output variables
to NULL on error.
When we are faking the status data we can use the vnode value to determine
if the object should be treated as a directory or file. odd is a directory
and even is a file. This works even when we have never successfully
obtained status data for the object.
If we can match up the host address from which the revoke was received
with one of our cm_server_t objects, then we know which cell the revoke
has been received from. With that information we can ensure that we only
revoke the status of cm_scache_t objects belonging to that cell.
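The cell-scoped revoke can be illustrated with a small sketch. The rv_server and rv_scache types and both functions are hypothetical simplifications of cm_server_t and cm_scache_t, and the fallback of revoking in every cell when the source address is unknown is an assumption, not something the message states.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { unsigned long addr; long cell_id; } rv_server;
typedef struct { long cell_id; unsigned long volume; int has_callback; } rv_scache;

/* Map the revoke's source address to a cell ID; 0 means "unknown". */
long rv_cell_for_addr(const rv_server *servers, size_t n, unsigned long addr)
{
    size_t i;
    assert(servers != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (servers[i].addr == addr)
            return servers[i].cell_id;
    return 0;
}

/* Drop callbacks on `volume`. With a known cell, only entries in that
 * cell are touched; with cell_id 0 every matching volume is revoked
 * (assumed fallback). Returns the number of entries revoked. */
size_t rv_revoke_volume(rv_scache *cache, size_t n,
                        unsigned long volume, long cell_id)
{
    size_t i, count = 0;
    assert(cache != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (!cache[i].has_callback || cache[i].volume != volume)
            continue;
        if (cell_id != 0 && cache[i].cell_id != cell_id)
            continue;          /* same volume ID, different cell */
        cache[i].has_callback = 0;
        count++;
    }
    return count;
}
```

The point of the cell check is that volume IDs are only unique within a cell, so a revoke from one cell should leave identically numbered volumes in other cells alone.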
Reverse the order of the allCellsp list. Append new cells onto the end
of the list. This ensures that the workstation cell will always be the
first in the list. Adding additional cells will not degrade the performance
to the workstation cell.
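A small sketch of why tail insertion helps, using a hypothetical singly linked list (cell_node, cell_list) in place of the real allCellsp list. It assumes the workstation cell is the first one added.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct cell_node {
    const char       *name;
    struct cell_node *next;
} cell_node;

typedef struct { cell_node *head, *tail; } cell_list;

/* Append at the tail so earlier cells keep their position. */
void cell_list_append(cell_list *lp, cell_node *np)
{
    assert(lp != NULL && np != NULL);
    np->next = NULL;
    if (lp->tail)
        lp->tail->next = np;
    else
        lp->head = np;         /* first cell added becomes the head */
    lp->tail = np;
}

/* Linear search. Returns how many nodes were visited to find `name`,
 * or -1 if it is not in the list. */
int cell_list_steps(const cell_list *lp, const char *name)
{
    const cell_node *np;
    int steps = 0;
    assert(lp != NULL && name != NULL);
    for (np = lp->head; np; np = np->next) {
        steps++;
        if (strcmp(np->name, name) == 0)
            return steps;
    }
    return -1;
}
```

With head insertion the first cell added would drift toward the end as more cells appear; with tail insertion it is always found in one step.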
No longer permit cm_GetCell() or cm_FindCellByID() to return NULL simply
because cm_UpdateCell() failed. The cm_cell_t object still exists and
is valid even if the vlServersp list is empty.
Modify the lock management in cm_GetCell_Gen() to ensure we drop all the
locks.
In cm_Analyze() update the volume status when one of the servers reports
VBUSY or VRESTARTING.
fix deadlock on cm_volumeLock introduced by last week's work
in cm_Analyze, make sure we get a cm_cell_t reference otherwise we
won't find the cm_volume_t we are searching for when ALLOFFLINE or
ALLBUSY.
VMWare adapters have proven unreliable replacements for the Microsoft
loopback adapter. Registering AFS often results in a name space collision.
Add cm_DumpCells() function and dump the cells as part of "fs memdump"
Dump all cm_scache_t and cm_volume_t regardless of reference counts
Fix cm_GetCell_Gen() to not allocate a new cm_cell_t when evaluating
mount points to aliases. Instead, after looking up the alias successfully
search the allCellsp list for the fullname of the cell. If found, use
the existing entry and clean up the one we were about to allocate.
Use read locks whenever possible instead of write locks when searching
the allCellsp list.