Background: OpenAFS is vulnerable to crashing in the linux kernel symlink
code when running on kernel versions between 2.6.10 to 2.6.12. This also
includes all RHEL4 kernels, because RHEL4 includes the code from 2.6.10. The
problem is that the symlink text caching API, page_follow_link() et al, is
unsuitable for network filesystems where the page cache may be invalidated
in parallel with a path lookup.
This crash can be triggered easily by doing a bunch of path lookups
involving symlinks (e.g., stat() on various files pointed to through links),
while simultaneously running 'fs flushvol' on the volume containing the
symlinks.
The simplest way to fix this problem is to disable the use of symlink text
caching when the kernel does not provide a usable symlink API.
Based on Chris Wing's analysis which stated in part:
GFP_NOFS tells the allocator not to recurse back into the filesystem if it's
necessary to free up memory. However, vmalloc() does not have such an
option. Therefore, calling osi_Alloc() to request more than a page of
memory may end up recursing back into AFS to try to free unused inodes or
dentries.
In this case, what happened was that osi_Alloc() is called within an
AFS_GLOCK(); osi_Alloc() calls vmalloc() which tries to free dentry objects,
which then calls back into the AFS module. Unfortunately, AFS_GLOCK() is
already held and we deadlock.
The afskfw library contains an unprotected call to krb5_free_context
which can result in krb5_free_context being called with a NULL pointer.
MIT's Kerberos libraries do not check that the pointer is non-NULL and
will attempt to use it as a valid pointer which will in turn result
in an invalid memory access error.
This library is used by afslogon.dll which is loaded by winlogon.exe.
If the krb5 profile is invalid, the krb5_init_context call will fail
to allocate a krb5_context structure which can then result in
krb5_free_context being called with a NULL pointer.
An unhandled exception within winlogon.exe will cause a blue screen event
on Windows 2000, XP and 2003.
add a new Windows only pioctl VIOC_PATH_AVAILABILITY that is used
to query the server status for a specified path. Return values
include:
online
offline
all busy
all down
not afs
Fix eventlog reporting. Do not attempt to log an event if the event
source registration fails. Use DebugEvent0 instead of DebugEvent
when there are no parameters.
Modify the LOOKUPKEYCHAIN macro to recognize ERROR_MORE_DATA errors.
Fix the reading of Domain specific configuration for LogonScript and
TheseCells. Previously the dwSize value was being overwritten so that
subsequent RegQueryValueEx call would fail.
Fix a memory leak in the TheseCells reading code.
Add support for Domain specific "Realm" specification. The realm is
the realm to be appended to the username. When logging in as a domain
or to the local machine, the specified "Domain" name is not going to be
a valid realm name.
Construct a proper principal name based upon the domain specified realm
for use in obtaining tokens with KFW.
If the domain specified "TheseCells" list includes the default cell,
do not obtain tokens twice.
There are two serious problems with integrated logon:
(1) openafs afslogon.dll obtains Kerberos v5 tickets and then forwards them
into the logon session. This was done because MIT KFW did not have
such functionality. As of KFW 3.1, KFW does, so we are removing it.
the functionality worked by copying the credentials to a FILE ccache
and then using the Logon Event Handler to move the credentials into
an API ccache and delete the temporary file. For non-interactive
logons the Logon Event handlers do not get triggered. Neither do
LogonScripts get executed. As a side effect, for each logon a
credential cache file was left behind.
(2) when combined with non-interactive logons, there are some very bad
side effects if a network provider performs Kerberos v5 operations.
Each logon occurs in a new logon session and will spawn a private
copy of krbcc32s.exe.
As a result, integrated logon is being disabled for non-interactive
logons.
Improve cache manager performance behind NATs:
* drop cm_daemonCheckUpInterval from 10 minutes to 4 minutes to bring
it under the minimum recommended default port mapping idle timeout
value for NATs
* when a timeout on an rx connection occurs, retry the request once
after forcing a new rx connection. If there was a NAT and the port
mapping changed, the server would respond to the original addr:port
associated with the rx connection. Forcing a new connection will
allow the request to be responded to if the server is accessible.
This should eliminate the UP-DOWN-UP-DOWN bouncing that user's have
seen when working from behind a NAT.
move the AFS Server Manager and AFS Account Manager data cache from
the TransarcCorporation key to the OpenAFS key. The data formats are
not compatible between the two versions and we don't want to be forced
to erase data if users switch back and forth between the two products
during OpenAFS evaluation.
Move the detection of which LAN adapter to use from smb_Init to
smb_NetbiosInit so that it is executed after the service is resumed
via a power management event. Otherwise, when the network comes back
up the service attempts to bind to all LAN adapters instead of just
the loopback or the configured one.
find lana by name is used by the afs control panel to populate the
lana list box. don't use the function to find by name. just use
it to generate the list of all lana names.