A race condition where the pthread_recursive_mutex_t::owner that is maintained
by AFS doesn’t match with the thread that is trying to unlock.
This leads to AFS file server and ptserver crash due to assertion failure
where it was trying to unlock the grmutex.
We saw the race more often when our customer migrated their machines from
Power8 to Power9 systems and increased the SMT value from 2 to 4.
fileserver Assertion failed! file keys.c, line 911.
ptserver Assertion failed! file userok.c, line 78.
File: keys.c
889 int
890 afsconf_GetKeyByTypes(struct afsconf_dir *dir, afsconf_keyType type,
891 int kvno, int subType,struct afsconf_typedKey **key)
892 {
893 int code = 0;
894 struct subTypeList *subTypeEntry;
895
896 LOCK_GLOBAL_MUTEX;
897
…
910 out:
911 UNLOCK_GLOBAL_MUTEX; <<<<
912 return code;
913 }
Consider a following situation,
cpu0 , cpu1 and T0, T1 and T2 are the cpus and timestamps respectively,
T0: thread1 locks grmutex performs some operations and unlocks the same,
thus has itself set as pthread_recursive_mutex_t::owner. Since presently we do
not reset it, thus, pthread_recursive_mutex_t::owner = thread0.
T1: thread0 starts on cpu0.
T2: thread1 starts on cpu1.
T3: thread0 tries to lock AFS grmutex and acquires corresponding pthread_mutex,
now before thread0 updates pthread_recursive_mutex_t::owner, a context switch
happens.
T3: thread1 on cpu1 tries to acquire grmutex and sees itself as the
pthread_recursive_mutex_t::owner, possibly as it was not reset and updated yet.
So thread1 thinks itself as the owner and proceeds.
T4: thread0 updates the pthread_recursive_mutex_t::owner this time it is also
synced across the cpu caches.
T5: thread1 tries to unlock the grmutex and crashes because now it’s not the
owner of the mutex.
Debugging:
We implemented a circular log to store certain values related to grmutex which
helped in debugging us this further.
({ \
time_t t; \
time(&t); \
LOG_EVENT("%s: Unlocking TID %u: %s:%d owner %lu " \
"locked %d pthread_self %u times_inside %d\n", \
ctime(&t), (unsigned)grmutex.mut.__data.__owner,\
__func__ , __LINE__, \
grmutex.owner, grmutex.locked, (unsigned)pthread_self(), \
grmutex.times_inside); \
opr_Verify(pthread_recursive_mutex_unlock(&grmutex)==0); \
})
$614 = "Mon Sep 11 19:35:34 2023\n: Locking TID 136896:
afsconf_GetKeyByTypes:896 owner 140735030161776 locked 1
pthread_self 2305880432 times_inside 1\n\000 2\n",
$615 = "Mon Sep 11 19:35:34 2023\n: Unlocking TID 136896:
afsconf_IsLocalRealmMatch:602 owner 140735030161776 locked 1
pthread_self 1836773744 times_inside 2\n",
$617 = "Mon Sep 11 19:35:34 2023\n: Unlocking TID 136896:
afsconf_GetKeyByTypes:911 owner 140735030161776 locked 1
pthread_self 2305880432 times_inside 1\n\000\061\n",
Solution:
This problem was resolved after resetting thread_recursive_mutex_t::owner in
global mutex unlock function.
Thanks to Todd DeSantis for helping with debugging, review and verification of
this problem.
Change-Id: Ibe01518094388080a143e31c70ab7ce0ddfca702
Signed-off-by: Indira Sawant <indira.sawant@ibm.com>
Reviewed-on: https://gerrit.openafs.org/15604
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Commit c1d39153da added a cc-wrapper
script to help generate CTF data for all objects, which is generated
by configure. Remove it in the distclean target so we don't leave
a stray file hanging around.
Commit 48ce41a447 added a pkgbuild.sh
script, to be used to build packages for newer versions of macOS,
supplementing the buildpkg.sh that was previously used. Clean it
up in distclean as well (buildpkg.sh was already listed).
Also clean up AFS_component_version_number.c in
libuafs/Makefile.common.in.
These issues were detected while resolving
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1048470 .
Change-Id: I0fea00924a998ee316d48cf76103ea9871a34c6b
Reviewed-on: https://gerrit.openafs.org/15605
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Update all three copies in the tree, and the rpm specfile.
Change-Id: I1685af8c6bfb27c189f8fecb6bc152cf19bf2c26
Reviewed-on: https://gerrit.openafs.org/15601
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
This commit introduces the new set of changes/ files required
to successfully build and package the OpenAFS source code on
MacOS 14.X "Sonoma".
Change-Id: I1e62746b0a131854eeaef31d1457885d43d0b67a
Signed-off-by: GANESH CHAUDHARI <gangovind@in.ibm.com>
Reviewed-on: https://gerrit.openafs.org/15591
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Marcio Brito Barbosa <mbarbosa@sinenomine.net>
Tested-by: Marcio Brito Barbosa <mbarbosa@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
The Linux 6.7 commit "fs: new accessor methods for atime and mtime"
(077c212f03) is a follow up to the Linux 6.6 commit "fs: add ctime
accessors infrastructure" (9b6304c1d5)
With the above 6.7 commit, the inode's i_atime and i_mtime are renamed
to __i_atime and __i_mtime and accessing these members should use the
new accessor functions.
This commit is similar to the OpenAFS commit "Linux 6.6: convert to
ctime accessor functions" (072c7934cd)
Add autoconf tests to detect when we need to use the new accessors and
introduce new wrapper functions to get and set an inode's atime and
mtime.
Note, unlike the (072c7934cd) commit, we need to add support for
reading an inode's atime and mtime, so this commit has the getters for
the atime and mtime members.
Change-Id: Ide818c946ed98a1401ef59864c2cdb4eccee9c99
Reviewed-on: https://gerrit.openafs.org/15597
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
The directory package as implemented in AFS-2 allocates space for each
directory entry as a DirEntry struct followed by 0-8 contiguous
DirXEntry structs, as needed. This is implemented by:
- afs_dir_NameBlobs calculates the number of blocks needed
- FindBlobs allocates and returns index of entry
- afs_dir_GetBlob returns pointer to 1st DirEntry struct
After this, we populate DirEntry (and any contiguous DirXEntry blocks)
with open code. Most existing code writes the entry's name via a string
copy operation to DirEntry->name, which is only 16 bytes long.
Therefore, for dir entry names that are 16 bytes or longer, OpenAFS
routinely does string copies that look like buffer overruns. This has
not previously caused problems because the OpenAFS code has arranged for
a sufficiently large amount of contiguous memory to be available.
However, this remains undefined behavior in the C abstract virtual
machine; thus compilers are not required to produce safe operation.
Recent changes in the OpenAFS build chain have made this approach no
longer viable:
1) Linux 6.5 commit df8fc4e934c12b 'kbuild: Enable
-fstrict-flex-arrays=3' modified the hardening of several kernel
string operations when running with CONFIG_FORTIFY_SOURCE=y.
2) gcc 13 commit 79a89108dd352cd9288f5de35481b1280c7588a5
'__builtin_dynamic_object_size: Recognize builtin' provides some
enhancements to _builtin_object_size. The Linux commit above will now
use these when the kernel is built with gcc 13.
When OpenAFS is built under Linux 6.5 or higher and gcc 13 or higher,
the hardened strlcpy will BUG for directory entry names longer than 16
characters.
Since there are multiple places where OpenAFS writes directory names,
there are several symptoms that may manifest. However, the first one is
usually a kernel BUG at cache manager initialization if running with
afsd -dynroot _and_ there are any cell names 15 characters or longer in
the client CellServDB. (A 15-character cellname reaches the 16
character limit when -dyrnoot adds the RW mountpoint ".<cellname>".)
Address this by using flexible arrays (standardized with C99). A
flexible array is a variable-length array that is declared with no size
at all, e.g., name[].
Create an autoconf test to determine whether the compiler supports
flexible arrays.
Create a new struct DirEntryFlex. If the compiler supports
flexible arrays, define name[]; otherwise retain the name[16]
definition.
Whenever we write a directory name, use DirEntryFlex so that any
hardening will be satisfied that there is sufficient space for the name.
However, the actual guarantee that this is true is still provided by the
OpenAFS directory routines mentioned above - all of these remain
unchanged.
The DirEntry struct remains unchanged for continued use in OpenAFS, as
well as for any out-of-tree users of the directory package.
Change-Id: I6da5c6c295f051be90017084e5b3a3ef24d1271f
Reviewed-on: https://gerrit.openafs.org/15573
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Since the original IBM code import, the DirOK test for directory entry
names has been off-by-1; it says that directory names of length MAXENAME
256 are "too-long".
Modify DirOK to properly validate directory entry names during salvage.
While here, remove MAXENAME in favor of AFSNAMEMAX.
Change-Id: I76b57899b225934ae60684d704eeb26223e5b8f3
Reviewed-on: https://gerrit.openafs.org/15574
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
In commit 'Linux 6.6: convert to ctime accessor functions' (072c7934cd)
the functiom afs_inode_set_ctime was defined to use time64_t when it
should have used a time_t as the data type for the sec parameter.
See the commit 'LINUX 5.6: define time_t and use timespec/timespec64'
(78049987aa).
The time64_t data type was introduced in Linux 3.17. A build failure
will occur when building on kernels prior to Linux 3.17.
Change-Id: I9b3ce32f8885cde969583cfc74294cdabc54e698
Reviewed-on: https://gerrit.openafs.org/15595
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
The Linux 6.6 commit: "fs: pass the request_mask to generic_fillattr"
(0d72b92883) added an additional parameter to Linux's
generic_fillattr() function.
For openafs, generic_fillattr() is called from the inode_operations
method "getattr", which is implemented in afs_linux_getattr(). The value
for the request_mask parameter is an existing parameter that is passed
to the inode_operations "getattr" method.
Add an autoconf test for 4 parameters to the generic_fillattr function
and update afs_linux_getattr() to pass the request_mask to
generic_fillattr().
Change-Id: Ie1e6b15bd85931debe0fd446760674ddd0492578
Reviewed-on: https://gerrit.openafs.org/15561
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Tested-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Kailas Zadbuke <kailashsz@in.ibm.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
The Linux 6.6 commit "fs: add ctime accessors infrastructure"
(9b6304c1d5) added accessor functions for an inode's ctime member.
A follow on commit "fs: rename i_ctime field to __i_ctime" (13bc244578)
changed the name of the inode member ctime to __i_ctime to indicate it's
a private member.
Add an autoconf test for the ctime accessor function
'inode_set_ctime()'.
Add an afs_inode_set_ctime to LINUX/osi_machdep.h that is either defined
as a macro to Linux's inode_set_ctime, or implements a static inline
function to set a inode's ctime.
Convert the setting of an inode's ctime to use afs_inode_set_ctime().
For more information behind the Linux change, see the commit message
for:
"Merge tag 'v6.6-vfs.ctime'
of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs" (615e95831)
Change-Id: I920aac8029fc3e2c9b7741341b434802403a4577
Reviewed-on: https://gerrit.openafs.org/15560
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Tested-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
If the caller gives "/vicep" or "vicep" to volutil_GetPartitionID(),
we'll strlcpy() an empty string into ascii[], setting ascii[0] to 0.
Then we check the value of ascii[1], which is uninitialized.
This doesn't result in any bad behavior, because we then immediately
check the value of ascii[0] (which is 0), and return -1. But reading
the uninitialized ascii[1] triggers errors in tools like valgrind, and
is possibly fragile for future code changes.
To avoid this, make sure ascii[] is initialized at the start of the
function, and check if we have copied an empty string into ascii[].
While we are here, also add tests in volutil-t for invalid "vicep*"
strings, to match the existing tests for invalid "/vicep*" strings.
Change-Id: I724f893d4bb6421b955c1c89629ab9f277be98bc
Reviewed-on: https://gerrit.openafs.org/15526
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
A few places in the tree check if a volume name ends in
.readonly/.backup, but don't check the string length beforehand. This
can result in looking at a few bytes before the start of the string,
which may contain garbage data, or may be an invalid memory address.
A few commits have fixed this same issue over the years, such as
4221d7acc8 (Fix segmentation fault in vsu_GetVolumeID), but haven't
caught all of them. Try to fix all of the remaining cases here.
Change-Id: I736b8fa2a45dce7e5255aa055bcf7975f68e939a
Reviewed-on: https://gerrit.openafs.org/15525
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
The Linux 6.6 commit "sysctl: Add size to register_sysctl" (9edbfe92a0)
renamed the Linux function register_sysctl() to register_sysctl_sz() and
added a size parameter. For backward compatibility, a macro,
register_sysctl, is provided. The macro calculates the size of the
ctl_table being registered and passes the size to register_sysctl_sz.
However, in order to perform the size calculation, an array of ctl_table
structures must be passed as the 2nd parameter.
This change only affects the autoconf test used to determine if Linux
provides register_sysctl.
Update the autoconf test for register_sysctl to use an actual ctl_table
structure for the 2nd parameter instead of a NULL.
Change-Id: I8660ad252d7483646011a0c9418e3e95ac690a21
Reviewed-on: https://gerrit.openafs.org/15559
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
The commit 'linux: Replace fop iterate with fop iterate_shared'
(gerrit 15528) uses a Linux provided wrapper for the file_operations
iterate_shared member. The wrapper is provided by Linux to give
filesystem time to migrate to using the iterate_shared directly.
Review of the locking done by the afs_linux_readdir shows that it is
unnecessary to use the migration wrapper.
Note: Removing the use of the wrapper as a separate commit from the
15528 commit in case there is a need to use the wrapper. In addition,
we would like to use the wrapper with the current openafs stable release
and use the development branch for additional testing.
Change-Id: I662f95f69c2d06948f764b9520567efbca8f692c
Reviewed-on: https://gerrit.openafs.org/15550
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
The Linux 6.5 commit:
'vfs: get rid of old '->iterate' directory operation' (3e32715496)
removed the filesystem_operations iterate method. The replacement
method, iterate_shared, was introduced with the Linux 4.6 commit:
'introduce a parallel variant of ->iterate()' (6192269444)
The above commits indicate that the iterate_shared is an "almost"
drop-in replacement for iterate. The vfs documentation for
iterate_shared has caveats on the implementation (serializing in-core
per-inode or per-dentry modifications and using d_alloc_parallel if
doing dcache pre-seeding). A wrapper is provided to assist filesystems
with the migration from iterate to iterate_shared. Until it can be
verified that afs_linux_readdir meets the above requirements, we will
use the wrapper (ref 3e32715496 commit)
Add configure tests for the iterate_shared file_operations member and
for the wrap_directory_iterator function.
Update osi_vnodeops.c to use iterate_shared and the wrapper if they are
both available.
Change-Id: I60d7605a79947198a429def5bce77cd09a16c304
Reviewed-on: https://gerrit.openafs.org/15528
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Most users of cond vars follow this general pattern when waiting for a
condition:
while (!condition) {
CV_WAIT(cv, mutex);
}
But a few places in src/vol do this:
if (!condition) {
CV_WAIT(cv, mutex);
}
It is important to always re-check for the relevant condition after
waiting for a CV, even if it seems like we only need to wait exactly
once, because pthread_cond_wait() is allowed to wake up its caller
spuriously even the CV hasn't been signalled. On Solaris, this can
actually happen if the calling thread is interrupted by a signal.
In VInitPreAttachVolumes() for DAFS, currently this can cause a
segfault if CV_WAIT returns while 'vq' is empty. We will try to
queue_Remove() the head of the queue itself, resulting in vq.head.next
being set to NULL, which will segfault when we try to pull the next
item off of the queue.
We generally cannot be interrupted by a signal when using opr's
softsig, because signals are only delivered to the softsig thread and
blocked in all other threads. It is technically possible to trigger
this situation on Solaris by sending the (unblockable) SIGCANCEL
signal, though this would be very unusual.
To make sure issues like this cannot happen and to avoid weird corner
cases, adjust all of our CV waiters to wait for a CV using a while()
loop or similar pattern. Spurious wakeups may be impossible with LWP,
but just try to make all code use a similar structure to be safe.
Thanks for mvitale@sinenomine.net for finding and investigating the
relevant issue.
Change-Id: Ib188708b909661c28fa1a709306a02b01d3cd6d3
Reviewed-on: https://gerrit.openafs.org/15327
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Marcio Brito Barbosa <mbarbosa@sinenomine.net>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
In rxi_ReceiveServerCall(), we compare the callNumber in the given
packet to the callNumber on the conn's channel. If the packet's
callNumber is smaller, it indicates the packet is for an earlier call
that has since ended, and we ignore it.
However, we perform this check after checking whether we need to
allocate a new call (conn->call[channel] is NULL), or use an existing
call. If we allocate a new call, we don't check the conn's callNumber
at all, and unconditionally set it to the callNumber in the packet.
This means that if a server call ends and is successfully
rxi_FreeCall()'d, the server will accept any callNumber on the same
channel. If the server sees an old DATA packet from an earlier call
after this happens, it means the server can effectively re-run an RPC
that has already completed successfully, or that the client has
already seen fail.
A server can see a DATA packet for an old call under a variety of
situations, which is, after all, why we check it (dup'd/delayed
packets, the client could still be trying to run an old call, etc).
Seeing one for a freed call is less likely since that usually requires
more time to have passed, but is still possible.
Checking the callNumber was effectively moved in commit 99b43273c0
(rx: prevent connection channel assignment race) in the 1.7.x series.
This commit makes the check similar to how it was in 1.6.x and
earlier.
Change-Id: Ia4990e210d4cd455744586c046490c0f3f56853a
Reviewed-on: https://gerrit.openafs.org/15524
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
The Linux 6.5 commit:
panic: make function declarations visible (d9cdb43189)
added a declaration for abort into panic.h.
When building the Linux kernel module, the build fails with the
following:
src/crypto/hcrypto/kernel/config.h:95:20: error: static declaration of
‘abort’ follows non-static declaration
95 | static_inline void abort(void) {osi_Panic("hckernel aborting\n"
);}
| ^~~~~
...
from ./include/linux/wait.h:9,
from /openafs/src/afs/sysincludes.h:118,
from /openafs/src/crypto/hcrypto/kernel/config.h:30:
./include/linux/panic.h:36:6: note: previous declaration of ‘abort’
with type ‘void(void)’
36 | void abort(void);
| ^~~~~
Update the declaration in hcrypto/kernel/config.h to change the function
name from abort to _afscrypto_abort and use a preprocessor define to
map abort to _afscrypto_abort.
Change-Id: I72a4d01b12c9d944481569748f129002b6a8c8fc
Reviewed-on: https://gerrit.openafs.org/15501
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: Benjamin Kaduk <kaduk@mit.edu>
The linux 6.5 commit:
"sysctl: Remove register_sysctl_table" (b8cbc0855a)
removed the Linux function register_sysctl_table(). The replacement
function is register_sysctl(), which offers a simpler interface.
Add an autoconf test for the Linux function register_sysctl and add a
call to register_sysctl when available.
Notes:
The Linux function register_sysctl was added in Linux 3.3 with the
commit:
'sysctl: Add register_sysctl for normal sysctl users' (fea478d410)
with a note that it is a simpler interface.
The function register_sysctl_table was marked as deprecated with the
Linux 6.3 commit:
'proc_sysctl: enhance documentation' (1dc8689e4c)
Change-Id: I657bee103cd1f4ab2e60b3f3471c3fc12d8b6be4
Reviewed-on: https://gerrit.openafs.org/15500
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Some of the comments refer to names from an earlier implementation.
Update them to match the current implementation.
No functional change is incurred by this commit.
Change-Id: I9d16477abbfceceaf6b79d6781542d9c6df3468f
Reviewed-on: https://gerrit.openafs.org/15414
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Commit f2003ed68c (gcc: Avoid false positive use-after-free in crypto)
added a configure check to detect whether the compiler we're using
exhibits the use-after-free warning bug. We add -O0 to CFLAGS for the
test to make sure the bug triggers for gcc 12, but if the user has
specified, for example, CFLAGS=-D_FORTIFY_SOURCE=1, this causes
the compile check to always fail:
/usr/include/features.h:330:4: error: #warning _FORTIFY_SOURCE requires compiling with optimization (-O) [-Werror=cpp]
# warning _FORTIFY_SOURCE requires compiling with optimization (-O)
This causes _OPENAFS_UAF_COMPILE_IFELSE to always fail, and so we
throw an AC_MSG_ERROR during configure.
To allow the build to continue with _FORTIFY_SOURCE, make sure
_FORTIFY_SOURCE is undefined for this specific test. The compile test
can then succeed with -O0 (unless we trigger the use-after-free bug,
of course).
Change-Id: Ib0d21b07b1e7892a6d411ee0c34a4ff7baf75cc3
Reviewed-on: https://gerrit.openafs.org/15499
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
The Linux 6.5 commit:
'splice: Remove generic_file_splice_read()' (c6585011bc)
replaces the function generic_file_splice_read() with the function
filemap_splice_read().
The Linux function 'filemap_splice_read()' was introduced with the
Linux 6.3 commits:
'splice: Add a func to do a splice from a buffered file without
ITER_PIPE' (07073eb01c)
'splice: Export filemap/direct_splice_read()' (7c8e01ebf2)
With updates in Linux 6.5:
'splice: Fix filemap_splice_read() to use the correct inode'
(c37222082f) -- which fixes a problem in the code.
'splice: Make filemap_splice_read() check s_maxbytes' (83aeff881e)
Due to the fact that there could be problems with splice support prior
to Linux 6.5 (where filemap_splice_read()'s use was expanded to
additional filesystems other than just cifs), we only want to use
'filemap_splice_read()' in Linux 6.5 and later.
The LINUX/osi_vnodeops.c file is updated to use 'filemap_splice_read()',
for Linux 6.5 and later, for the splice_read member of the
file_operations structure.
Change-Id: Ia07c66ddc5d4f836f230b5d03567803e124b93b8
Reviewed-on: https://gerrit.openafs.org/15486
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
The pod2man tool determines a man page title (set in the .TH macro) from
the input filename, unless the -n (--name) option is specified.
Our AFS::ukernel man page input file is named AFS.ukernel.pod to avoid
colons in the filename (since colon characters are not supported on
Windows), so the generated man page contains the title "AFS.ukernel"
instead of "AFS::ukernel".
Use the pod2man -n (--name) option when converting section 3 man pages
to override the automatic title naming. This fixes the .TH macro in the
generated AFS::ukernel.3 file. Fortunately, the -n (--name) option is
only needed for section 3 man pages.
Specifying the pod2man -n (--name) option is simpler and less invasive than
renaming pod3/AFS.ukernel.pod to pod3/lib/AFS/ukernel.pod (which would
also fix the embedded title).
Change-Id: I495ea2d30ce1b34698519ffa34a39362c449ba09
Reviewed-on: https://gerrit.openafs.org/15363
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
The test programs in bucoord, butc, and butm have bit rotted and no
longer build. Update the code to fix error and warnings, and remove
dead code. Update the relevant makefiles so these test programs build
by default to avoid bitrot in the future.
With this change the test programs will build and run, but a tape
drive device is required to perform testing.
Change-Id: I978b98fe63fe6051409fc160e8c7d43050bcf707
Reviewed-on: https://gerrit.openafs.org/15098
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Due to a bug in gcc-12 and gcc-13, several warnings are generated for a
use-after-free in crypto.c, which leads to a build failure with
--enable-checking:
src/external/heimdal/krb5/crypto.c:1157:9: error: pointer ‘p’ may be
used after ‘realloc’ [-Werror=use-after-free]
1157 | free(p);
| ^~~~~~~
src/external/heimdal/krb5/crypto.c:1155:20: note: call to ‘realloc’
here
1155 | result->data = realloc(p, sz);
| ^~~~~~~~~~~~~~
However, reviewing the code around these warnings shows that the
use-after-free warnings are incorrectly generated (false positive). The
documentation for realloc states that realloc will return a NULL and not
alter the storage passed if there was an error allocating and the size
passed is non-zero.
There is a possible work-around for the false positive. One can use a
variable that is not a member of a structure to hold and test the value
returned from realloc, then update the structure member from that
variable.
However, the code that is producing the message is in a heimdal external
file, so we cannot modify the source. So just use the compiler flag
-Wno-use-after-free to avoid the warning/error.
Update configure to add tests for the -Wno-use-after-free flag, update
the Makefile to add the flag for CFLAGS.crypto.lo, and update CODING
for the new exception.
Because this is an important check, only disable the warning if the
compiler exhibits this specific bug. We do this by adding specific
configure tests for the compiler bug and conditionally set a CFLAG
variable if the bug is present.
NOTE: The false positive and work-around can be demonstrated with the
following code using gcc-12 (with -O0) or gcc-13 (not sensitive to the
optimization level):
somestruct->somepointer = realloc(ptr, somesize);
if (somestruct->somepointer == NULL && somesize != 0) {
free(ptr); << gets flagged as use-after-free
handle enomem...
}
However the following doesn't get flagged:
char *tmpptr = realloc(ptr, somesize);
if (tmpptr == NULL && somesize != 0) {
free(ptr);
handle enomem...
}
somestruct->somepointer = tmpptr;
The GCC ticket https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110501
has been marked as confirmed.
Change-Id: I28566354da4199389964210f954cda1213098088
Reviewed-on: https://gerrit.openafs.org/15471
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Since the original IBM code import, attach2 has set the volume's index
bitmaps to NULL in preparation for allocating and initalizing new
bitmaps. However, the volume may already have bitmaps from previous
operations, and this is much more likely with DAFS. In this case, the
old bitmaps are leaked.
Instead, free any existing bitmap before allocating a new one.
Discovered via Solaris libumem.so.1.
Change-Id: I77e42ccd36cedead060974a4fe99e37b4f262093
Reviewed-on: https://gerrit.openafs.org/15428
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
If the Linux kernel has CONFIG_WERROR enabled, and openafs is configured
with --disable-checking, the libafs kernel module fails to build:
/src/libafs/MODLOAD-5.14.0-305.el9.x86_64-MP/evp.c:501:9:
error: cast between incompatible function types from
‘void (*)(void *)’ to ‘int (*)(EVP_MD_CTX *)’ {aka
‘int (*)(struct hc_EVP_MD_CTX *)’} [-Werror=cast-function-type]
501 | (hc_evp_md_init)null_Init,
| ^
The openafs commit:
Linux-5.17: Kernel build uses -Wcast-function-type (6bdfa97673)
fixed above error when the Linux kernel has CONFIG_WERROR enabled and
openafs is configured with --enable-checking. But we will still fail
when CONFIG_WERROR is enabled and openafs is configured with
--disable-checking (which is the default).
Update osconf.m4 to always set CFLAGS_NOCAST_FUNCTION_TYPE, so that it
can be used to avoid the above compiler check even when checking is
disabled.
NOTE: The only use of CFLAGS_NOCAST_FUNCTION_TYPE is to correct the
warnings flagged in external/heimdal/hcrypto/evp.c and evp-algs.c.
NOTE: --enable-checking=all can be used to bypass setting the define.
Change-Id: I39824f38cfbcf045f59744de5109a99fd4ad041a
Reviewed-on: https://gerrit.openafs.org/15417
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: Benjamin Kaduk <kaduk@mit.edu>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
The Linux kernel module build will issue a warning when a stack frame
exceeds a specific size via the -Wframe-larger-than= compiler flag (with
a default size of 2048 bytes on most architectures).
At least one distribution, Oracle's Linux with the Unbreakable
Enterprise Kernel (UEK), hardens this check by changing the warning to
an error (-Werror=frame-larger-than=).
Several of the openafs autoconf tests use objects that are allocated on
the stack when testing for structures, or members of a structure.
When the warning is changed to an error, configure fails in several
locations when testing against Linux's task_struct structure, which
exceeds 2048 bytes in size.
openafs/conftest.dir/conftest.c:72:1: error: the frame size of 9984
bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
Update the autoconf Linux tests that allocate structures to use a
statically allocated structure instead of one allocated on the stack.
Change-Id: Ica44373b02fb4efa81d8fcbc0337aea9719b4fb6
Reviewed-on: https://gerrit.openafs.org/15477
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
The Gentoo distribution changed the default fortify_source setting for
clang (to match the default being used for gcc). This change causes the
following build error:
src/dir/salvage.c:26:9: error: 'printf' macro redefined
[-Werror,-Wmacro-redefined]
^
/usr/include/bits/stdio2.h:89:11: note: previous definition is here
^
1 error generated.
If the fortify_source level is greater than 1, glibc can define printf
as a macro. The clang compiler has a default check for macro
redefinitions (-Wmacro-redefined), while gcc does not provide this
option.
Remove the:
#define printf Log
in src/dir/salvage.c and update callers to use Log() instead of
printf().
Clean up lines that split the call to Log and its first parameter. Fix
the indentation for Log's parameters.
There are no functional changes with this commit.
Change-Id: I732c8345dfd4cf140883bff4b53df9ab9bea5a85
Reviewed-on: https://gerrit.openafs.org/15462
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
By default, the makesrpm.pl script downloads a CellServDB file from
grand.central.org. The URL of the file to be downloaded is specified in
the openafs.spec.in file from the source distribution archive. Network
access can be avoided by specifying a local file to be packaged with the
'cellservdb' positional argument. Due to implementation limitations of
the makesrpm.pl script, the specified file must be in current working
directory and the filename must the same as the CellServDB URL basename
in the openafs.spec.in file (currently 'CellServDB.2022-05-09').
This change adds support for building packages with an alternative
CellServDB, and adds support for specifying a local CellServDB file by
path.
Add the --cellservdb-url option to override the URL used to download the
CellServDB file when the local file is not specified.
Update the sed commands used to generate the openafs.spec file to
update the CellServDB source directive (Source20) when an alternate
URL is given with the --cellservdb-url.
Replace the shell pipe line to extract the CellServDB URL from the
openafs.spec.in file with perl code. Search for the 'Source20' source
directive instead of searching for the string 'dl/cellservdb' to find
the CellServDB source URL. (After this commit, the source directive
number of the CellServDB source should not be changed without also
changing the number in makesrpm.pl).
Change the handling of the 'cellservdb' positional argument so the local
file is copied to required name in the temporary SOURCES directory. This
also allows us to specify the local file by path, instead of requiring
the local CellServDB file to be in the current working directory when
running makesrpm.pl.
The --cellservdb-url option and the 'cellservdb' argument are not
exclusive. If both are specified, the URL will be set in the generated
openafs.spec file to indicate the upstream location of the packaged
CellServDB file provided by the 'cellservdb' positional argument.
Change-Id: If8818bcdee8500a164f992f42df3c4f9a587d0cd
Reviewed-on: https://gerrit.openafs.org/15405
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
It looks that EndTimestamp holds the user’s token expiration time even after
function afs_MarkUserExpired() gets called from afs_pag_destroy(). So it
seems that the PAGs/tokens are not being reclaimed. This will lead to pag
buildup on the system.
The PAG/'struct unixuser' is not being freed because afs_GCUserData() checks
afs_HasUsableTokens(), which checks EndTimestamp, which says the tokens still
haven't expired. So the PAG doesn't get freed.
This behavior has existed since commit ba1d050c6e (Make unixuser's vid
advisory).
To fix this, change afs_GCuserData() to check for the UHasTokens flag, and
ignore any tokens if UHasTokens isn't set. This causes the PAG to be freed,
since we don't look at the EndTimestamp of the tokens at all.
Thanks Todd DeSantis for your support.
Change-Id: Ic46e2487b1971aed9ca6b6e5aa5e5d0b2fd9c31e
Signed-off-by: Kailas Zadbuke <kailashsz@in.ibm.com>
Reviewed-on: https://gerrit.openafs.org/15404
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Since its introduction with commit 7e4e06b87a "Derive DES/fcrypt
session key from other key types", rxkad_derive_des_key has failed to
free the memory associated with its HMAC context struct.
This results in a leak of at least 352 bytes for each rxkad challenge
response processed by an OpenAFS server when using rxkad-kdf.
Free the memory by calling HMAC_CTX_cleanup after each round of the
loop.
Discovered via Solaris libumem.so.1.
Change-Id: Ic9f130f7229c70e4eaa68ba1f0a213b4b23229cc
Reviewed-on: https://gerrit.openafs.org/15427
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Our 'fs flush' and related commands (flushall, flushvolume) clear the
relevant entries in the OpenAFS stat cache and data cache, which can
fix problems if the cache ever becomes incorrect for any reason. (This
can happen after bugs, repairing corrupted volumes, disaster recovery
scenarios, and similar edge cases.)
However, on Linux, these commands don't affect the VFS dentry cache.
If someone needs to use an 'fs flush' command to fix a problem, this
will fix the OpenAFS cache, but the Linux dcache can still be wrong.
The only way to manually flush dcache entries is to use the global
'drop_caches' mechanism, which is a very heavweight operation, only
accessible to root.
For example:
$ ls -l
ls: cannot access foo.1: No such file or directory
total 2
drwxrwxr-x. 2 bin adeason 2048 Apr 6 14:20 dir
-?????????? ? ? ? ? ? foo.1
$ fs flush .
$ ls -l
ls: cannot access foo.1: No such file or directory
total 2
drwxrwxr-x. 2 bin adeason 2048 Apr 6 14:20 dir
-?????????? ? ? ? ? ? foo.1
$ sudo sysctl -q -w vm.drop_caches=3
$ ls -l
total 3
drwxrwxr-x. 2 bin adeason 2048 Apr 6 14:20 dir
-rw-rw-r--. 1 bin adeason 29 Sep 22 2022 foo.1
To make the 'fs flush' commands be effective in more situations,
change afs_ResetVCache() to also invalidate the dcache entries
associated with each vcache we reset. To make things simpler and
reduce locking complexity, do this by setting d_time to 0, and don't
directly run dcache-managing functions like d_invalidate or d_drop,
etc.
The above example now becomes:
$ ls -l
ls: cannot access foo.1: No such file or directory
total 2
drwxrwxr-x. 2 bin adeason 2048 Apr 6 14:20 dir
-?????????? ? ? ? ? ? foo.1
$ fs flush .
$ ls -l
total 3
drwxrwxr-x. 2 bin adeason 2048 Apr 6 14:20 dir
-rw-rw-r--. 1 bin adeason 29 Sep 22 2022 foo.1
Change-Id: Ic95ef6137f5b671baf68a4890ef1562f029af794
Reviewed-on: https://gerrit.openafs.org/15391
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
On Linux, we can define a .getattr callback for files and directories,
which is called when a caller requests metadata for the file, such as
during lstat(). For regular files and directories, we set this to
afs_linux_getattr(), which updates the metadata for the file in
question if it's out of date (CStatd not set).
For symlinks, however, we don't set .getattr at all. This would seem
to allow symlink metadata to become stale if another client changes
it, but the metadata often happens to stay up to date via other means.
For example, we can see the following happen:
- Another client changes, for example, the owner of a symlink we have
cached.
- The fileserver sends us a callback break, and so we clear CStatd for
the vcache for the symlink.
- lstat() is called for the symlink, which causes
afs_linux_dentry_revalidate() to be called for the cached dentry.
Since CStatd is not set for the vcache, we verify the entry by
calling afs_lookup(), and then update the symlink's metadata via
afs_getattr() and vattr2inode().
However, if CStatd _is_ set for the symlink when lstat() is called,
afs_linux_dentry_revalidate() will not update anything, and will just
return success. So, if we manage to set CStatd for the symlink without
updating it's Linux VFS metadata, the Linux metadata won't be updated,
and we'll report the old metadata to the caller of lstat().
We can set CStatd without updating the Linux VFS info in a few
different ways. A few pioctls such as PRemoveMount or PFlushMount can
do this if they encounter an error, but the only code paths that call
these pioctls in OpenAFS (via the 'fs' utility) also lstat() the
relevant path, so this doesn't happen in practice.
A more common way that this can occur is via afs_DoBulkStat(). If
userspace triggers a bulkstat that includes the symlink in question,
the symlink would gain the CStatd flag without any interaction with
the Linux VFS.
For example, say a symlink was chown'd from 'adeason' to 'root'. On
another client with the symlink cached, running 'ls -l' on the symlink
itself would show the updated owner, because
afs_linux_dentry_revalidate() updates the metadata:
$ ls -l dir.slink
lrwxr-xr-x. 1 root root 3 May 5 14:48 dir.slink -> dir
But if we 'ls -l' the entire directory, which contains other entries,
we will bulkstat many of the entries, possibly including the symlink.
And so we may see the old metadata:
$ ls -l
total 9
[...]
lrwxr-xr-x. 1 adeason root 3 May 5 14:48 dir.slink -> dir
Triggering this behavior requires a bulkstat to be triggered before we
access the symlink itself, and so triggering this behavior depends on
the order of the entries in the directory as well as whether the other
items in the dir are cached. As such, triggering this behavior during
normal operation tends to be inconsistent and confusing.
The only lstat() info for symlinks that can change like this is the
owner, group, and modtime; mode bits cannot change, and neither can
the length/size (or the contents in general). So, stale metadata tends
to not be very noticeable.
To fix all of this, set .getattr to afs_linux_getattr() for symlinks,
just like we do for regular files and directories. This ensures that
we will update the Linux VFS metadata for the symlink when it is
requested, so we won't return stale metadata to callers.
This behavior appears to have existed for symlinks on Linux for quite
a while, possibly since our Linux 2.6 support was added. The behavoir
may have been introduced around commit b860b359d5
(initial-linux24-support-20001105) or commit 0054374495
(linux22-fix-20040405). Before those commits, we defined a .revalidate
callback for symlinks, which was called on older Linux versions before
fetching file metadata, and so probably ensured that an lstat() on a
symlink returned properly updated info.
Change-Id: I90d751e3175e24e9f9409cea6c079b097c78b51b
Reviewed-on: https://gerrit.openafs.org/15423
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Each time the bosserver starts, it checks for the presence of the client
configuration directory and the CellServDB and ThisCell files within it.
When not found, the bosserver creates the client cell configuration
directory. When the CellServDB and ThisCell files are not present in the
client configuration directory, the bosserver creates symlinks to the
server's CellServDB and ThisCell files. This feature of the bosserver
was a convenience when older versions of vos and pts required a client
configuration directory.
However, modern deployments tend to use packaging, with separate client
and server packages. The client configuration directory and files in it
are typically managed by the client packaging. Currently, packagers
must work around these symlinks created by the bosserver. Additionally,
the CellServDB and ThisCell symlinks are hazardous since writing to the
client-side configuration will overwrite the server configuration.
This commit removes the creation the client configuration directory and
the CellServDB and ThisCell symlinks during bosserver startup. This
change is intended to decouple the server from the client, help to avoid
overwriting the server configuration, and avoid requiring client artifacts
on a server.
Change-Id: I8aba81185a52e6792d36ae28a430ebe1c23919d2
Reviewed-on: https://gerrit.openafs.org/12586
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Add an initialization retry in the bos, vos, and pts commands to
fallback to the server configuration directory when initialization fails
with the client configuration directory. This allows admins to run
unauthenticated bos, vos, and pts commands on servers without a client
configuration (including symlinks created by the bosserver) without
any extra command line options.
Perform the initialization retry only when the -localauth or -config
options are not given. The bos, vos, and pts commands already use the
server configuration path when the -localauth option is given, so there
is no point in retrying the same path. The vos and pts -config option
specifies the path to be used, so we do not fallback to a different
directory when the user specifies the configuration path to be used.
While here, change the scope of the confdir variable in vos.c from a
global to a local variable, since it is only used within the
MyBeforeProc() function.
This change does not add a vsu_ClientInit() retry in the bos salvage
command. That command always requires authorization, so when run without
-localauth requires a token (and therefore a cache manager and client
cell configuration).
Update the bos, vos, and pts man pages to describe this new fallback
method to lookup the configuration directory. (The AFSCONF environment
variable and .AFSCONF files are currently undocumented in the man pages.
They should be documented or removed from the code in a future change.)
Change-Id: I55c3109494db744e7bc2defcb54eaee3b4e30018
Reviewed-on: https://gerrit.openafs.org/15351
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
During shutdown, the function shutdown_osisleep is called in
src/afs/afs_osi.c. The body of this function is platform-
specific, and on AIX there is a call to xmfree:
xmfree(tmp);
However, on AIX, xmfree actually takes two arguments:
int xmfree ( ptr, heap)
caddr_t ptr;
caddr_t heap;
This is called elsewhere in the OpenAFS code correctly for
kernel_heap. In src/afs/AIX/osi_sleep.c we start using the
pinned_heap, but never xmfree it. Therefore, we need to do so here
during the shutdown. Here we include a header file which defines
pinned_heap, and then supply it as an argument to xmfree. This
prevents a kernel panic during OS shutdown.
The panic might go unnoticed in many environments, as during a
reboot, the system will normally dump and restart anyway. However,
if kdb is loaded (bosboot -a -D), the system will break into the
debugger before the full shutdown procedure completes. The stack
trace shows the following:
KDB(0)> stack
pvthread+01D200 STACK:
[00023900]abend_trap+000000 ()
[000EFF24]xmfree_frontend+0000A4 (??, ??, ??)
[F1000000C064CF1C]shutdown_osisleep@AF72_5+0000BC (??)
[F1000000C064CB7C]shutdown_osi+00001C ()
[F1000000C064A51C]afs_shutdown+0003BC (0000000100000001)
[F1000000C05A8DD4]afs_unmount+000094 (F1000A01501D4C10, 0000000000000000)
[F1000000C059FCE0]vfs_unmount+0000A0 (F1000A01501D4C10, 0000000000000000,
F1000A015047B07C)
[00014D70].hkey_legacy_gate+00004C ()
[006A6AAC]vfs_unmount+00008C (??, ??, ??)
[006B4228]kunmount+000228 (??, ??, ??, ??)
[006B4944]uvmount+000204 (??, ??)
[00003954]syscall+00024C ()
[100084FC]helper_UMfunc+00027C (??, ??)
[10003D48]dounmount+0000C8 (??, ??, ??, ??)
[100044DC]umountmain+0001BC (??, ??)
[10000AD4]main+0000B4 (??, ??)
[10000168]__start+000068 ()
Change-Id: I3131dc7d184a299e7d7806ce8a0c3d890b5c3624
Reviewed-on: https://gerrit.openafs.org/15419
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Addresses a situation where a write in AFS space can cause a kernel
panic. In src/afs/afs_osi_uio.c in the function afsio_partialcopy:
size_t space_len = sizeof(struct uio) +
sizeof(struct iovec) * AFS_MAXIOVCNT;
/* Allocate a block that can contain both the UIO and the iovec */
space = osi_AllocSmallSpace(space_len);
On newer AIX systems (since at least 6.1), space_len is larger than
AFS_SMALLOCSIZ. When osi_AllocSmallSpace is called, the following
test in src/afs/afs_osi_alloc.c causes a kernel panic:
if (size > AFS_SMALLOCSIZ)
osi_Panic("osi_AllocSmallS: size=%d\n", (int)size);
This is due to the following definition in src/config/afs_args.h:
/*
* Note that the AFS_*ALLOCSIZ values should be multiples of sizeof(void*) to
* accomodate pointer alignment.
*/
/* Used in rx.c as well as afs directory. */
#if defined(AFS_AIX32_ENV) || defined(AFS_HPUX_ENV)
/* XXX Because of rxkad_cprivate... XXX */
#define AFS_SMALLOCSIZ (38*sizeof(void *)) /* "Small" allocated size */
#else
#define AFS_SMALLOCSIZ (64*sizeof(void *)) /* "Small" allocated size */
#endif
All the supported AIX platforms define AFS_AIX32_ENV in
src/config/param.rs_aixXX.h, where XX is the AIX version. Therefore,
all the AIX platforms end up with AFS_SMALLOCSIZ = 152 bytes instead
of 256. To resolve this, we will modify the preprocessor test to use
the second case for AIX versions greater than 6.1. This issue may be
present on earlier releases of AIX as well, but AIX 5.3 and older
test systems were not available at this time.
Also, a spelling error in the comment was fixed.
Change-Id: Id6b094e28e1be1cfb71296c5c31e60592ba91395
Reviewed-on: https://gerrit.openafs.org/15418
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Commit 00fd29251e audit-prototypes-20090315 wrapped audmakebuf with an
ifdef for AFS_AIX32_ENV. However, this made redundant a pre-existing
ifdef for the same symbol within audmakebuf.
Remove the redundant ifdef. While here, add a concluding comment to the
superseding endif.
No functional change is incurred by this commit.
Change-Id: I5b58538c0371ad820f6de662b8367e6e3aee1e66
Reviewed-on: https://gerrit.openafs.org/15413
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
In the original IBM code import, auditing was for AIX and fileservers
only. The /usr/afs/local/Audit configuration file supported two
directives:
AFS_AUDIT_AllEvents enabled writing to AIX audit subsystem
Echo_Trail enabled echoing to stdout (osi_echo_trail=1)
However, even in the IBM AFS 3.6 Administrator Guide, only
AFS_AUDIT_AllEvents was documented; Echo_Trail was undocumented.
Later, commit 16d67791dc auditlogs-for-everyone-20050702 extended
auditing to more platforms and servers, but the Audit file support
remained unchanged.
Later, commit 7b0b6a0ce9 "Enhance audit logs to support SysV message
queues" removed the last effective use of osi_echo_trail.
Remove the vestigial support for echoing the audit records to stdout.
Change-Id: I2c2d5d0a76c5a3a5d8560c93e2587ab6d057c8ed
Reviewed-on: https://gerrit.openafs.org/15412
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
The msghdr structure is used as a parameter to the Linux kernel
functions kernel_sendmsg() and kernel_recvmsg(). Some required fields
need to be set prior to calling these functions, but there are also
additional structure members that may not be used by the calling code.
Some of these fields may be initialized by the Linux kernel functions
being used, but there may be some that are left uninitialized.
To ensure that all fields in the msghdr structure are cleared, use
memset to zero the entire structure. This will eliminate the need to set
individual fields to 0 or NULL.
Change-Id: I7cdc7539084d13efedbb20e7ebd5c242693a628c
Reviewed-on: https://gerrit.openafs.org/15409
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Since the initial Linux commit (2.6.12-rc2) the Linux function
kernel_recvmsg() handles the initialization of the msghdr members
related to the iov (msghdr.msg_iter and for earlier kernels
msghdr.msg_iov and msghdr.msg_iovlen).
The code in osi_NetReceive for Linux unnecessarily initializes these
structure members prior to calling kernel_recvmsg().
Remove the unnecessary code from osi_NetReceive along with the
surrounding preprocessor check for STRUCT_MSGHDR_HAS_MSG_ITER. Since
this is the only location that uses this define, also remove the
associated autoconf test.
NOTE: This was discovered while investigating changes needed for
Linux 6.4, due to: "iov_iter: add iter_iovec() helper" (de4f5fed3f)
which renamed the Linux structure iter_iovec's member iov to __iov.
Since the openafs code that was affected by the Linux 6.4 change is
being removed as unnecessary, this commit is not Linux 6.4 specific,
but effects all versions of Linux.
Change-Id: Id9931728ad96b07e5e1c3472d9689fe6dd2905f9
Reviewed-on: https://gerrit.openafs.org/15408
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
The maxsize argument in xdr_string() is garbage when called by
xdr_free(), since xdr_free() only passes the XDR handle and the xdr
string to be freed. Sometimes the size check fails and xdr_string()
returns early, without freeing the string and without setting the object
pointer to NULL.
Usually this just results in leaking the string's memory. But since
commit 9ae5b599c7 (bos: Let xdr allocate rpc output strings), many
callers in bos.c rely on xdr_free(xdr_string) to set the given string
to NULL; if this doesn't happen, subsequent calls to BOZO_ RPCs can
corrupt memory, often causing the 'bos' process to segfault.
We only need the maxsize check when encoding or decoding, so avoid
accessing the maxsize agument when the op mode is XDR_FREE.
In general, xdr_free() can only safely be used on xdr 2-argument xdr
functions, so must be avoided when freeing xdr opaque, byte, and union
types.
This change makes it safe to use xdr_free() to free xdr strings, but in
the future, we should provide a typesafe and less fragile function for
freeing xdr strings returned from RPCs. Currently, xdr_free(xdr_string)
is only called by the bos client and the tests.
Change-Id: I2bbd6cfe48b77f1fd826207b69f50df9c0158a10
Reviewed-on: https://gerrit.openafs.org/15343
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
We have a couple of places in the code that iterate over the dentry
aliases of an inode, and each of these involves a small #ifdef ladder
to handle slightly different ways of traversing the relevant list.
Split this logic into its own compatibility macro,
afs_d_alias_foreach[_reverse], to contain this ugliness in
osi_compat.h and make the callers more readable.
This commit should incur no functional change; it is just code
reorganization.
Change-Id: I0babe22fffb23e100af5cdd50f38ab2b8f3ee2c6
Reviewed-on: https://gerrit.openafs.org/15390
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Through a series of Linux 6.3 commits starting with:
'f2fs: project ids aren't idmapped' (64b4cdf22f) and ending with
'fs: move mnt_idmap' (3707d84c13)
the inode operations functions were changed to take a mnt_idmap
structure instead of a user_namespace structure. These were pulled in
via the the merge commit:
'Pull vfs idmapping updates from Christian Brauner' (05e6295f7b)
The commit message for the merge contains background and overall
information for this conversion.
The above change simply changes the functions to use a different
structure (mnt_idmap instead of user_namespace). For afs, it is a
simple change to swap the structures. But for some of the Linux calls
(generic_fillattr(), setattr_prepare(), and inode_op->setattr()) we
would like to use the Linux symbol nop_mnt_idmap, but that symbol is
exported as GPL only. Instead, we will obtain its address from the
current task structure at libafs initialization (much the same way as
obtaining current user namespace for afs_ns).
Add autoconf tests to determine if inode_operations.create() uses the
mnt_idmap structure. If so set a generic configure define for
"IOP_TAKES_MNT_IDMAP".
Update afs's inode operations functions to take and use a mnt_idmap
instead of a user_namespace.
At libafs initialization, obtain the mnt_idmap from the current task
and save it as an afs global variable, afs_mnt_idmap, to be used where
needed.
Change-Id: Ie48b3941860998ddc7f54be48959fc62119bf48e
Reviewed-on: https://gerrit.openafs.org/15347
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Linux 6.3 commit
'filelock: move file locking definitions to separate header file'
(5970e15dbc)
relocated file lock related declarations from 'linux/fs.h' into its own
header file, 'linux/filelock.h'.
Add autoconf tests to check for the header file 'linux/filelock.h' and
update function checks for locks_lock_file_wait() and posix_lock_file().
Update osi_compat.h to include linux/filelock.h if it is available.
Change-Id: If209730ad8483d15ad34b2242b098e7e1b352d9d
Reviewed-on: https://gerrit.openafs.org/15346
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
The code for the butc_test and ttest programs look to be developer
scaffolding for an earlier version of the backup server. Those programs
use a version of the TC_PerformDump RPC which pre-dates OpenAFS, so the
butc_test and ttest have never been buildable in OpenAFS.
Remove this obsolete scaffolding code since it is not maintained and
trigger errors when scanned with static analyzers.
Change-Id: Ia3d0c8d3758e1ed41d285beafa64c326dc11658d
Reviewed-on: https://gerrit.openafs.org/15097
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
On AIX 7.2 and higher, /usr/include/net/netisr.h includes a header
<sys/libsysp.h>. This is an internal kernel library header that IBM
does not ship. The include of sys/libsysp.h is new in AIX 7.2. The
only information we need from net/netisr.h are the definitions for
NETISR_MAX and NET_KPROC. Using the macro AFS_AIX72_ENV, defined in
src/config/param.rs_aix72.h and param.rs_aix73.h, we can just
provide these definitions directly and avoid the include of
net/netisr.h. As part of this update we also add the macro
AFS_AIX72_ENV to src/config/param.rs_aix72.h as it was missing there
but is required for the edit to src/rx/AIX/rx_knet.c to work. On
earlier versions of AIX, we will continue to include net/netisr.h. A
case has been opened with IBM and they acknowledge the problem and
are working on a solution. However, we still need to be able to deal
with the situation as shipped from IBM. A future AIX 7.4 will likely
be able to include net/netisr.h again.
Change-Id: I9a1cbe02557cee9343d7f1a016f0223ba14076fa
Reviewed-on: https://gerrit.openafs.org/15312
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Update the build rule for compile_et under new platform rs_aix73.
Change-Id: I29d512c20f259935632f5aba8ff6ee28ed42b37f
Reviewed-on: https://gerrit.openafs.org/15311
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Enable the configure script to detect and configure AIX 7.3.
Update INSTALL to reflect new platform.
Change-Id: I8efe14731f7ba3a6cf65a913a2303181567ac3f1
Reviewed-on: https://gerrit.openafs.org/15310
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>