openafs

mirror of https://git.openafs.org/openafs.git synced 2025-01-18 23:10:58 +00:00

Author	SHA1	Message	Date
Indira Sawant	e4fda3481d	util: Clear owner when unlocking recursive mutex A race condition where the pthread_recursive_mutex_t::owner that is maintained by AFS doesn’t match with the thread that is trying to unlock. This leads to AFS file server and ptserver crash due to assertion failure where it was trying to unlock the grmutex. We saw the race more often when our customer migrated their machines from Power8 to Power9 systems and increased the SMT value from 2 to 4. fileserver Assertion failed! file keys.c, line 911. ptserver Assertion failed! file userok.c, line 78. File: keys.c 889 int 890 afsconf_GetKeyByTypes(struct afsconf_dir dir, afsconf_keyType type, 891 int kvno, int subType,struct afsconf_typedKey key) 892 { 893 int code = 0; 894 struct subTypeList subTypeEntry; 895 896 LOCK_GLOBAL_MUTEX; 897 … 910 out: 911 UNLOCK_GLOBAL_MUTEX; <<<< 912 return code; 913 } Consider a following situation, cpu0 , cpu1 and T0, T1 and T2 are the cpus and timestamps respectively, T0: thread1 locks grmutex performs some operations and unlocks the same, thus has itself set as pthread_recursive_mutex_t::owner. Since presently we do not reset it, thus, pthread_recursive_mutex_t::owner = thread0. T1: thread0 starts on cpu0. T2: thread1 starts on cpu1. T3: thread0 tries to lock AFS grmutex and acquires corresponding pthread_mutex, now before thread0 updates pthread_recursive_mutex_t::owner, a context switch happens. T3: thread1 on cpu1 tries to acquire grmutex and sees itself as the pthread_recursive_mutex_t::owner, possibly as it was not reset and updated yet. So thread1 thinks itself as the owner and proceeds. T4: thread0 updates the pthread_recursive_mutex_t::owner this time it is also synced across the cpu caches. T5: thread1 tries to unlock the grmutex and crashes because now it’s not the owner of the mutex. Debugging: We implemented a circular log to store certain values related to grmutex which helped in debugging us this further. ({ \ time_t t; \ time(&t); \ LOG_EVENT("%s: Unlocking TID %u: %s:%d owner %lu " \ "locked %d pthread_self %u times_inside %d\n", \ ctime(&t), (unsigned)grmutex.mut.__data.__owner,\ __func__ , __LINE__, \ grmutex.owner, grmutex.locked, (unsigned)pthread_self(), \ grmutex.times_inside); \ opr_Verify(pthread_recursive_mutex_unlock(&grmutex)==0); \ }) $614 = "Mon Sep 11 19:35:34 2023\n: Locking TID 136896: afsconf_GetKeyByTypes:896 owner 140735030161776 locked 1 pthread_self 2305880432 times_inside 1\n\000 2\n", $615 = "Mon Sep 11 19:35:34 2023\n: Unlocking TID 136896: afsconf_IsLocalRealmMatch:602 owner 140735030161776 locked 1 pthread_self 1836773744 times_inside 2\n", $617 = "Mon Sep 11 19:35:34 2023\n: Unlocking TID 136896: afsconf_GetKeyByTypes:911 owner 140735030161776 locked 1 pthread_self 2305880432 times_inside 1\n\000\061\n", Solution: This problem was resolved after resetting thread_recursive_mutex_t::owner in global mutex unlock function. Thanks to Todd DeSantis for helping with debugging, review and verification of this problem. Change-Id: Ibe01518094388080a143e31c70ab7ce0ddfca702 Signed-off-by: Indira Sawant <indira.sawant@ibm.com> Reviewed-on: https://gerrit.openafs.org/15604 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2023-12-28 18:05:40 -05:00
Benjamin Kaduk	d6880c7763	build: clean up some more generated files Commit `c1d39153da` added a cc-wrapper script to help generate CTF data for all objects, which is generated by configure. Remove it in the distclean target so we don't leave a stray file hanging around. Commit `48ce41a447` added a pkgbuild.sh script, to be used to build packages for newer versions of macOS, supplementing the buildpkg.sh that was previously used. Clean it up in distclean as well (buildpkg.sh was already listed). Also clean up AFS_component_version_number.c in libuafs/Makefile.common.in. These issues were detected while resolving https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1048470 . Change-Id: I0fea00924a998ee316d48cf76103ea9871a34c6b Reviewed-on: https://gerrit.openafs.org/15605 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2023-12-23 14:59:17 -05:00
Michael Meffie	26fdb47dba	CellServDB update 31 October 2023 Update all three copies in the tree, and the rpm specfile. Change-Id: I1685af8c6bfb27c189f8fecb6bc152cf19bf2c26 Reviewed-on: https://gerrit.openafs.org/15601 Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2023-12-14 12:13:18 -05:00
Ganesh Chaudhari	f379e1b255	macos: Add support for MacOS 14.X (Sonoma) This commit introduces the new set of changes/ files required to successfully build and package the OpenAFS source code on MacOS 14.X "Sonoma". Change-Id: I1e62746b0a131854eeaef31d1457885d43d0b67a Signed-off-by: GANESH CHAUDHARI <gangovind@in.ibm.com> Reviewed-on: https://gerrit.openafs.org/15591 Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Marcio Brito Barbosa <mbarbosa@sinenomine.net> Tested-by: Marcio Brito Barbosa <mbarbosa@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2023-11-16 13:01:44 -05:00
Cheyenne Wills	8962767a7e	Linux 6.7: convert to inode a/mtime accessor funcs The Linux 6.7 commit "fs: new accessor methods for atime and mtime" (077c212f03) is a follow up to the Linux 6.6 commit "fs: add ctime accessors infrastructure" (9b6304c1d5) With the above 6.7 commit, the inode's i_atime and i_mtime are renamed to __i_atime and __i_mtime and accessing these members should use the new accessor functions. This commit is similar to the OpenAFS commit "Linux 6.6: convert to ctime accessor functions" (`072c7934cd`) Add autoconf tests to detect when we need to use the new accessors and introduce new wrapper functions to get and set an inode's atime and mtime. Note, unlike the (`072c7934cd`) commit, we need to add support for reading an inode's atime and mtime, so this commit has the getters for the atime and mtime members. Change-Id: Ide818c946ed98a1401ef59864c2cdb4eccee9c99 Reviewed-on: https://gerrit.openafs.org/15597 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2023-11-16 12:20:18 -05:00
Mark Vitale	e2ec16cf94	dir: Introduce struct DirEntryFlex The directory package as implemented in AFS-2 allocates space for each directory entry as a DirEntry struct followed by 0-8 contiguous DirXEntry structs, as needed. This is implemented by: - afs_dir_NameBlobs calculates the number of blocks needed - FindBlobs allocates and returns index of entry - afs_dir_GetBlob returns pointer to 1st DirEntry struct After this, we populate DirEntry (and any contiguous DirXEntry blocks) with open code. Most existing code writes the entry's name via a string copy operation to DirEntry->name, which is only 16 bytes long. Therefore, for dir entry names that are 16 bytes or longer, OpenAFS routinely does string copies that look like buffer overruns. This has not previously caused problems because the OpenAFS code has arranged for a sufficiently large amount of contiguous memory to be available. However, this remains undefined behavior in the C abstract virtual machine; thus compilers are not required to produce safe operation. Recent changes in the OpenAFS build chain have made this approach no longer viable: 1) Linux 6.5 commit df8fc4e934c12b 'kbuild: Enable -fstrict-flex-arrays=3' modified the hardening of several kernel string operations when running with CONFIG_FORTIFY_SOURCE=y. 2) gcc 13 commit 79a89108dd352cd9288f5de35481b1280c7588a5 '__builtin_dynamic_object_size: Recognize builtin' provides some enhancements to _builtin_object_size. The Linux commit above will now use these when the kernel is built with gcc 13. When OpenAFS is built under Linux 6.5 or higher and gcc 13 or higher, the hardened strlcpy will BUG for directory entry names longer than 16 characters. Since there are multiple places where OpenAFS writes directory names, there are several symptoms that may manifest. However, the first one is usually a kernel BUG at cache manager initialization if running with afsd -dynroot _and_ there are any cell names 15 characters or longer in the client CellServDB. (A 15-character cellname reaches the 16 character limit when -dyrnoot adds the RW mountpoint ".<cellname>".) Address this by using flexible arrays (standardized with C99). A flexible array is a variable-length array that is declared with no size at all, e.g., name[]. Create an autoconf test to determine whether the compiler supports flexible arrays. Create a new struct DirEntryFlex. If the compiler supports flexible arrays, define name[]; otherwise retain the name[16] definition. Whenever we write a directory name, use DirEntryFlex so that any hardening will be satisfied that there is sufficient space for the name. However, the actual guarantee that this is true is still provided by the OpenAFS directory routines mentioned above - all of these remain unchanged. The DirEntry struct remains unchanged for continued use in OpenAFS, as well as for any out-of-tree users of the directory package. Change-Id: I6da5c6c295f051be90017084e5b3a3ef24d1271f Reviewed-on: https://gerrit.openafs.org/15573 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>	2023-11-09 12:24:58 -05:00
Mark Vitale	319c1ca87a	dir: Allow 256-byte directory entry names in salvager Since the original IBM code import, the DirOK test for directory entry names has been off-by-1; it says that directory names of length MAXENAME 256 are "too-long". Modify DirOK to properly validate directory entry names during salvage. While here, remove MAXENAME in favor of AFSNAMEMAX. Change-Id: I76b57899b225934ae60684d704eeb26223e5b8f3 Reviewed-on: https://gerrit.openafs.org/15574 Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Andrew Deason <adeason@sinenomine.net>	2023-11-09 12:24:51 -05:00
Cheyenne Wills	a2f77c5307	Linux: Fix to use time_t instead of time64_t In commit 'Linux 6.6: convert to ctime accessor functions' (`072c7934cd`) the functiom afs_inode_set_ctime was defined to use time64_t when it should have used a time_t as the data type for the sec parameter. See the commit 'LINUX 5.6: define time_t and use timespec/timespec64' (`78049987aa`). The time64_t data type was introduced in Linux 3.17. A build failure will occur when building on kernels prior to Linux 3.17. Change-Id: I9b3ce32f8885cde969583cfc74294cdabc54e698 Reviewed-on: https://gerrit.openafs.org/15595 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com>	2023-11-02 12:35:31 -04:00
Cheyenne Wills	4fed232b80	Linux 6.6: Pass request_mask to generic_fillattr The Linux 6.6 commit: "fs: pass the request_mask to generic_fillattr" (0d72b92883) added an additional parameter to Linux's generic_fillattr() function. For openafs, generic_fillattr() is called from the inode_operations method "getattr", which is implemented in afs_linux_getattr(). The value for the request_mask parameter is an existing parameter that is passed to the inode_operations "getattr" method. Add an autoconf test for 4 parameters to the generic_fillattr function and update afs_linux_getattr() to pass the request_mask to generic_fillattr(). Change-Id: Ie1e6b15bd85931debe0fd446760674ddd0492578 Reviewed-on: https://gerrit.openafs.org/15561 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Tested-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Kailas Zadbuke <kailashsz@in.ibm.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2023-10-05 12:30:15 -04:00
Cheyenne Wills	072c7934cd	Linux 6.6: convert to ctime accessor functions The Linux 6.6 commit "fs: add ctime accessors infrastructure" (9b6304c1d5) added accessor functions for an inode's ctime member. A follow on commit "fs: rename i_ctime field to __i_ctime" (13bc244578) changed the name of the inode member ctime to __i_ctime to indicate it's a private member. Add an autoconf test for the ctime accessor function 'inode_set_ctime()'. Add an afs_inode_set_ctime to LINUX/osi_machdep.h that is either defined as a macro to Linux's inode_set_ctime, or implements a static inline function to set a inode's ctime. Convert the setting of an inode's ctime to use afs_inode_set_ctime(). For more information behind the Linux change, see the commit message for: "Merge tag 'v6.6-vfs.ctime' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs" (615e95831) Change-Id: I920aac8029fc3e2c9b7741341b434802403a4577 Reviewed-on: https://gerrit.openafs.org/15560 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Tested-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2023-10-01 19:02:04 -04:00
Andrew Deason	ec5bcd1a0c	util: Avoid bad ascii[1] in volutil_GetPartitionID If the caller gives "/vicep" or "vicep" to volutil_GetPartitionID(), we'll strlcpy() an empty string into ascii[], setting ascii[0] to 0. Then we check the value of ascii[1], which is uninitialized. This doesn't result in any bad behavior, because we then immediately check the value of ascii[0] (which is 0), and return -1. But reading the uninitialized ascii[1] triggers errors in tools like valgrind, and is possibly fragile for future code changes. To avoid this, make sure ascii[] is initialized at the start of the function, and check if we have copied an empty string into ascii[]. While we are here, also add tests in volutil-t for invalid "vicep" strings, to match the existing tests for invalid "/vicep" strings. Change-Id: I724f893d4bb6421b955c1c89629ab9f277be98bc Reviewed-on: https://gerrit.openafs.org/15526 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com>	2023-09-17 00:46:03 -04:00
Andrew Deason	94c8a458a0	Check length before .readonly/.backup suffix A few places in the tree check if a volume name ends in .readonly/.backup, but don't check the string length beforehand. This can result in looking at a few bytes before the start of the string, which may contain garbage data, or may be an invalid memory address. A few commits have fixed this same issue over the years, such as `4221d7acc8` (Fix segmentation fault in vsu_GetVolumeID), but haven't caught all of them. Try to fix all of the remaining cases here. Change-Id: I736b8fa2a45dce7e5255aa055bcf7975f68e939a Reviewed-on: https://gerrit.openafs.org/15525 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com>	2023-09-17 00:45:44 -04:00
Cheyenne Wills	76879b2879	LINUX: Pass an array of structs to register_sysctl The Linux 6.6 commit "sysctl: Add size to register_sysctl" (9edbfe92a0) renamed the Linux function register_sysctl() to register_sysctl_sz() and added a size parameter. For backward compatibility, a macro, register_sysctl, is provided. The macro calculates the size of the ctl_table being registered and passes the size to register_sysctl_sz. However, in order to perform the size calculation, an array of ctl_table structures must be passed as the 2nd parameter. This change only affects the autoconf test used to determine if Linux provides register_sysctl. Update the autoconf test for register_sysctl to use an actual ctl_table structure for the 2nd parameter instead of a NULL. Change-Id: I8660ad252d7483646011a0c9418e3e95ac690a21 Reviewed-on: https://gerrit.openafs.org/15559 Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com>	2023-09-10 20:46:02 -04:00
Cheyenne Wills	b2f70affa6	linux: Remove use of wrap_directory_iterator The commit 'linux: Replace fop iterate with fop iterate_shared' (gerrit 15528) uses a Linux provided wrapper for the file_operations iterate_shared member. The wrapper is provided by Linux to give filesystem time to migrate to using the iterate_shared directly. Review of the locking done by the afs_linux_readdir shows that it is unnecessary to use the migration wrapper. Note: Removing the use of the wrapper as a separate commit from the 15528 commit in case there is a need to use the wrapper. In addition, we would like to use the wrapper with the current openafs stable release and use the development branch for additional testing. Change-Id: I662f95f69c2d06948f764b9520567efbca8f692c Reviewed-on: https://gerrit.openafs.org/15550 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com>	2023-09-10 19:56:21 -04:00
Cheyenne Wills	7437f4d377	linux: Replace fop iterate with fop iterate_shared The Linux 6.5 commit: 'vfs: get rid of old '->iterate' directory operation' (3e32715496) removed the filesystem_operations iterate method. The replacement method, iterate_shared, was introduced with the Linux 4.6 commit: 'introduce a parallel variant of ->iterate()' (6192269444) The above commits indicate that the iterate_shared is an "almost" drop-in replacement for iterate. The vfs documentation for iterate_shared has caveats on the implementation (serializing in-core per-inode or per-dentry modifications and using d_alloc_parallel if doing dcache pre-seeding). A wrapper is provided to assist filesystems with the migration from iterate to iterate_shared. Until it can be verified that afs_linux_readdir meets the above requirements, we will use the wrapper (ref 3e32715496 commit) Add configure tests for the iterate_shared file_operations member and for the wrap_directory_iterator function. Update osi_vnodeops.c to use iterate_shared and the wrapper if they are both available. Change-Id: I60d7605a79947198a429def5bce77cd09a16c304 Reviewed-on: https://gerrit.openafs.org/15528 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com>	2023-08-31 12:37:14 -04:00
Andrew Deason	9bc06a0591	vol: Re-evaluate conditons for cond vars Most users of cond vars follow this general pattern when waiting for a condition: while (!condition) { CV_WAIT(cv, mutex); } But a few places in src/vol do this: if (!condition) { CV_WAIT(cv, mutex); } It is important to always re-check for the relevant condition after waiting for a CV, even if it seems like we only need to wait exactly once, because pthread_cond_wait() is allowed to wake up its caller spuriously even the CV hasn't been signalled. On Solaris, this can actually happen if the calling thread is interrupted by a signal. In VInitPreAttachVolumes() for DAFS, currently this can cause a segfault if CV_WAIT returns while 'vq' is empty. We will try to queue_Remove() the head of the queue itself, resulting in vq.head.next being set to NULL, which will segfault when we try to pull the next item off of the queue. We generally cannot be interrupted by a signal when using opr's softsig, because signals are only delivered to the softsig thread and blocked in all other threads. It is technically possible to trigger this situation on Solaris by sending the (unblockable) SIGCANCEL signal, though this would be very unusual. To make sure issues like this cannot happen and to avoid weird corner cases, adjust all of our CV waiters to wait for a CV using a while() loop or similar pattern. Spurious wakeups may be impossible with LWP, but just try to make all code use a similar structure to be safe. Thanks for mvitale@sinenomine.net for finding and investigating the relevant issue. Change-Id: Ib188708b909661c28fa1a709306a02b01d3cd6d3 Reviewed-on: https://gerrit.openafs.org/15327 Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Marcio Brito Barbosa <mbarbosa@sinenomine.net> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com>	2023-08-03 19:16:11 -04:00
Andrew Deason	448e82b452	rx: Check for callNumber before NULL server call In rxi_ReceiveServerCall(), we compare the callNumber in the given packet to the callNumber on the conn's channel. If the packet's callNumber is smaller, it indicates the packet is for an earlier call that has since ended, and we ignore it. However, we perform this check after checking whether we need to allocate a new call (conn->call[channel] is NULL), or use an existing call. If we allocate a new call, we don't check the conn's callNumber at all, and unconditionally set it to the callNumber in the packet. This means that if a server call ends and is successfully rxi_FreeCall()'d, the server will accept any callNumber on the same channel. If the server sees an old DATA packet from an earlier call after this happens, it means the server can effectively re-run an RPC that has already completed successfully, or that the client has already seen fail. A server can see a DATA packet for an old call under a variety of situations, which is, after all, why we check it (dup'd/delayed packets, the client could still be trying to run an old call, etc). Seeing one for a freed call is less likely since that usually requires more time to have passed, but is still possible. Checking the callNumber was effectively moved in commit `99b43273c0` (rx: prevent connection channel assignment race) in the 1.7.x series. This commit makes the check similar to how it was in 1.6.x and earlier. Change-Id: Ia4990e210d4cd455744586c046490c0f3f56853a Reviewed-on: https://gerrit.openafs.org/15524 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com>	2023-08-03 19:15:56 -04:00
Cheyenne Wills	c4c16890d9	hcrypto: rename abort to _afscrypto_abort The Linux 6.5 commit: panic: make function declarations visible (d9cdb43189) added a declaration for abort into panic.h. When building the Linux kernel module, the build fails with the following: src/crypto/hcrypto/kernel/config.h:95:20: error: static declaration of ‘abort’ follows non-static declaration 95 \| static_inline void abort(void) {osi_Panic("hckernel aborting\n" );} \| ^~~~~ ... from ./include/linux/wait.h:9, from /openafs/src/afs/sysincludes.h:118, from /openafs/src/crypto/hcrypto/kernel/config.h:30: ./include/linux/panic.h:36:6: note: previous declaration of ‘abort’ with type ‘void(void)’ 36 \| void abort(void); \| ^~~~~ Update the declaration in hcrypto/kernel/config.h to change the function name from abort to _afscrypto_abort and use a preprocessor define to map abort to _afscrypto_abort. Change-Id: I72a4d01b12c9d944481569748f129002b6a8c8fc Reviewed-on: https://gerrit.openafs.org/15501 Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Benjamin Kaduk <kaduk@mit.edu>	2023-07-20 12:27:39 -04:00
Cheyenne Wills	fb31d299e6	Linux 6.5: Use register_sysctl() The linux 6.5 commit: "sysctl: Remove register_sysctl_table" (b8cbc0855a) removed the Linux function register_sysctl_table(). The replacement function is register_sysctl(), which offers a simpler interface. Add an autoconf test for the Linux function register_sysctl and add a call to register_sysctl when available. Notes: The Linux function register_sysctl was added in Linux 3.3 with the commit: 'sysctl: Add register_sysctl for normal sysctl users' (fea478d410) with a note that it is a simpler interface. The function register_sysctl_table was marked as deprecated with the Linux 6.3 commit: 'proc_sysctl: enhance documentation' (1dc8689e4c) Change-Id: I657bee103cd1f4ab2e60b3f3471c3fc12d8b6be4 Reviewed-on: https://gerrit.openafs.org/15500 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2023-07-20 12:10:05 -04:00
Mark Vitale	bcb0fa5a2a	audit: Update comments Some of the comments refer to names from an earlier implementation. Update them to match the current implementation. No functional change is incurred by this commit. Change-Id: I9d16477abbfceceaf6b79d6781542d9c6df3468f Reviewed-on: https://gerrit.openafs.org/15414 Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2023-07-13 12:30:05 -04:00
Andrew Deason	2e6b80e4bc	cf: Undef _FORTIFY_SOURCE for use-after-free check Commit `f2003ed68c` (gcc: Avoid false positive use-after-free in crypto) added a configure check to detect whether the compiler we're using exhibits the use-after-free warning bug. We add -O0 to CFLAGS for the test to make sure the bug triggers for gcc 12, but if the user has specified, for example, CFLAGS=-D_FORTIFY_SOURCE=1, this causes the compile check to always fail: /usr/include/features.h:330:4: error: #warning _FORTIFY_SOURCE requires compiling with optimization (-O) [-Werror=cpp] # warning _FORTIFY_SOURCE requires compiling with optimization (-O) This causes _OPENAFS_UAF_COMPILE_IFELSE to always fail, and so we throw an AC_MSG_ERROR during configure. To allow the build to continue with _FORTIFY_SOURCE, make sure _FORTIFY_SOURCE is undefined for this specific test. The compile test can then succeed with -O0 (unless we trigger the use-after-free bug, of course). Change-Id: Ib0d21b07b1e7892a6d411ee0c34a4ff7baf75cc3 Reviewed-on: https://gerrit.openafs.org/15499 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2023-07-13 12:22:12 -04:00
Cheyenne Wills	0e06eb78f2	Linux 6.5: Replace generic_file_splice_read The Linux 6.5 commit: 'splice: Remove generic_file_splice_read()' (c6585011bc) replaces the function generic_file_splice_read() with the function filemap_splice_read(). The Linux function 'filemap_splice_read()' was introduced with the Linux 6.3 commits: 'splice: Add a func to do a splice from a buffered file without ITER_PIPE' (07073eb01c) 'splice: Export filemap/direct_splice_read()' (7c8e01ebf2) With updates in Linux 6.5: 'splice: Fix filemap_splice_read() to use the correct inode' (c37222082f) -- which fixes a problem in the code. 'splice: Make filemap_splice_read() check s_maxbytes' (83aeff881e) Due to the fact that there could be problems with splice support prior to Linux 6.5 (where filemap_splice_read()'s use was expanded to additional filesystems other than just cifs), we only want to use 'filemap_splice_read()' in Linux 6.5 and later. The LINUX/osi_vnodeops.c file is updated to use 'filemap_splice_read()', for Linux 6.5 and later, for the splice_read member of the file_operations structure. Change-Id: Ia07c66ddc5d4f836f230b5d03567803e124b93b8 Reviewed-on: https://gerrit.openafs.org/15486 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2023-07-13 12:06:52 -04:00
Michael Meffie	c412c75def	doc: Fix the AFS::ukernel man page title The pod2man tool determines a man page title (set in the .TH macro) from the input filename, unless the -n (--name) option is specified. Our AFS::ukernel man page input file is named AFS.ukernel.pod to avoid colons in the filename (since colon characters are not supported on Windows), so the generated man page contains the title "AFS.ukernel" instead of "AFS::ukernel". Use the pod2man -n (--name) option when converting section 3 man pages to override the automatic title naming. This fixes the .TH macro in the generated AFS::ukernel.3 file. Fortunately, the -n (--name) option is only needed for section 3 man pages. Specifying the pod2man -n (--name) option is simpler and less invasive than renaming pod3/AFS.ukernel.pod to pod3/lib/AFS/ukernel.pod (which would also fix the embedded title). Change-Id: I495ea2d30ce1b34698519ffa34a39362c449ba09 Reviewed-on: https://gerrit.openafs.org/15363 Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2023-07-12 23:48:30 -04:00
Michael Meffie	bff2912597	backup: Make backup tests build again The test programs in bucoord, butc, and butm have bit rotted and no longer build. Update the code to fix error and warnings, and remove dead code. Update the relevant makefiles so these test programs build by default to avoid bitrot in the future. With this change the test programs will build and run, but a tape drive device is required to perform testing. Change-Id: I978b98fe63fe6051409fc160e8c7d43050bcf707 Reviewed-on: https://gerrit.openafs.org/15098 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2023-07-06 18:12:38 -04:00
Cheyenne Wills	f2003ed68c	gcc: Avoid false positive use-after-free in crypto Due to a bug in gcc-12 and gcc-13, several warnings are generated for a use-after-free in crypto.c, which leads to a build failure with --enable-checking: src/external/heimdal/krb5/crypto.c:1157:9: error: pointer ‘p’ may be used after ‘realloc’ [-Werror=use-after-free] 1157 \| free(p); \| ^~~~~~~ src/external/heimdal/krb5/crypto.c:1155:20: note: call to ‘realloc’ here 1155 \| result->data = realloc(p, sz); \| ^~~~~~~~~~~~~~ However, reviewing the code around these warnings shows that the use-after-free warnings are incorrectly generated (false positive). The documentation for realloc states that realloc will return a NULL and not alter the storage passed if there was an error allocating and the size passed is non-zero. There is a possible work-around for the false positive. One can use a variable that is not a member of a structure to hold and test the value returned from realloc, then update the structure member from that variable. However, the code that is producing the message is in a heimdal external file, so we cannot modify the source. So just use the compiler flag -Wno-use-after-free to avoid the warning/error. Update configure to add tests for the -Wno-use-after-free flag, update the Makefile to add the flag for CFLAGS.crypto.lo, and update CODING for the new exception. Because this is an important check, only disable the warning if the compiler exhibits this specific bug. We do this by adding specific configure tests for the compiler bug and conditionally set a CFLAG variable if the bug is present. NOTE: The false positive and work-around can be demonstrated with the following code using gcc-12 (with -O0) or gcc-13 (not sensitive to the optimization level): somestruct->somepointer = realloc(ptr, somesize); if (somestruct->somepointer == NULL && somesize != 0) { free(ptr); << gets flagged as use-after-free handle enomem... } However the following doesn't get flagged: char *tmpptr = realloc(ptr, somesize); if (tmpptr == NULL && somesize != 0) { free(ptr); handle enomem... } somestruct->somepointer = tmpptr; The GCC ticket https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110501 has been marked as confirmed. Change-Id: I28566354da4199389964210f954cda1213098088 Reviewed-on: https://gerrit.openafs.org/15471 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Andrew Deason <adeason@sinenomine.net>	2023-07-05 16:40:35 -04:00
Mark Vitale	4b42b07fde	vol: Don't leak volume bitmaps Since the original IBM code import, attach2 has set the volume's index bitmaps to NULL in preparation for allocating and initalizing new bitmaps. However, the volume may already have bitmaps from previous operations, and this is much more likely with DAFS. In this case, the old bitmaps are leaked. Instead, free any existing bitmap before allocating a new one. Discovered via Solaris libumem.so.1. Change-Id: I77e42ccd36cedead060974a4fe99e37b4f262093 Reviewed-on: https://gerrit.openafs.org/15428 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2023-07-02 10:53:37 -04:00
Cheyenne Wills	2357ac6ae5	cf: Fix cast-function-type err w/disable-checking If the Linux kernel has CONFIG_WERROR enabled, and openafs is configured with --disable-checking, the libafs kernel module fails to build: /src/libafs/MODLOAD-5.14.0-305.el9.x86_64-MP/evp.c:501:9: error: cast between incompatible function types from ‘void ()(void )’ to ‘int ()(EVP_MD_CTX )’ {aka ‘int ()(struct hc_EVP_MD_CTX )’} [-Werror=cast-function-type] 501 \| (hc_evp_md_init)null_Init, \| ^ The openafs commit: Linux-5.17: Kernel build uses -Wcast-function-type (`6bdfa97673`) fixed above error when the Linux kernel has CONFIG_WERROR enabled and openafs is configured with --enable-checking. But we will still fail when CONFIG_WERROR is enabled and openafs is configured with --disable-checking (which is the default). Update osconf.m4 to always set CFLAGS_NOCAST_FUNCTION_TYPE, so that it can be used to avoid the above compiler check even when checking is disabled. NOTE: The only use of CFLAGS_NOCAST_FUNCTION_TYPE is to correct the warnings flagged in external/heimdal/hcrypto/evp.c and evp-algs.c. NOTE: --enable-checking=all can be used to bypass setting the define. Change-Id: I39824f38cfbcf045f59744de5109a99fd4ad041a Reviewed-on: https://gerrit.openafs.org/15417 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Benjamin Kaduk <kaduk@mit.edu> Reviewed-by: Andrew Deason <adeason@sinenomine.net>	2023-06-29 12:54:14 -04:00
Cheyenne Wills	a939eb7ffd	cf: Use static allocated structs for cf tests The Linux kernel module build will issue a warning when a stack frame exceeds a specific size via the -Wframe-larger-than= compiler flag (with a default size of 2048 bytes on most architectures). At least one distribution, Oracle's Linux with the Unbreakable Enterprise Kernel (UEK), hardens this check by changing the warning to an error (-Werror=frame-larger-than=). Several of the openafs autoconf tests use objects that are allocated on the stack when testing for structures, or members of a structure. When the warning is changed to an error, configure fails in several locations when testing against Linux's task_struct structure, which exceeds 2048 bytes in size. openafs/conftest.dir/conftest.c:72:1: error: the frame size of 9984 bytes is larger than 2048 bytes [-Werror=frame-larger-than=] Update the autoconf Linux tests that allocate structures to use a statically allocated structure instead of one allocated on the stack. Change-Id: Ica44373b02fb4efa81d8fcbc0337aea9719b4fb6 Reviewed-on: https://gerrit.openafs.org/15477 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Andrew Deason <adeason@sinenomine.net>	2023-06-22 15:25:44 -04:00
Cheyenne Wills	df7f841f38	clang: Don't redefine printf in salvage.c The Gentoo distribution changed the default fortify_source setting for clang (to match the default being used for gcc). This change causes the following build error: src/dir/salvage.c:26:9: error: 'printf' macro redefined [-Werror,-Wmacro-redefined] ^ /usr/include/bits/stdio2.h:89:11: note: previous definition is here ^ 1 error generated. If the fortify_source level is greater than 1, glibc can define printf as a macro. The clang compiler has a default check for macro redefinitions (-Wmacro-redefined), while gcc does not provide this option. Remove the: #define printf Log in src/dir/salvage.c and update callers to use Log() instead of printf(). Clean up lines that split the call to Log and its first parameter. Fix the indentation for Log's parameters. There are no functional changes with this commit. Change-Id: I732c8345dfd4cf140883bff4b53df9ab9bea5a85 Reviewed-on: https://gerrit.openafs.org/15462 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2023-06-22 12:28:12 -04:00
Michael Meffie	488d6d1f1b	RedHat: Support building RPMs with alternate CellServDB By default, the makesrpm.pl script downloads a CellServDB file from grand.central.org. The URL of the file to be downloaded is specified in the openafs.spec.in file from the source distribution archive. Network access can be avoided by specifying a local file to be packaged with the 'cellservdb' positional argument. Due to implementation limitations of the makesrpm.pl script, the specified file must be in current working directory and the filename must the same as the CellServDB URL basename in the openafs.spec.in file (currently 'CellServDB.2022-05-09'). This change adds support for building packages with an alternative CellServDB, and adds support for specifying a local CellServDB file by path. Add the --cellservdb-url option to override the URL used to download the CellServDB file when the local file is not specified. Update the sed commands used to generate the openafs.spec file to update the CellServDB source directive (Source20) when an alternate URL is given with the --cellservdb-url. Replace the shell pipe line to extract the CellServDB URL from the openafs.spec.in file with perl code. Search for the 'Source20' source directive instead of searching for the string 'dl/cellservdb' to find the CellServDB source URL. (After this commit, the source directive number of the CellServDB source should not be changed without also changing the number in makesrpm.pl). Change the handling of the 'cellservdb' positional argument so the local file is copied to required name in the temporary SOURCES directory. This also allows us to specify the local file by path, instead of requiring the local CellServDB file to be in the current working directory when running makesrpm.pl. The --cellservdb-url option and the 'cellservdb' argument are not exclusive. If both are specified, the URL will be set in the generated openafs.spec file to indicate the upstream location of the packaged CellServDB file provided by the 'cellservdb' positional argument. Change-Id: If8818bcdee8500a164f992f42df3c4f9a587d0cd Reviewed-on: https://gerrit.openafs.org/15405 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2023-06-08 12:07:41 -04:00
Kailas Zadbuke	630d423897	afs: Check UHasTokens in afs_GCUserData It looks that EndTimestamp holds the user’s token expiration time even after function afs_MarkUserExpired() gets called from afs_pag_destroy(). So it seems that the PAGs/tokens are not being reclaimed. This will lead to pag buildup on the system. The PAG/'struct unixuser' is not being freed because afs_GCUserData() checks afs_HasUsableTokens(), which checks EndTimestamp, which says the tokens still haven't expired. So the PAG doesn't get freed. This behavior has existed since commit `ba1d050c6e` (Make unixuser's vid advisory). To fix this, change afs_GCuserData() to check for the UHasTokens flag, and ignore any tokens if UHasTokens isn't set. This causes the PAG to be freed, since we don't look at the EndTimestamp of the tokens at all. Thanks Todd DeSantis for your support. Change-Id: Ic46e2487b1971aed9ca6b6e5aa5e5d0b2fd9c31e Signed-off-by: Kailas Zadbuke <kailashsz@in.ibm.com> Reviewed-on: https://gerrit.openafs.org/15404 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2023-05-18 12:53:12 -04:00
Mark Vitale	915c9ec007	rxkad: Free memory used to check rxkad response Since its introduction with commit `7e4e06b87a` "Derive DES/fcrypt session key from other key types", rxkad_derive_des_key has failed to free the memory associated with its HMAC context struct. This results in a leak of at least 352 bytes for each rxkad challenge response processed by an OpenAFS server when using rxkad-kdf. Free the memory by calling HMAC_CTX_cleanup after each round of the loop. Discovered via Solaris libumem.so.1. Change-Id: Ic9f130f7229c70e4eaa68ba1f0a213b4b23229cc Reviewed-on: https://gerrit.openafs.org/15427 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2023-05-18 12:16:39 -04:00
Andrew Deason	d460b616eb	LINUX: Make 'fs flush*' invalidate dentry Our 'fs flush' and related commands (flushall, flushvolume) clear the relevant entries in the OpenAFS stat cache and data cache, which can fix problems if the cache ever becomes incorrect for any reason. (This can happen after bugs, repairing corrupted volumes, disaster recovery scenarios, and similar edge cases.) However, on Linux, these commands don't affect the VFS dentry cache. If someone needs to use an 'fs flush' command to fix a problem, this will fix the OpenAFS cache, but the Linux dcache can still be wrong. The only way to manually flush dcache entries is to use the global 'drop_caches' mechanism, which is a very heavweight operation, only accessible to root. For example: $ ls -l ls: cannot access foo.1: No such file or directory total 2 drwxrwxr-x. 2 bin adeason 2048 Apr 6 14:20 dir -?????????? ? ? ? ? ? foo.1 $ fs flush . $ ls -l ls: cannot access foo.1: No such file or directory total 2 drwxrwxr-x. 2 bin adeason 2048 Apr 6 14:20 dir -?????????? ? ? ? ? ? foo.1 $ sudo sysctl -q -w vm.drop_caches=3 $ ls -l total 3 drwxrwxr-x. 2 bin adeason 2048 Apr 6 14:20 dir -rw-rw-r--. 1 bin adeason 29 Sep 22 2022 foo.1 To make the 'fs flush' commands be effective in more situations, change afs_ResetVCache() to also invalidate the dcache entries associated with each vcache we reset. To make things simpler and reduce locking complexity, do this by setting d_time to 0, and don't directly run dcache-managing functions like d_invalidate or d_drop, etc. The above example now becomes: $ ls -l ls: cannot access foo.1: No such file or directory total 2 drwxrwxr-x. 2 bin adeason 2048 Apr 6 14:20 dir -?????????? ? ? ? ? ? foo.1 $ fs flush . $ ls -l total 3 drwxrwxr-x. 2 bin adeason 2048 Apr 6 14:20 dir -rw-rw-r--. 1 bin adeason 29 Sep 22 2022 foo.1 Change-Id: Ic95ef6137f5b671baf68a4890ef1562f029af794 Reviewed-on: https://gerrit.openafs.org/15391 Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com>	2023-05-12 17:36:15 -04:00
Andrew Deason	2b32b130f5	LINUX: Set .getattr for symlinks On Linux, we can define a .getattr callback for files and directories, which is called when a caller requests metadata for the file, such as during lstat(). For regular files and directories, we set this to afs_linux_getattr(), which updates the metadata for the file in question if it's out of date (CStatd not set). For symlinks, however, we don't set .getattr at all. This would seem to allow symlink metadata to become stale if another client changes it, but the metadata often happens to stay up to date via other means. For example, we can see the following happen: - Another client changes, for example, the owner of a symlink we have cached. - The fileserver sends us a callback break, and so we clear CStatd for the vcache for the symlink. - lstat() is called for the symlink, which causes afs_linux_dentry_revalidate() to be called for the cached dentry. Since CStatd is not set for the vcache, we verify the entry by calling afs_lookup(), and then update the symlink's metadata via afs_getattr() and vattr2inode(). However, if CStatd _is_ set for the symlink when lstat() is called, afs_linux_dentry_revalidate() will not update anything, and will just return success. So, if we manage to set CStatd for the symlink without updating it's Linux VFS metadata, the Linux metadata won't be updated, and we'll report the old metadata to the caller of lstat(). We can set CStatd without updating the Linux VFS info in a few different ways. A few pioctls such as PRemoveMount or PFlushMount can do this if they encounter an error, but the only code paths that call these pioctls in OpenAFS (via the 'fs' utility) also lstat() the relevant path, so this doesn't happen in practice. A more common way that this can occur is via afs_DoBulkStat(). If userspace triggers a bulkstat that includes the symlink in question, the symlink would gain the CStatd flag without any interaction with the Linux VFS. For example, say a symlink was chown'd from 'adeason' to 'root'. On another client with the symlink cached, running 'ls -l' on the symlink itself would show the updated owner, because afs_linux_dentry_revalidate() updates the metadata: $ ls -l dir.slink lrwxr-xr-x. 1 root root 3 May 5 14:48 dir.slink -> dir But if we 'ls -l' the entire directory, which contains other entries, we will bulkstat many of the entries, possibly including the symlink. And so we may see the old metadata: $ ls -l total 9 [...] lrwxr-xr-x. 1 adeason root 3 May 5 14:48 dir.slink -> dir Triggering this behavior requires a bulkstat to be triggered before we access the symlink itself, and so triggering this behavior depends on the order of the entries in the directory as well as whether the other items in the dir are cached. As such, triggering this behavior during normal operation tends to be inconsistent and confusing. The only lstat() info for symlinks that can change like this is the owner, group, and modtime; mode bits cannot change, and neither can the length/size (or the contents in general). So, stale metadata tends to not be very noticeable. To fix all of this, set .getattr to afs_linux_getattr() for symlinks, just like we do for regular files and directories. This ensures that we will update the Linux VFS metadata for the symlink when it is requested, so we won't return stale metadata to callers. This behavior appears to have existed for symlinks on Linux for quite a while, possibly since our Linux 2.6 support was added. The behavoir may have been introduced around commit `b860b359d5` (initial-linux24-support-20001105) or commit `0054374495` (linux22-fix-20040405). Before those commits, we defined a .revalidate callback for symlinks, which was called on older Linux versions before fetching file metadata, and so probably ensured that an lstat() on a symlink returned properly updated info. Change-Id: I90d751e3175e24e9f9409cea6c079b097c78b51b Reviewed-on: https://gerrit.openafs.org/15423 Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2023-05-12 16:15:54 -04:00
Michael Meffie	c8aae4da4f	bozo: Do not create client directory and symlinks Each time the bosserver starts, it checks for the presence of the client configuration directory and the CellServDB and ThisCell files within it. When not found, the bosserver creates the client cell configuration directory. When the CellServDB and ThisCell files are not present in the client configuration directory, the bosserver creates symlinks to the server's CellServDB and ThisCell files. This feature of the bosserver was a convenience when older versions of vos and pts required a client configuration directory. However, modern deployments tend to use packaging, with separate client and server packages. The client configuration directory and files in it are typically managed by the client packaging. Currently, packagers must work around these symlinks created by the bosserver. Additionally, the CellServDB and ThisCell symlinks are hazardous since writing to the client-side configuration will overwrite the server configuration. This commit removes the creation the client configuration directory and the CellServDB and ThisCell symlinks during bosserver startup. This change is intended to decouple the server from the client, help to avoid overwriting the server configuration, and avoid requiring client artifacts on a server. Change-Id: I8aba81185a52e6792d36ae28a430ebe1c23919d2 Reviewed-on: https://gerrit.openafs.org/12586 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2023-05-12 13:08:34 -04:00
Michael Meffie	77eb172833	Add command fallback to server config Add an initialization retry in the bos, vos, and pts commands to fallback to the server configuration directory when initialization fails with the client configuration directory. This allows admins to run unauthenticated bos, vos, and pts commands on servers without a client configuration (including symlinks created by the bosserver) without any extra command line options. Perform the initialization retry only when the -localauth or -config options are not given. The bos, vos, and pts commands already use the server configuration path when the -localauth option is given, so there is no point in retrying the same path. The vos and pts -config option specifies the path to be used, so we do not fallback to a different directory when the user specifies the configuration path to be used. While here, change the scope of the confdir variable in vos.c from a global to a local variable, since it is only used within the MyBeforeProc() function. This change does not add a vsu_ClientInit() retry in the bos salvage command. That command always requires authorization, so when run without -localauth requires a token (and therefore a cache manager and client cell configuration). Update the bos, vos, and pts man pages to describe this new fallback method to lookup the configuration directory. (The AFSCONF environment variable and .AFSCONF files are currently undocumented in the man pages. They should be documented or removed from the code in a future change.) Change-Id: I55c3109494db744e7bc2defcb54eaee3b4e30018 Reviewed-on: https://gerrit.openafs.org/15351 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2023-05-12 13:07:37 -04:00
Ben Huntsman	4810cca10b	AIX: Free pinned_heap during shutdown to prevent kernel panic During shutdown, the function shutdown_osisleep is called in src/afs/afs_osi.c. The body of this function is platform- specific, and on AIX there is a call to xmfree: xmfree(tmp); However, on AIX, xmfree actually takes two arguments: int xmfree ( ptr, heap) caddr_t ptr; caddr_t heap; This is called elsewhere in the OpenAFS code correctly for kernel_heap. In src/afs/AIX/osi_sleep.c we start using the pinned_heap, but never xmfree it. Therefore, we need to do so here during the shutdown. Here we include a header file which defines pinned_heap, and then supply it as an argument to xmfree. This prevents a kernel panic during OS shutdown. The panic might go unnoticed in many environments, as during a reboot, the system will normally dump and restart anyway. However, if kdb is loaded (bosboot -a -D), the system will break into the debugger before the full shutdown procedure completes. The stack trace shows the following: KDB(0)> stack pvthread+01D200 STACK: [00023900]abend_trap+000000 () [000EFF24]xmfree_frontend+0000A4 (??, ??, ??) [F1000000C064CF1C]shutdown_osisleep@AF72_5+0000BC (??) [F1000000C064CB7C]shutdown_osi+00001C () [F1000000C064A51C]afs_shutdown+0003BC (0000000100000001) [F1000000C05A8DD4]afs_unmount+000094 (F1000A01501D4C10, 0000000000000000) [F1000000C059FCE0]vfs_unmount+0000A0 (F1000A01501D4C10, 0000000000000000, F1000A015047B07C) [00014D70].hkey_legacy_gate+00004C () [006A6AAC]vfs_unmount+00008C (??, ??, ??) [006B4228]kunmount+000228 (??, ??, ??, ??) [006B4944]uvmount+000204 (??, ??) [00003954]syscall+00024C () [100084FC]helper_UMfunc+00027C (??, ??) [10003D48]dounmount+0000C8 (??, ??, ??, ??) [100044DC]umountmain+0001BC (??, ??) [10000AD4]main+0000B4 (??, ??) [10000168]__start+000068 () Change-Id: I3131dc7d184a299e7d7806ce8a0c3d890b5c3624 Reviewed-on: https://gerrit.openafs.org/15419 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2023-05-12 11:20:07 -04:00
Ben Huntsman	8a2d4faa73	AIX: AFS_SMALLOCSIZ too small on newer OS releases Addresses a situation where a write in AFS space can cause a kernel panic. In src/afs/afs_osi_uio.c in the function afsio_partialcopy: size_t space_len = sizeof(struct uio) + sizeof(struct iovec) * AFS_MAXIOVCNT; /* Allocate a block that can contain both the UIO and the iovec / space = osi_AllocSmallSpace(space_len); On newer AIX systems (since at least 6.1), space_len is larger than AFS_SMALLOCSIZ. When osi_AllocSmallSpace is called, the following test in src/afs/afs_osi_alloc.c causes a kernel panic: if (size > AFS_SMALLOCSIZ) osi_Panic("osi_AllocSmallS: size=%d\n", (int)size); This is due to the following definition in src/config/afs_args.h: / * Note that the AFS_ALLOCSIZ values should be multiples of sizeof(void) to * accomodate pointer alignment. / / Used in rx.c as well as afs directory. / #if defined(AFS_AIX32_ENV) \|\| defined(AFS_HPUX_ENV) / XXX Because of rxkad_cprivate... XXX / #define AFS_SMALLOCSIZ (38sizeof(void )) / "Small" allocated size / #else #define AFS_SMALLOCSIZ (64sizeof(void )) / "Small" allocated size */ #endif All the supported AIX platforms define AFS_AIX32_ENV in src/config/param.rs_aixXX.h, where XX is the AIX version. Therefore, all the AIX platforms end up with AFS_SMALLOCSIZ = 152 bytes instead of 256. To resolve this, we will modify the preprocessor test to use the second case for AIX versions greater than 6.1. This issue may be present on earlier releases of AIX as well, but AIX 5.3 and older test systems were not available at this time. Also, a spelling error in the comment was fixed. Change-Id: Id6b094e28e1be1cfb71296c5c31e60592ba91395 Reviewed-on: https://gerrit.openafs.org/15418 Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2023-05-11 12:25:21 -04:00
Mark Vitale	1ac7b601a6	audit: Remove redundant AFS_AIX32_ENV ifdef Commit `00fd29251e` audit-prototypes-20090315 wrapped audmakebuf with an ifdef for AFS_AIX32_ENV. However, this made redundant a pre-existing ifdef for the same symbol within audmakebuf. Remove the redundant ifdef. While here, add a concluding comment to the superseding endif. No functional change is incurred by this commit. Change-Id: I5b58538c0371ad820f6de662b8367e6e3aee1e66 Reviewed-on: https://gerrit.openafs.org/15413 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>	2023-05-06 11:25:15 -04:00
Mark Vitale	4a5eb4a764	audit: Remove vestigial osi_echo_trail logic In the original IBM code import, auditing was for AIX and fileservers only. The /usr/afs/local/Audit configuration file supported two directives: AFS_AUDIT_AllEvents enabled writing to AIX audit subsystem Echo_Trail enabled echoing to stdout (osi_echo_trail=1) However, even in the IBM AFS 3.6 Administrator Guide, only AFS_AUDIT_AllEvents was documented; Echo_Trail was undocumented. Later, commit `16d67791dc` auditlogs-for-everyone-20050702 extended auditing to more platforms and servers, but the Audit file support remained unchanged. Later, commit `7b0b6a0ce9` "Enhance audit logs to support SysV message queues" removed the last effective use of osi_echo_trail. Remove the vestigial support for echoing the audit records to stdout. Change-Id: I2c2d5d0a76c5a3a5d8560c93e2587ab6d057c8ed Reviewed-on: https://gerrit.openafs.org/15412 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Andrew Deason <adeason@sinenomine.net>	2023-05-06 11:25:08 -04:00
Cheyenne Wills	795ef90d40	Linux: Clear msghdr structure before use The msghdr structure is used as a parameter to the Linux kernel functions kernel_sendmsg() and kernel_recvmsg(). Some required fields need to be set prior to calling these functions, but there are also additional structure members that may not be used by the calling code. Some of these fields may be initialized by the Linux kernel functions being used, but there may be some that are left uninitialized. To ensure that all fields in the msghdr structure are cleared, use memset to zero the entire structure. This will eliminate the need to set individual fields to 0 or NULL. Change-Id: I7cdc7539084d13efedbb20e7ebd5c242693a628c Reviewed-on: https://gerrit.openafs.org/15409 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2023-04-27 20:29:11 -04:00
Cheyenne Wills	5a7630acfd	Linux: Do not initialize msghdr's msg_iter/msg_iov Since the initial Linux commit (2.6.12-rc2) the Linux function kernel_recvmsg() handles the initialization of the msghdr members related to the iov (msghdr.msg_iter and for earlier kernels msghdr.msg_iov and msghdr.msg_iovlen). The code in osi_NetReceive for Linux unnecessarily initializes these structure members prior to calling kernel_recvmsg(). Remove the unnecessary code from osi_NetReceive along with the surrounding preprocessor check for STRUCT_MSGHDR_HAS_MSG_ITER. Since this is the only location that uses this define, also remove the associated autoconf test. NOTE: This was discovered while investigating changes needed for Linux 6.4, due to: "iov_iter: add iter_iovec() helper" (de4f5fed3f) which renamed the Linux structure iter_iovec's member iov to __iov. Since the openafs code that was affected by the Linux 6.4 change is being removed as unnecessary, this commit is not Linux 6.4 specific, but effects all versions of Linux. Change-Id: Id9931728ad96b07e5e1c3472d9689fe6dd2905f9 Reviewed-on: https://gerrit.openafs.org/15408 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2023-04-27 20:28:45 -04:00
Michael Meffie	bbb1e8adfe	xdr: Avoid xdr_string maxsize check when freeing The maxsize argument in xdr_string() is garbage when called by xdr_free(), since xdr_free() only passes the XDR handle and the xdr string to be freed. Sometimes the size check fails and xdr_string() returns early, without freeing the string and without setting the object pointer to NULL. Usually this just results in leaking the string's memory. But since commit `9ae5b599c7` (bos: Let xdr allocate rpc output strings), many callers in bos.c rely on xdr_free(xdr_string) to set the given string to NULL; if this doesn't happen, subsequent calls to BOZO_ RPCs can corrupt memory, often causing the 'bos' process to segfault. We only need the maxsize check when encoding or decoding, so avoid accessing the maxsize agument when the op mode is XDR_FREE. In general, xdr_free() can only safely be used on xdr 2-argument xdr functions, so must be avoided when freeing xdr opaque, byte, and union types. This change makes it safe to use xdr_free() to free xdr strings, but in the future, we should provide a typesafe and less fragile function for freeing xdr strings returned from RPCs. Currently, xdr_free(xdr_string) is only called by the bos client and the tests. Change-Id: I2bbd6cfe48b77f1fd826207b69f50df9c0158a10 Reviewed-on: https://gerrit.openafs.org/15343 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2023-04-20 12:09:36 -04:00
Andrew Deason	5aaed53f07	LINUX: Introduce afs_d_alias_foreach We have a couple of places in the code that iterate over the dentry aliases of an inode, and each of these involves a small #ifdef ladder to handle slightly different ways of traversing the relevant list. Split this logic into its own compatibility macro, afs_d_alias_foreach[_reverse], to contain this ugliness in osi_compat.h and make the callers more readable. This commit should incur no functional change; it is just code reorganization. Change-Id: I0babe22fffb23e100af5cdd50f38ab2b8f3ee2c6 Reviewed-on: https://gerrit.openafs.org/15390 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com>	2023-04-13 21:48:20 -04:00
Cheyenne Wills	858ee34545	Linux 6.3: Use mnt_idmap for inode op functions Through a series of Linux 6.3 commits starting with: 'f2fs: project ids aren't idmapped' (64b4cdf22f) and ending with 'fs: move mnt_idmap' (3707d84c13) the inode operations functions were changed to take a mnt_idmap structure instead of a user_namespace structure. These were pulled in via the the merge commit: 'Pull vfs idmapping updates from Christian Brauner' (05e6295f7b) The commit message for the merge contains background and overall information for this conversion. The above change simply changes the functions to use a different structure (mnt_idmap instead of user_namespace). For afs, it is a simple change to swap the structures. But for some of the Linux calls (generic_fillattr(), setattr_prepare(), and inode_op->setattr()) we would like to use the Linux symbol nop_mnt_idmap, but that symbol is exported as GPL only. Instead, we will obtain its address from the current task structure at libafs initialization (much the same way as obtaining current user namespace for afs_ns). Add autoconf tests to determine if inode_operations.create() uses the mnt_idmap structure. If so set a generic configure define for "IOP_TAKES_MNT_IDMAP". Update afs's inode operations functions to take and use a mnt_idmap instead of a user_namespace. At libafs initialization, obtain the mnt_idmap from the current task and save it as an afs global variable, afs_mnt_idmap, to be used where needed. Change-Id: Ie48b3941860998ddc7f54be48959fc62119bf48e Reviewed-on: https://gerrit.openafs.org/15347 Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2023-04-06 19:28:46 -04:00
Cheyenne Wills	6873dc925c	Linux 6.3: Include linux/filelock.h if available Linux 6.3 commit 'filelock: move file locking definitions to separate header file' (5970e15dbc) relocated file lock related declarations from 'linux/fs.h' into its own header file, 'linux/filelock.h'. Add autoconf tests to check for the header file 'linux/filelock.h' and update function checks for locks_lock_file_wait() and posix_lock_file(). Update osi_compat.h to include linux/filelock.h if it is available. Change-Id: If209730ad8483d15ad34b2242b098e7e1b352d9d Reviewed-on: https://gerrit.openafs.org/15346 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Reviewed-by: Andrew Deason <adeason@sinenomine.net>	2023-04-06 11:55:33 -04:00
Michael Meffie	c9b46a6d1a	backup: Remove obsolete test programs The code for the butc_test and ttest programs look to be developer scaffolding for an earlier version of the backup server. Those programs use a version of the TC_PerformDump RPC which pre-dates OpenAFS, so the butc_test and ttest have never been buildable in OpenAFS. Remove this obsolete scaffolding code since it is not maintained and trigger errors when scanned with static analyzers. Change-Id: Ia3d0c8d3758e1ed41d285beafa64c326dc11658d Reviewed-on: https://gerrit.openafs.org/15097 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2023-03-16 11:42:47 -04:00
Ben Huntsman	ca67e8c848	AIX: Avoid including net/netisr.h on AIX 7.2 and above On AIX 7.2 and higher, /usr/include/net/netisr.h includes a header <sys/libsysp.h>. This is an internal kernel library header that IBM does not ship. The include of sys/libsysp.h is new in AIX 7.2. The only information we need from net/netisr.h are the definitions for NETISR_MAX and NET_KPROC. Using the macro AFS_AIX72_ENV, defined in src/config/param.rs_aix72.h and param.rs_aix73.h, we can just provide these definitions directly and avoid the include of net/netisr.h. As part of this update we also add the macro AFS_AIX72_ENV to src/config/param.rs_aix72.h as it was missing there but is required for the edit to src/rx/AIX/rx_knet.c to work. On earlier versions of AIX, we will continue to include net/netisr.h. A case has been opened with IBM and they acknowledge the problem and are working on a solution. However, we still need to be able to deal with the situation as shipped from IBM. A future AIX 7.4 will likely be able to include net/netisr.h again. Change-Id: I9a1cbe02557cee9343d7f1a016f0223ba14076fa Reviewed-on: https://gerrit.openafs.org/15312 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2023-03-01 23:08:50 -05:00
Ben Huntsman	b713b62b33	comerr: Update rule for compile_et Update the build rule for compile_et under new platform rs_aix73. Change-Id: I29d512c20f259935632f5aba8ff6ee28ed42b37f Reviewed-on: https://gerrit.openafs.org/15311 Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com>	2023-03-01 23:08:44 -05:00
Ben Huntsman	d03723e87a	configure: Add platform rs_aix73 Enable the configure script to detect and configure AIX 7.3. Update INSTALL to reflect new platform. Change-Id: I8efe14731f7ba3a6cf65a913a2303181567ac3f1 Reviewed-on: https://gerrit.openafs.org/15310 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2023-03-01 23:08:39 -05:00

1 2 3 4 5 ...

13793 Commits