openafs

mirror of https://git.openafs.org/openafs.git synced 2025-01-18 06:50:12 +00:00

Author	SHA1	Message	Date
Stephan Wiesand	4f78b3fdf1	Correct our contributor's code of conduct There are no races. Racism does exist though. Change-Id: I0a4cde55a5f470649eb99c5d7f30c9cec86d9baa Reviewed-on: https://gerrit.openafs.org/14320 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-09-04 10:01:28 -04:00
Andrew Deason	c4f853aa00	UKERNEL: Build linktest with COMMON_CFLAGS Currently, 'linktest' in libuafs is built with a weird custom rule that specifies several various CFLAGS and LDFLAGS, etc. One side-effect of this is that linktest is built without specifying -O, even if optimization is otherwise enabled. Normally nobody would care about the optimization of linktest, since it's never supposed to be run, but this can cause an error when building with -D_FORTIFY_SOURCE=1 on some systems (such as RHEL7): In file included from /usr/include/sys/types.h:25:0, from /.../src/config/afsconfig.h:1485, from /.../src/libuafs/linktest.c:15: /usr/include/features.h:330:4: error: #warning _FORTIFY_SOURCE requires compiling with optimization (-O) [-Werror=cpp] # warning _FORTIFY_SOURCE requires compiling with optimization (-O) ^ cc1: all warnings being treated as errors make[3]: *** [linktest] Error 1 For now, to fix this just include $(COMMON_CFLAGS) in the flags we give for linktest, so $(OPTMZ) also gets pulled in, and building linktest gets a little closer to a normal compilation step. Change-Id: I3362dcfe8407825ab88854ae59da4188ed16be9d Reviewed-on: https://gerrit.openafs.org/14324 Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-09-03 23:02:00 -04:00
Jan Iven	696f2ec67b	ptserver: Remove duplicate ubik_SetLock in listSuperGroups It looks like a call to ubik_SetLock(.. LOCKREAD) was left in place in listSuperGroups after locking was moved to ReadPreamble in commit `a6d64d70` (ptserver: Refactor per-call ubik initialisation) When compiled with 'supergroups', and once contacted by "pts mem -expandgroups ..", ptserver will therefore abort() with Ubik: Internal Error: attempted to take lock twice This patch removes the superfluous ubik_SetLock. FIXES 135147 Change-Id: I8779710a6d68e4126fc482123b576690d86e4225 Reviewed-on: https://gerrit.openafs.org/14338 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-09-03 23:00:31 -04:00
Cheyenne Wills	16bae98ec5	INSTALL: document the minimum Linux kernel level The change associated with gerrit #14300 removed support for older Linux kernels (2.6.10 and earlier). The commit 'Import of code from autoconf-archive' (`d8205bbb4`) introduced a check for Autoconf 2.64. Autoconf 2.64 was released in 2009. The commit 'regen.sh: Use libtoolize -i, and .gitignore generated build-tools' (`a7cc505d3`) introduced a dependency on libtool's '-i' option. Libtool supported the '-i' option with libtool 1.9b in 2004. Update the INSTALL instructions to document a minimum Linux kernel level and the minimum levels for autoconf and libtool. Notes: RHEL4 (EOL in 2017) had a 2.6.9 kernel and RHEL5 has a 2.6.18 kernel. RHEL5 has libtool 1.5.22 and autoconf 2.59, RHEL6 has libtool 2.2.6 and autoconf 2.63, and RHEL7 has libtool 2.4.2 and autoconf 2.69. Change-Id: I235eeffa4adb152e05aab7aca839700816e62c83 Reviewed-on: https://gerrit.openafs.org/14305 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-08-28 12:24:37 -04:00
Yadavendra Yadav	b968875a34	afs: Avoid NatPing event on all connection Inside release_conns_user_server, the connection vector is traversed, and after a connection is destroyed a new eligible connection is found on which the NatPing event will be set. Ideally NatPing should be set on only one connection, but the current code sets a NatPing event on every connection to that server while traversing them. When there are many connections to a server, this can lead to a huge number of “RX_PACKET_TYPE_VERSION” packets being sent to that server. Since this happens during garbage collection of user structs, the following steps were used to simulate the issue: - Had one script “cd” to a volume mount and then sleep for a long time. - Ran an infinite while loop that called the above script using PAG-based tokens (a new connection is created for each PAG). - Instrumented the code so that the code segment where the NatPing event is set is hit; mainly, reduced NOTOKTIMEOUT to 60 sec. To fix this issue, set NatPing on one connection and, once it is set, break out of the “for” loop traversing the server connections. Change-Id: Ia38cec0403fde76cdd59aa664bd261481e2edee6 Reviewed-on: https://gerrit.openafs.org/14312 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Reviewed-by: Andrew Deason <adeason@sinenomine.net>	2020-08-28 12:10:40 -04:00
Mark Vitale	291bad659e	vos: avoid 'half-locked' volume after interrupted 'vos rename' Reported symptoms: If a 'vos rename' is interrupted after it has locked the volume and replaced the VLDB entry, but before it has unlocked the volume, the volume will remain locked. However, the locked volume will NOT be listed as locked in any vos commands that display locked status (see below for details). Background: Most vos write operations lock the VLDB volume entry before proceeding, then release the volume lock when finished. This is accomplished via VL_SetLock and VL_ReleaseLock, respectively. VL_SetLock always sets these members in the VLDB volume entry: - flags is modified to set the required VLOP_* code bit as specified - LockAFSid is set to 0 (never implemented) - LockTimestamp is set to the current time VL_ReleaseLock always sets them as follows: - flags is cleared of any VLOP_* code bit - LockAFSid is set to 0 (never implemented) - LockTimestamp is set to 0 VL_ReplaceEntry(N) may also optionally clear each of these members: - flags operation bits may be explicitly cleared via LOCKREL_OPCODE - LockAFSid may be explicitly cleared via LOCKREL_AFSID - LockTimestamp may be explicitly cleared via LOCKREL_TIMESTAMP When all 3 options are specified, VL_ReplaceEntry also does the functional equivalent of a VL_ReleaseLock. Most vos operations use this method. However, when no lock release options are specified on VL_ReplaceEntry(N), the VLDB entry is simply replaced with the supplied entry. This includes whatever flags values are specified in the supplied entry; therefore, this amounts to an additional, implicit way to set or modify the flags. 
Root cause: 'vos rename' (UV_RenameVolume) is the only vos operation that does all of the following things: - accepts a replacement volume entry that was obtained before VL_SetLock (and thus does NOT have any lock flags set) - issues VL_SetLock (which sets the lock flag in the VLDB) - issues VL_ReplaceEntry(N) with the original unlocked entry, and with no lock release options (thus with explicit intent to leave the lock flag unchanged, but inadvertently doing an implicit clear of the lock flag in the VLDB) - (performs some additional volserver work) - issues VL_ReleaseLock to release the volume lock Therefore, if 'vos rename' is cancelled or killed before reaching the final VL_ReleaseLock step, the VLDB entry is left with the lock flags cleared but the LockTimestamp still set. As we will see below, this 'half-locked' state produces confusing results from other vos commands. Detection of locked state: The 'vos lock' command (and all other vos commands that issue VL_SetLock) use the lock timestamp to determine if a volume is locked. However, several other vos commands ('vos listvldb <vol>', 'vos examine <vol>', 'vos listvldb -locked') use the VLDB entry's lock flags (not the lock timestamp) to determine if the volume is locked. Therefore, if the lock flags have been cleared but the lock timestamp is still set, these commands fail to detect that the volume is still locked. Yet an administrator's 'vos lock <volume>' will still fail with: Could not lock VLDB entry for volume <volume> VLDB: vldb entry is already locked This is the external manifestation of the 'half-locked' state. Workaround and fix: This scenario has a simple workaround: 'vos unlock <volume>'. However, to avoid this confusing outcome in the first place, modify the 'vos rename' logic so that the lock flags are no longer inadvertently cleared. Now, if the 'vos rename' is interrupted before the volume is unlocked, it will still appear locked in normal vos command output. 
Change-Id: I6cc16d20c4487de4e9a866c6f0c89d950efd2f7d Reviewed-on: https://gerrit.openafs.org/14157 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-08-28 11:34:28 -04:00
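The 'half-locked' state described in commit 291bad659e can be modeled with a small C sketch. Everything here is a hypothetical stand-in: `struct example_entry`, `EXAMPLE_VLOP_RENAME`, and the `example_*` functions are invented for illustration and are not the real VLDB entry layout, VLOP_* values, or VL_* RPCs. The `carry_lock_bit` parameter is only one way to model "the lock flags are no longer inadvertently cleared"; the actual change may differ.

```c
#include <assert.h>

/* Hypothetical lock bit; not a real VLOP_* value. */
#define EXAMPLE_VLOP_RENAME 0x10u

/* Hypothetical, heavily reduced stand-in for a VLDB volume entry. */
struct example_entry {
    unsigned int flags;   /* operation (lock) bits */
    long lock_timestamp;  /* 0 when unlocked */
};

/* Models VL_SetLock: set the operation bit and record the time. */
void example_set_lock(struct example_entry *db, unsigned int op, long now)
{
    db->flags |= op;
    db->lock_timestamp = now;
}

/* Models VL_ReplaceEntry(N) with no lock release options: the flags are
 * taken from the supplied entry, the timestamp is left untouched. */
void example_replace_entry(struct example_entry *db,
                           const struct example_entry *supplied)
{
    db->flags = supplied->flags;
}

/* How 'vos listvldb' and 'vos examine' are described as detecting a lock. */
int example_locked_by_flags(const struct example_entry *db)
{
    return (db->flags & EXAMPLE_VLOP_RENAME) != 0;
}

/* How VL_SetLock callers are described as detecting a lock. */
int example_locked_by_timestamp(const struct example_entry *db)
{
    return db->lock_timestamp != 0;
}

/* Replays a rename that is interrupted before the final release step.
 * Returns 1 if the entry ends up half-locked (timestamp set, flags clear). */
int example_interrupted_rename_is_half_locked(int carry_lock_bit)
{
    struct example_entry db = { 0u, 0L };
    struct example_entry supplied = db;  /* copy obtained before locking */

    example_set_lock(&db, EXAMPLE_VLOP_RENAME, 1598628868L);
    assert(example_locked_by_flags(&db));

    if (carry_lock_bit)
        supplied.flags |= EXAMPLE_VLOP_RENAME;  /* modeled fix */
    example_replace_entry(&db, &supplied);

    /* Interrupted here: the release step never runs. */
    return !example_locked_by_flags(&db) && example_locked_by_timestamp(&db);
}
```

Calling `example_interrupted_rename_is_half_locked(0)` reproduces the mismatch between the two detection methods; passing 1 keeps both views consistent.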
Mark Vitale	21cd26cb0d	rxgen: remove dead code hndle_param_tail Since the original IBM code import, hndle_param_tail has been dead code. It was later ifdef'd out in commit `8f2df21ffe` 'pull-prototypes-to-head-20020821' Remove the dead code from the tree. No functional change is incurred by this commit. Change-Id: I29128eecc93a5871f5bb9369c3983baf5b537beb Reviewed-on: https://gerrit.openafs.org/14322 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-08-28 11:34:16 -04:00
Marcio Barbosa	d5f0e16ac4	bos: suppress unnecessary warn if -noauth Commit `d008089a7` (Add interface to select client security objects) consolidated the code that selects the client security objects into a set of new interfaces. Before this commit, the "bos: running unauthenticated" message, which warns the user when an unauthenticated connection is established, used to be suppressed if the -noauth flag was specified. Similarly to commit `b3c16324e` (ubik: Make ugen_ClientInit honor noAuthFlag), recover the original behavior avoiding warn messages about unauthenticated connections if the -noauth flag is provided. Change-Id: Iaf0ac6bd91ea160256823512f060afc94b5926bf Reviewed-on: https://gerrit.openafs.org/14306 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-08-27 23:24:30 -04:00
Michael Meffie	904f5bd398	vlserver: fix missing read-only entries from ListAttributesN2 The ListAttributesN2() RPC can fail to list read-only entries under certain circumstances. This RPC is used by the `vos listvldb` command to retrieve vldb entries (unless the -name option is given). The `vos listvldb` command fails to list volume entries when run with the '-server' option for volumes that have read-only replicas, but have not been released. Consider the following example volume: $ vos create fs1.example.com a test $ vos addsite fs1.example.com a test $ vos addsite fs2.example.com a test $ vos listvldb ... test RWrite: 536870921 number of sites -> 3 server fs1.example.com partition /vicepa RW Site server fs1.example.com partition /vicepa RO Site -- Not released server fs2.example.com partition /vicepa RO Site -- Not released `vos listvldb` fails to find the volume when the search is limited to server 'fs2': $ vos listvldb -server fs2.example.com VLDB entries for server fs2.example.com Total entries: 0 Instead of the expected results: $ vos listvldb -server fs2.example.com test RWrite: 536870921 number of sites -> 3 server fs1.example.com partition /vicepa RW Site server fs1.example.com partition /vicepa RO Site -- Not released server fs2.example.com partition /vicepa RO Site -- Not released This situation makes it difficult to remove old server addresses from the vldb. In this situation, 'vos remaddrs' and 'vos changeaddr -remove' commands will complain the server addresses are still in use by volume entries, however running 'vos listvldb -server' will not show which volumes entries are in use. The entries are not listed for unreleased volumes because the ListAttributesN2() RPC is currently checking the volume VLF_ROEXISTS flag, instead of the server site flags (serverFlags) to determine when the entry is a read-only site. The volume VLF_ROEXISTS flag is set when a volume is released. 
To fix this, make ListAttributesN2 check for the VLSF_ROVOL site flag, instead of the VLF_ROEXISTS entry flag. Change-Id: Ib636fbe016d1d2f5b117624d9930dba83ebcef8a Reviewed-on: https://gerrit.openafs.org/14154 Reviewed-by: Andrew Deason <adeason@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-08-21 12:48:58 -04:00
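The difference between checking the entry-level flag and the per-site flag in commit 904f5bd398 can be sketched as follows. This is a simplified model, not the ListAttributesN2 code: the struct, the `EXAMPLE_*` constants, and the integer server IDs are assumptions made for the example.

```c
#include <assert.h>

/* Hypothetical stand-ins for VLF_ROEXISTS and the VLSF_* site flags. */
#define EXAMPLE_ENTRY_ROEXISTS 0x1u
#define EXAMPLE_SITE_RWVOL     0x2u
#define EXAMPLE_SITE_ROVOL     0x4u
#define EXAMPLE_MAX_SITES      4

struct example_vldb_entry {
    unsigned int flags;                          /* entry-level flags */
    int nsites;
    int server[EXAMPLE_MAX_SITES];               /* hypothetical server IDs */
    unsigned int site_flags[EXAMPLE_MAX_SITES];  /* per-site flags */
};

/* The 'test' volume from the commit message: RW and RO sites on server 1,
 * an RO site on server 2, and no release yet, so ROEXISTS is not set. */
struct example_vldb_entry example_unreleased_entry(void)
{
    struct example_vldb_entry e = {
        0u, 3, { 1, 1, 2, 0 },
        { EXAMPLE_SITE_RWVOL, EXAMPLE_SITE_ROVOL, EXAMPLE_SITE_ROVOL, 0u }
    };
    return e;
}

/* Old-style check: a read-only site counts only if the whole entry is
 * marked as having released read-only copies. */
int example_ro_site_listed_old(struct example_vldb_entry e, int server)
{
    int i;

    assert(e.nsites <= EXAMPLE_MAX_SITES);
    for (i = 0; i < e.nsites; i++) {
        if (e.server[i] == server && (e.flags & EXAMPLE_ENTRY_ROEXISTS))
            return 1;
    }
    return 0;
}

/* New-style check: look at the flag of the site itself. */
int example_ro_site_listed_new(struct example_vldb_entry e, int server)
{
    int i;

    assert(e.nsites <= EXAMPLE_MAX_SITES);
    for (i = 0; i < e.nsites; i++) {
        if (e.server[i] == server && (e.site_flags[i] & EXAMPLE_SITE_ROVOL))
            return 1;
    }
    return 0;
}
```

With the unreleased entry, the old-style check finds nothing for server 2, while the site-flag check finds the read-only site there.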
Cheyenne Wills	13a49aaf0d	LINUX 5.9: Remove HAVE_UNLOCKED_IOCTL/COMPAT_IOCTL Linux-5.9-rc1 commit 'fs: remove the HAVE_UNLOCKED_IOCTL and HAVE_COMPAT_IOCTL defines' (4e24566a) removed the two referenced macros from the kernel. The support for unlocked_ioctl and compat_ioctl were introduced in Linux 2.6.11. Remove references to HAVE_UNLOCKED_IOCTL and HAVE_COMPAT_IOCTL using the assumption that they were always defined. Notes: With this change, building against kernels 2.6.10 and older will fail. RHEL4 (EOL in March 2017) used a 2.6.9 kernel. RHEL5 uses a 2.6.18 kernel. In linux-2.6.33-rc1 the commit messages for "staging: comedi: Remove check for HAVE_UNLOCKED_IOCTL" (00a1855c) and "Staging: comedi: remove check for HAVE_COMPAT_IOCTL" (5d7ae225) both state that all new kernels have support for unlocked_ioctl/compat_ioctl so the checks can be removed along with removing support for older kernels. Change-Id: Idd2716f3573ea455f8a5e1535bca584af0787717 Reviewed-on: https://gerrit.openafs.org/14300 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Benjamin Kaduk <kaduk@mit.edu>	2020-08-21 12:32:45 -04:00
Michael Meffie	f5051b87a5	vos: avoid CreateVolume when restoring over an existing volume Currently, the UV_RestoreVolume2 function always attempts to create a new volume, even when doing an incremental restore over an existing volume. When the volume already exists, the volume creation operation fails on the volume server with a VVOLEXISTS error. The client will then attempt to obtain a transaction on the existing volume. If a transaction is obtained, the incremental restore operation will proceed. If a full restore is being done, the existing volume is removed and a new empty volume is created. Unfortunately, the failed volume creation is logged by the volume server, and so litters the log file with: Volser: CreateVolume: Unable to create the volume; aborted, error code 104 To avoid polluting the volume server log with these messages, reverse the logic in UV_RestoreVolume2. Assume the volume already exists and try to get the transaction first when doing an incremental restore. Create a new volume if the transaction cannot be obtained because the volume is not present. When doing a full restore, remove the existing volume, if one exists, and then create a new empty volume. Change-Id: I8bdc13130d12c81cd2cd18a9484852708cac64d7 Reviewed-on: https://gerrit.openafs.org/14208 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Marcio Brito Barbosa <marciobritobarbosa@gmail.com> Tested-by: Marcio Brito Barbosa <marciobritobarbosa@gmail.com> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-08-20 23:02:51 -04:00
Michael Meffie	624219a1b2	tests: Accommodate c-tap-harness 4.7 The SOURCE and BUILD environment variables have been changed to C_TAP_SOURCE and C_TAP_BUILD in the new version of c-tap-harness. The runtests command syntax has changed as well. Convert all of the old SOURCE and BUILD environment variables to the new C_TAP_SOURCE and C_TAP_BUILD names. Add the required -l command line option to specify the test list. Add the new runtests -v option to run the tests in verbose mode to make it easier to see which tests failed. Change-Id: I209a6dc13d6cd1507519234fce1564fc4641e70b Reviewed-on: https://gerrit.openafs.org/14295 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-08-20 22:36:56 -04:00
Russ Allbery	3f377aa117	Import of code from c-tap-harness This commit updates the code imported from c-tap-harness to abdb66561ffd4d2f238fdb06f448ccf09d80c059 (release/4.7) Upstream changes are: Daniel Collins (1): Add is_blob() test function. Daniel Kahn Gillmor (1): LICENSE: use https for all URLs Daria Brashear (1): Add verbose mode environment variable to runtests Julien ÉLIE (2): Document -v in usage and comments of runtests Avoid realloc of zero length in tests/runtests.c Marc Dionne (1): Add test_cleanup_register_with_data Russ Allbery (115): clang --analyze cleanups for runtests Modernize POD tests Update README to my current layout Explicitly note that test programs must be executable Fix comment typo in tests/runtests.c Switch to a copyright-format 1.0 LICENSE file Flush harness output after each line Show the test count as ? when the plan is deferred More correctly backspace over test counts when aborting Refactor test list handling Allow passing tests on the runtests command line Don't allow command-line arguments if a list was given Search for tests under the name given as well Release 2.0 Fix backward incompatibility when searching for tests Document decision to ignore TAP version directives Release 2.1 Document different runtests behavior in bail handling Change exit status of bail to 255 Release 2.2 Add a new test_cleanup_register C API Add warn_unused_result attributes Add portability for warn_unsed_result attributes to tap/macros.h Minor coding style fix (spacing) in runtests.c Split the runtests usage string for ISO C90 string limits Include stddef.h Diagnose failure to register the exit handler Use diag internally in the basic C TAP library Some additional comments about cleanup functions Move repetitive printing code in the C TAP library to a macro Set a flag when bailing for more correct cleanup Change my email address to eagle@eyrie.org Release 2.3 Add diag_file_add and diag_file_remove functions Don't die for unknown files passed to 
diag_file_remove Release 2.4 Update comment about AIX and WCOREDUMP Don't test for NULL before calling free Be more careful about file descriptors in child processes Run cleanup functions in non-primary processes as well Release 3.0 Update collective package copyright notices at start of LICENSE Check integer overflows on memory allocation, fix string creation Switch POD spelling test to use Lancaster consensus variable Add new bnrealloc API for brealloc with checked multiplication Rename nrealloc to reallocarray Return the test status from test functions Fix the overflow check for breallocarray Fix the overflow check for xreallocarray in runtests Restructure test result reallocation in runtests Change diag and sysdiag to always return true Release 3.1 Fix typos in basic.c and basic.h Fix usage message when running runtests with no arguments Update introductory runtests comments for current syntax Add the -l flag to suggested runtests invocation in README Support comments and blank lines in test lists Release 3.2 Update licensing information Various improvements to verbose support Compile warning-free with Clang, check Autoconf macros Release 3.3 Remove unnecessary assert.h include in tap/basic.c Fix some additional -v documentation issues Rebalance usage to avoid too-long strings Fix segfault in runtests with empty test list Release 3.4 Document running autogen if starting from Git Rename autogen to bootstrap Support and prefer C_TAP_SOURCE and C_TAP_BUILD Fix comment typo in tests/runtests.c Add missing va_end to is_double Release 4.0 Fix all non-https www.eyrie.org URLs Add is_bool C test function Add DocKnot metadata and a Markdown README file Update documentation for new DocKnot standards Release 4.1 Use more defaults from DocKnot templates Fix new fall-through warning in GCC 7 Use compiler warnings from rra-c-util, fix issues Merge pull request #4 from solemnwarning/master Coding style fixes and NEWS for is_blob Re-enable -Wunknown-pragmas for GCC Avoid 
zero-length realloc allocations in breallocarray Update copyright date on tests/runtests.c Release 4.2 Add SPDX-License-Identifier headers to source files Add and run new check-cppcheck target Fix instructions for running one test Identify values as left and right Fix is_string comparisons with NULL pointers Add support for running tests under valgrind Replace putc with fprintf Update shared files from rra-c-util Release 4.3 Update NEWS date for 4.3 release Collapse some copyright dates NEWS and coding style for test_cleanup_register_with_data Remove unused variables caught by Clang scan-build Update to rra-c-util 8.0 Fix error checking in bstrndup Release 4.4 Add support for C++ Document that C TAP Harness can be built as C++ Release 4.5 Regenerate README files Reformat using clang-format 10 Update to rra-c-util 8.1 Release 4.6 Fix spelling errors caught by codespell Protect the test suite against C_TAP_VERBOSE Switch to GitHub Actions for CI Add NEWS entry for GCC 10 warning fixes Release 4.7 Change-Id: I5a78215bf99b53bd848f0fa6bb9092deab38f24e Reviewed-on: https://gerrit.openafs.org/14294 Reviewed-by: Andrew Deason <adeason@sinenomine.net> Tested-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-08-20 22:36:50 -04:00
Andrew Deason	eccd4b9778	afs: Always define our own osi_timeval32_t Since OpenAFS 1.0, osi_GetTime has taken a timeval-like pointer, which contains 32-bit fields (the actual type has been called either osi_timeval_t or osi_timeval32_t over time). For platforms that have a native timeval-like type with 32-bit fields, we just define osi_timeval32_t to that type, and elsewhere we define our own struct to be osi_timeval32_t. For platforms that use the native timeval, we can then define osi_GetTime() to just be, e.g., microtime(). This approach is difficult to maintain, though, because we must keep track of whether 'struct timeval' contains 32-bit fields on each platform, which can depend on many factors. It's easy to make mistakes (the current tree already contains mistakes), and there's not much benefit. To avoid all of this, just always define osi_timeval32_t to be our own struct with afs_int32 fields, and provide definitions for osi_GetTime that convert from the native time struct to our osi_timeval32_t. This does mean that for some platforms we do an unnecessary type conversion, but this is a small price to pay for more straightforward and maintainable code. To be a little more sure that our types are correct, change osi_GetTime to be defined as an inline function instead of a macro. At the same time, do a similar conversion for the KERNEL implementation of the rx clock_GetTime function. Get rid of platform-specific mess, and do a straightforward type conversion between osi_timeval32_t and struct clock in an inline function. Change-Id: I18819acb556a2a7f1b6da6994db9783c48108934 Reviewed-on: https://gerrit.openafs.org/14238 Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com>	2020-08-07 12:10:44 -04:00
Andrew Deason	a5c3dfe99f	afs: Move osi_GetTime out of param.h Most platforms currently #define osi_GetTime in their param.h. This is really redundant, since the definition of osi_GetTime almost never changes for a given platform, so we end up with many copies of the same osi_GetTime definition for a given platform. Move osi_GetTime out of param.h for these platforms, and define it in osi_machdep.h instead, which is where most platform-specific definitions go. For DFBSD, we don't have an osi_machdep.h at all yet, so create a new one to contain the osi_GetTime definition. Currently we don't build libafs at all on DFBSD, but do this anyway so we don't lose the existing osi_GetTime definition. For NBSD, we were providing (conflicting!) definitions for osi_GetTime in param.h and in osi_machdep.h. Just remove the definitions in param.h, since those should have been getting overridden by the osi_machdep.h definition. Change-Id: I7097d9fe2fcd38c06ecc275e8fe3a2c69c9d0436 Reviewed-on: https://gerrit.openafs.org/14237 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-08-07 11:44:40 -04:00
Cheyenne Wills	c56873bf95	afs: Avoid using logical OR when setting f_fsid Building with clang-10 produces the warning/error message warning: converting the result of '<<' to a boolean always evaluates to true [-Wtautological-constant-compare] for the expression abp->f_fsid = (AFS_VFSMAGIC << 16) || AFS_VFSFSID; The message is because a logical OR '||' is used instead of a bitwise OR '|'. The result of this expression will always set the f_fsid member to a 1 and not the intended value of AFS_VFSMAGIC combined with AFS_VFSFSID. Update the expression to use a bitwise OR instead of the logical OR. Note: This will change value stored in the f_fsid that is returned from statfs. Using a logical OR has existed since OpenAFS 1.0 for hpux/solaris and in UKERNEL since OpenAFS 1.5 with the commit 'UKERNEL: add uafs_statvfs' `b822971a`. Change-Id: I3e85ba48058ac68e3e3ac7f277623f660187926c Reviewed-on: https://gerrit.openafs.org/14292 Reviewed-by: Andrew Deason <adeason@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-07-31 22:18:50 -04:00
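The effect of the logical OR in commit c56873bf95 is easy to see in isolation. The sketch below uses made-up values (`EXAMPLE_VFSMAGIC`, `EXAMPLE_VFSFSID`) because the real AFS_VFSMAGIC and AFS_VFSFSID definitions vary by platform and are not shown in the log; the function names are also hypothetical.

```c
#include <assert.h>

/* Hypothetical values, chosen only for illustration. */
#define EXAMPLE_VFSMAGIC 0x1234u
#define EXAMPLE_VFSFSID  99u

/* Logical OR: the result is a boolean, so any nonzero operand yields 1. */
unsigned int example_fsid_logical_or(unsigned int magic, unsigned int fsid)
{
    assert(magic <= 0xffffu);  /* keep the shift within 32 bits */
    return (magic << 16) || fsid;
}

/* Bitwise OR: the magic lands in the high half, the ID in the low half. */
unsigned int example_fsid_bitwise_or(unsigned int magic, unsigned int fsid)
{
    assert(magic <= 0xffffu);
    return (magic << 16) | fsid;
}
```

With the example values, the logical form collapses to 1, while the bitwise form yields 0x12340063.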
Cheyenne Wills	446457a124	afs: Set AFS_VFSFSID to a numerical value Currently when UKERNEL is defined, AFS_VFSFSID is always set to AFS_MOUNT_AFS, which is a string for many platforms for UKERNEL. Update src/afs/afs.h to ensure that the define for AFS_VFSFSID is a numeric value when building UKERNEL. Clean up the preprocessor indentation in src/afs/afs.h in the area around the AFS_VFSFSID defines. Thanks to adeason@sinenomine.net for pointing out a much easier solution for resolving this problem. Change-Id: I618fc4c89029a6cca2ca6f530b8f65399299a9d1 Reviewed-on: https://gerrit.openafs.org/14279 Reviewed-by: Andrew Deason <adeason@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-07-31 22:18:38 -04:00
Cheyenne Wills	e5f44f6e9a	clang-10: ignore fallthrough warning in generated code Clang-10 will not recognize '/* fall through */' as an indicator to turn off the fallthrough warning due to the lack of a 'break' in a case statement. Code generated by flex uses the '/* fall through */' comments to turn off compiler warnings for fallthroughs in case statements. For code generated by flex, ignore the implicit-fallthrough via pragma or disable the warning via a compile time flag. Add new env variable "CFLAGS_NOIMPLICIT_FALLTHROUGH" to selectively disable the compile check in Makefiles when checking is enabled. Change-Id: I4c054defda03daa2aeb645ae2271dfa0cb54925f Reviewed-on: https://gerrit.openafs.org/14275 Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com>	2020-07-27 12:23:15 -04:00
Cheyenne Wills	16f1b2f894	clang-10: use AFS_FALLTHROUGH for case fallthrough Clang-10 will not recognize '/* fallthrough */' as an indicator to turn off the fallthrough diagnostic due to the lack of a 'break' in a case statement. Clang-10 requires the '__attribute__((fallthrough))' statement to disable the diagnostic. In addition clang-10 is finding additional locations where fall throughs occur. Determine if the compiler supports '__attribute__((fallthrough))' to disable the implicit fallthrough diagnostic. Define a new macro 'AFS_FALLTHROUGH' that will disable the fallthrough diagnostic. Set it as a wrapper for the Linux kernel's 'fallthrough' macro if available, otherwise set it as a wrapper macro for '__attribute__((fallthrough))' if the compiler supports it. Update CODING to document the use of AFS_FALLTHROUGH when needing to fallthrough between case statements. Replace the '/* fallthrough */' comments with AFS_FALLTHROUGH, and add AFS_FALLTHROUGH as needed. Replace some fallthroughs with a break (or goto) if the flow was just to a break (or goto), e.g. 'case x: somestmt; case y: break;' becomes 'case x: somestmt; break; case y: break;'. Correct a mis-indented brace '}' in src/WINNT/afsd/smb3.c Note, the clang maintainers have rejected the use of comments as a flag to turn off the fall through warnings. Change-Id: Ia5da10fc14fc1874baca035a3cf471e618e0d5f5 Reviewed-on: https://gerrit.openafs.org/14274 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-07-27 12:20:50 -04:00
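A wrapper macro of the kind commit 16f1b2f894 describes might look roughly like the sketch below. It is not the actual AFS_FALLTHROUGH definition: the commit detects compiler support at configure time and also wraps the Linux kernel's 'fallthrough' macro, whereas this example uses a hypothetical `EXAMPLE_FALLTHROUGH` name and an in-source `__has_attribute` test as an assumed substitute for that detection.

```c
#include <assert.h>

/* Hypothetical macro: use the attribute where the compiler reports
 * support for it, otherwise fall back to an empty statement. */
#if defined(__has_attribute)
# if __has_attribute(fallthrough)
#  define EXAMPLE_FALLTHROUGH __attribute__((fallthrough))
# endif
#endif
#ifndef EXAMPLE_FALLTHROUGH
# define EXAMPLE_FALLTHROUGH do { } while (0)
#endif

/* Counts how many stages still have to run, starting at 'stage'.
 * Each case intentionally falls through to the next one. */
int example_stages_remaining(int stage)
{
    int n = 0;

    assert(stage >= 0);
    switch (stage) {
    case 0:
        n++;
        EXAMPLE_FALLTHROUGH;
    case 1:
        n++;
        EXAMPLE_FALLTHROUGH;
    case 2:
        n++;
        break;
    default:
        break;
    }
    return n;
}
```

The macro is written as a statement followed by a semicolon directly before the next case label, which is where the attribute form is expected to appear.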
Michael Meffie	e61ab9353e	redhat: Add make to the dkms-openafs pre-requirements If `make` is not installed before dkms-openafs, the OpenAFS kernel module is not built during the dkms-openafs package installation. The failure happens in the "checking if linux kernel module build works" configure step, which invokes `make` to check the linux buildsystem. configure fails when `make` is not available, and gives the unhelpful suggestion (in this case) of configuring with --disable-kernel module. Running the configure.log in the dkms build directory shows: configure:7739: checking if linux kernel module build works make -C /lib/modules/4.18.0-193.6.3.el8_2.x86_64/build M=/var/lib/dkms/openafs/... ./configure: line 7771: make: command not found configure: failed using Makefile: Avoid this build failure by adding `make` to the list of dkms-openafs package pre-requirements. Change-Id: I98b3508341eea1df4fa7b6f43e88add1bda9ee2c Reviewed-on: https://gerrit.openafs.org/14266 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-07-26 22:16:58 -04:00
Andrew Deason	2d01f35d05	vol: Blank opts in VOptDefaults Instead of needing to set every single field in the 'opts' structure individually, blank the whole thing to make sure the entire struct is initialized. Remove the now-redundant lines that initialize various items to 0. Change-Id: I799cdb55becd66a8f3d6ec2f81338843038d0abd Reviewed-on: https://gerrit.openafs.org/14280 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Kailas Zadbuke <kailashsz@in.ibm.com> Reviewed-by: Yadavendra Yadav <yadayada@in.ibm.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-07-24 12:05:51 -04:00
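The pattern in commit 2d01f35d05, blanking the whole structure before setting the non-zero defaults, can be sketched as below. The struct and its fields are hypothetical and do not reflect the real 'opts' structure used by VOptDefaults.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical options structure; the field names are invented. */
struct example_opts {
    int cache_size;
    int check_interval;
    int verbose;
    int reserved;
};

/* Zero every field first, then set only the defaults that are non-zero.
 * A field added later starts at 0 without needing its own line here. */
void example_opt_defaults(struct example_opts *opts)
{
    assert(opts != 0);
    memset(opts, 0, sizeof(*opts));
    opts->cache_size = 400;      /* hypothetical non-zero default */
    opts->check_interval = 60;   /* hypothetical non-zero default */
}

/* Fills a struct with garbage, applies the defaults, and reports whether
 * the fields that were not set explicitly came out as zero. */
int example_defaults_are_clean(void)
{
    struct example_opts opts;

    memset(&opts, 0xff, sizeof(opts));
    example_opt_defaults(&opts);
    return opts.cache_size == 400 && opts.check_interval == 60
        && opts.verbose == 0 && opts.reserved == 0;
}
```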
Andrew Deason	4498bd8179	volser: Don't NUL-pad failed pread()s in dumps Currently, the volserver SAFSVolDump RPC and the 'voldump' utility handle short reads from pread() for vnode payloads by padding the missing data with NUL bytes. That is, if we request 4k of data for our pread() call, and we only get back 1k of data, we'll write 1k of data to the volume dump stream followed by 3k of NUL bytes, and log messages like this: 1 Volser: DumpFile: Error reading inode 1234 for vnode 5678 1 Volser: DumpFile: Null padding file: 3072 bytes at offset 40960 This can happen if we hit EOF on the underlying file sooner than expected, or if the OS just responds with fewer bytes than requested for any reason. The same code path tries to do the same NUL-padding if pread() returns an error (for example, EIO), padding the entire e.g. 4k block with NULs. However, in this case, the "padding" code often doesn't work as intended, because we compare 'n' (set to -1) with 'howMany' (set to 4k in this example), like so: if (n < howMany) Here, 'n' is signed (ssize_t), and 'howMany' is unsigned (size_t), and so compilers will promote 'n' to the unsigned type, causing this conditional to fail when n is -1. As a result, all of the relevant log messages are skipped, and the data in the dumpstream gets corrupted (we skip a block of data, and our 'howFar' offset goes back by 1). So this can result in rare silent data corruption in volume dumps, which can occur during volume releases, moves, etc. To fix all of this, remove this bizarre NUL-padding behavior in the volserver. Instead: - For actual errors from pread(), return an error, like we do for I/O errors in most other code paths. - For short reads, just write out the amount of data we actually read, and keep going. - For premature EOF, treat it like a pread() error, but log a slightly different message. 
For the 'voldump' utility, the padding behavior can make sense if a user is trying to recover volume data offline in a disaster recovery scenario. So for voldump, add a new switch (-pad-errors) to enable the padding behavior, but change the default behavior to bail out on errors. Change-Id: Ibd6e76c5ea0dea95e3354d9b34536296f81b4f67 Reviewed-on: https://gerrit.openafs.org/14255 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-07-24 12:03:44 -04:00
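The signed/unsigned comparison described in the volser commit above can be reproduced in isolation. This is a minimal sketch, not the volserver code: `needs_padding_buggy`, `classify_read`, and the `read_status` enum are hypothetical names, and the explicit `(size_t)` cast only spells out the conversion that the commit message says compilers apply implicitly to `n < howMany`. The sketch assumes `howMany` is greater than zero.

```c
#include <stddef.h>     /* size_t */
#include <sys/types.h>  /* ssize_t (POSIX) */

/*
 * Hypothetical stand-in for the old check. The cast makes visible what
 * happens to 'n < howMany' when n is ssize_t and howMany is size_t:
 * -1 is converted to SIZE_MAX, so the comparison is false for errors.
 */
int needs_padding_buggy(ssize_t n, size_t howMany)
{
    return (size_t)n < howMany;
}

/* Hypothetical outcome categories, following the three cases in the commit. */
enum read_status { READ_FULL, READ_SHORT, READ_EOF, READ_ERROR };

/*
 * Check the sign first, then compare the non-negative byte count as an
 * unsigned value. Assumes howMany > 0.
 */
enum read_status classify_read(ssize_t n, size_t howMany)
{
    if (n < 0)
        return READ_ERROR;      /* real pread() error: report it */
    if (n == 0)
        return READ_EOF;        /* premature EOF: treated like an error */
    if ((size_t)n < howMany)
        return READ_SHORT;      /* short read: write n bytes and continue */
    return READ_FULL;
}
```

With a 4k request, `needs_padding_buggy(-1, 4096)` evaluates to 0, which is why the old error path was skipped, while `classify_read(-1, 4096)` yields `READ_ERROR`.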
Cheyenne Wills	37b55b30c6	butc: fix int to float conversion warning Building with clang-10 results in 2 warnings/errors associated with trying to convert 0x7fffffff to a floating point value. tcmain.c:240:18: error: implicit conversion from 'int' to 'float' changes value from 2147483647 to 2147483648 [-Werror, -Wimplicit-int-float-conversion] if ((total > 0x7fffffff) || (total < 0)) /* Don't go over 2G */ and the same conversion warning on the statement on the following line: total = 0x7fffffff; Use floating point and decimal constants instead of the hex constants. For the test, use 2147483648.0 which is cleanly represented by a float. Change the comparison in the test from '>' to '>='. If the total value exceeds 2G, just assign the max value directly to the return variable. Change-Id: I79b2afa006496a756bd7b50976050c24827aa027 Reviewed-on: https://gerrit.openafs.org/14277 Reviewed-by: Andrew Deason <adeason@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-07-24 11:52:32 -04:00
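The clamping approach described in the butc commit above can be sketched as follows. This is an illustration only, not the code in tcmain.c: the function name `clamp_total` is hypothetical, and the range test is written in negated form so that a NaN input also takes the saturating branch, which is an addition of this sketch.

```c
#include <stdint.h>

/*
 * Hypothetical clamp for a float total that must fit in a signed 32-bit
 * result. 2147483648.0f (2^31) is exactly representable as a float, while
 * 2147483647 is not, so the bound is expressed as '>= 2^31' rather than
 * '> 2^31 - 1'.
 */
int32_t clamp_total(float total)
{
    /* Anything outside [0, 2^31), including NaN, saturates. */
    if (!(total >= 0.0f && total < 2147483648.0f))
        return INT32_MAX;       /* assign the maximum directly to the result */

    return (int32_t)total;      /* in range, so the conversion is well defined */
}
```

Returning the integer maximum directly avoids ever storing 0x7fffffff in a float, which is the second conversion the compiler complained about.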
Cheyenne Wills	899b1af418	autoconf: fix detection for fallthrough attribute Due to bug <https://savannah.gnu.org/patch/?9949>, ax_gcc_func_attribute.m4 fails to properly detect __attribute__((fallthrough)) in clang. Until this is fixed in autoconf-archive upstream, fix our local copy of ax_gcc_func_attribute.m4, so we can detect __attribute__((fallthrough)) to make --enable-checking work with clang. Change-Id: I80a4557384f8e1438344e48bfe722e20c8773882 Reviewed-on: https://gerrit.openafs.org/14273 Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com>	2020-07-24 11:37:38 -04:00
Cheyenne Wills	88da6b4dfa	cf: Make local copy of ax_gcc_func_attribute.m4 Make a local copy of ax_gcc_func_attribute from autoconf-archive. This is needed in order to fix a bug in the detection of the fallthrough attribute. Remove ax_gcc_func_attribute.m4 from src/external/autoconf-archive/m4. Update LICENSE file to point to the local copy in src/cf. Change-Id: I6c4244d2cd4edab4262c1820435c00419d85303b Reviewed-on: https://gerrit.openafs.org/14272 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-07-24 08:35:59 -04:00
Mark Vitale	bb5397e4c4	rx: prevent leakage of non-cached rx_connections (pthread) The rxi_connectionCache (AFS_PTHREAD_ENV only) allows applications to reuse rx_connection structs. Cached rx_connections are obtained via rx_GetCachedConnection and released via rx_ReleaseCachedConnection. This feature is used most heavily by libadmin and kauth, but there are other users in the tree as well. For instance, ubikclient routines ubik_ClientInit and ubik_ClientDestroy call rx_ReleaseCachedConnections (if AFS_PTHREAD_ENV) when disposing of their rx_connections. Unfortunately, in many cases these rx_connections were obtained via rx_NewConnection, _not_ from the cache via rx_GetCachedConnection. In those cases, rx_ReleaseCachedConnection will not find the rx_connection in the rxi_connectionCache, and thus it returns without doing anything. Therefore, when ubik_ClientInit is passed an existing ubik_client (for re-initialization) that contains rx_connections NOT allocated via rx_GetCachedConnection, those connections are not destroyed, but will be silently leaked. Similarly, ubik_ClientDestroy will leak its rx_connections when it frees the ubik_client struct. For example, the fileserver host package calls ubik_ClientInit (via hpr_Initialize) and ubik_ClientDestroy (via hpr_End) to manage connections to the ptserver. However, these connections were obtained via rx_NewConnection, not rx_GetCachedConnection. If the fileserver has a failed call to the ptserver that sets prfail=1, the next RPC scheduled for that client (in CallPreamble) will refresh the thread's ubik_client (viced_uclient_key) by calling hprEnd -> ubik_ClientDestroy -> rx_ReleaseCachedConnection. The "released" connections will be leaked. This problem exists in all versions of OpenAFS going back to IBM 1.0. Starting with 1.8.x, many components that were formerly LWP-only are now pthreaded and thus susceptible to this leak. 
It seems difficult and error-prone to identify all possible code paths that may pass a non-cached rx_connection to rx_ReleaseCachedConnection, and convert them to obtain connections via rx_GetCachedConnection. Instead, prevent all existing and future leaks by modifying the connection cache to: - flag all rx_connections it allocates - correctly release any rx_connection it is passed, whether they came from the cache or not. Change-Id: Ibe164ccd30a8ddd799438c28fd6e1d8a0a9040dd Reviewed-on: https://gerrit.openafs.org/13042 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com>	2020-07-23 23:42:20 -04:00
Mark Vitale	55fca11421	rx: fix out-of-range value for RX_CONN_NAT_PING Commit `496fb87372` ("rx: avoid nat ping until connection is attached") introduced functionality to defer turning on NAT ping for server connections until after reachability had been established for the client. Unfortunately, this feature could never work correctly because it assigned an out-of-range flag value of 256 (0x100) for the u_char flags field. Instead of calling this out as an error, both gcc and Solaris cc elide this flag so that it is never set in rx_SetConnSecondsUntilNatPing(). Furthermore, the test in rxi_ConnClearAttachWait() will always fail; therefore rxi_ScheduleNatKeepAliveEvent is never called after attach wait has ended. Fortunately, this bug is currently moot - not actually exposed in OpenAFS. (It was discovered by inspection). This is because there are currently no rx_connection objects in the tree that have both NAT ping and checkReach (rx_SetCheckReach) enabled. I also searched git history and found no time when this bug could ever have been exposed. This does raise the question of why the original commit was needed; but instead of reverting the original commit, this commit attempts to fix it. To prevent problems if NAT ping and checkReach are ever both enabled for an rx_connection, enlarge the rx_connection flags member so that the RX_CONN_NAT_PING value is no longer out of range. Change-Id: Ib667ece632f66fa5c63a76398acb3153fed6f9c3 Reviewed-on: https://gerrit.openafs.org/13041 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-07-23 23:06:14 -04:00
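The effect of an out-of-range flag in an 8-bit field, as described in the rx commit above, can be shown with a small standalone sketch. None of this is rx code: the struct and function names are hypothetical, `DEMO_CONN_NAT_PING` merely reuses the value 0x100 quoted in the commit message, and `unsigned short` for the widened field is an assumption made for illustration (the message does not say which type was chosen). The sketch also assumes 8-bit chars.

```c
/* Flag value quoted in the commit message: 256 (0x100). */
#define DEMO_CONN_NAT_PING 0x100

/* Hypothetical connection structs; only the width of 'flags' differs. */
struct narrow_conn { unsigned char flags; };    /* 8-bit field, as before the fix */
struct wide_conn   { unsigned short flags; };   /* wider field (type is an assumption) */

void narrow_set_nat_ping(struct narrow_conn *c)
{
    /* Bit 8 does not fit in 8 bits, so the stored value is unchanged. */
    c->flags = (unsigned char)(c->flags | DEMO_CONN_NAT_PING);
}

int narrow_has_nat_ping(const struct narrow_conn *c)
{
    /* An 8-bit value never has bit 8 set, so this is always 0. */
    return (c->flags & DEMO_CONN_NAT_PING) != 0;
}

void wide_set_nat_ping(struct wide_conn *c)
{
    c->flags = (unsigned short)(c->flags | DEMO_CONN_NAT_PING);
}

int wide_has_nat_ping(const struct wide_conn *c)
{
    return (c->flags & DEMO_CONN_NAT_PING) != 0;
}
```

In the narrow variant both the set and the test silently do nothing, which matches the behaviour the commit message attributes to the old field width.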
Andrew Deason	d231134aad	auth: Avoid cellconfig.c stdio renaming Since commit `35777145` (solaris-fopen-sucks-20060916), cellconfig.c has redirected fopen, fclose, and fgets to local functions on non-64bit-sparc Solaris, in order to work around that platform's stdio limitations. Commit `7c431f7571` (auth: retire writeconfig.c) moved the contents of writeconfig.c into cellconfig.c. The previous writeconfig.c contained some calls to stdio, including calling fprintf() on a pointer returned by fopen() in that file. Because fopen() was redirected to our local version, this means that afsconf_SetExtendedCellInfo() calls fopen() to get an afsconf_iobuffer, and passes that pointer to the real system fprintf() later on (instead of a native FILE). The compiler does warn about this, but this only happens on Solaris, where --enable-checking is not implemented, so the build never fails. To avoid this, remove the #defines for fopen, fgets, and fclose. Instead, change all of the old cellconfig.c callers to explicitly call afsconf_fopen, afsconf_fgets, and afsconf_fclose. On the affected Solaris platforms, we keep our local definitions, and for other platforms, we just make those functions call their system stdio equivalents. For the code that was pulled in from writeconfig.c, callers will just call the system fopen, fprintf, and fclose. We still keep our local afsconf_FILE* definition on all platforms, so the compiler will still do typechecking for our local afsconf_f* functions on all platforms. So now if we make a mistake, it should be a mistake on all platforms, so platforms with --enable-checking should flag the error. 
Change-Id: I4064d7f5ee82d5acab04a33b01c0603564a391e8 Reviewed-on: https://gerrit.openafs.org/14214 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Tested-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-07-13 16:49:50 -04:00
Andrew Deason	cd65475e95	afs: Let afs_ShakeLooseVCaches run longer Currently, when afs_ShakeLooseVCaches runs osi_TryEvictVCache, we check if osi_TryEvictVCache slept (i.e. dropped afs_xvcache/GLOCK). If we sleep over 100 times, then we stop trying to evict vcaches and return. If we have recently accessed a lot of AFS files, this limitation can severely reduce our ability to keep our number of vcaches limited to a reasonable size. For example: Say a Linux client runs a process that quickly accesses 1 million files (a simple 'find' command) and then does nothing else. A few minutes later, afs_ShakeLooseVCaches is run, but since all of the newly accessed vcaches have dentries attached to them, we will sleep on each one in order to try to prune the attached dentries. This means that afs_ShakeLooseVCaches will evict 100 vcaches, and then return, leaving us with still almost 1 million vcaches. This will happen repeatedly until afs_ShakeLooseVCaches finally works its way through all of the vcaches (which takes quite a while, if we only clear 100 at once), or the dentries get pruned by other means (such as, if Linux evicts them due to memory pressure). The limit of 100 sleeps was originally added in commit `29277d96` (newvcache-dont-spin-20060128), but the current effect of it was largely introduced in commit `9be76c0d` (Refactor afs_NewVCache). It exists to ensure that afs_ShakeLooseVCaches doesn't take forever to run, but the limit of 100 sleeps may seem quite low, especially if those 100 sleeps run very quickly. To avoid the situation described above, instead of limiting afs_ShakeLooseVCaches based on a fixed number of sleeps, limit it based on how long we've been running, and set an arbitrary limit of roughly 3 seconds. Only check how long we've been running after 100 sleeps like before, so we're not constantly checking the time while running. 
Log a new warning if we exit afs_ShakeLooseVCaches prematurely if we've been running for too long, to help indicate what is going on. Change-Id: I65729ace748e8507cc0d5c26dec39e74d7bff5d2 Reviewed-on: https://gerrit.openafs.org/14254 Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-07-10 01:27:45 -04:00
Andrew Deason	9ff45e73cf	afs: Skip bulkstat if stat cache looks full Currently, afs_lookup() will try to prefetch dir entries for normal dirs via bulkstat whenever multiple pids are reading that dir. However, if we already have a lot of vcaches, ShakeLooseVCaches may be struggling to limit the vcaches we already have. Entering afs_DoBulkStat can make this worse, since we grab afs_xvcache repeatedly, we may kick out other vcaches, and we'll possibly create 30 new vcaches that may not even be used before they're evicted. To try to avoid this, skip running afs_DoBulkStat if it looks like the stat cache is really full. Change-Id: I1634530170a189f32cb962dd7df28f88bc758b71 Reviewed-on: https://gerrit.openafs.org/13256 Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-07-10 01:16:27 -04:00
Andrew Deason	0532f917f2	afs: Log warning when we detect too many vcaches Currently, afs_ShakeLooseVCaches has a kind of warning that is logged when we fail to free up any vcaches. This information can be useful to know, since it may be a sign that users are trying to access way more files than our configured vcache limit, hindering performance as we constantly try to evict and re-create vcaches for files. However, the current warning is not clear at all to non-expert users, and it can only occur for non-dynamic vcaches (which is uncommon these days). To improve this, try to make a general determination if it looks like the stat cache is "stressed", and log a message if so after afs_ShakeLooseVCaches runs (for all platforms, regardless of dynamic vcaches). Also try to make the message a little more user-friendly, and only log it (at most) once per 4 hours. Determining whether the stat cache looks stressed or not is difficult and arguably subjective (especially for dynamic vcaches). This commit draws a few arbitrary lines in the sand to make the decision, so at least something will be logged in the cases where users are constantly accessing way more files than our configured vcache limit. Change-Id: I022478dc8abb7fdef24ccc06d477b349cca759ac Reviewed-on: https://gerrit.openafs.org/13255 Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-07-10 01:15:17 -04:00
Mark Vitale	42fb8786a8	viced: propagate return from CleanupTimedOutCallBacks_r The fileserver's FiveMinuteCheckLWP periodically calls CleanupTimedOutCallBacks, and logs an informational message if the return code indicates that any callbacks were discarded. However, since the original IBM code import, CleanupTimedOutCallBacks has 1) ignored the return value from CleanupTimedOutCallBacks_r and 2) unconditionally returned 0. This makes the informational message essentially dead code. Instead, check the code from CleanupTimedOutCallBacks_r and pass it back to the caller. Change-Id: I631831c398e43431b79f4a3a0c6f01307ac0c05e Reviewed-on: https://gerrit.openafs.org/14256 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-07-10 00:53:12 -04:00
Andrew Deason	f9d20c631d	LINUX: Close cacheFp if no ->readpage in fastpath In afs_linux_readpage_fastpath, if we discover that our disk cache fs has no ->readpage function, we'll 'goto out', but we never close our cacheFp. To make sure we close it, add a filp_close() call to the 'goto out' cleanup code. Change-Id: I371c1d7ec51b03447fbcbe58fb89be7be0235022 Reviewed-on: https://gerrit.openafs.org/14252 Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com>	2020-07-03 18:16:46 -04:00
Cheyenne Wills	af73b9a3b1	LINUX: Don't panic on some file open errors Commit 'LINUX: Return NULL for afs_linux_raw_open error' (`f6af4a155`) updated afs_linux_raw_open to return NULL on some errors, but still panics if obtaining the dentry fails. Commit 'afs: Verify osi_UFSOpen worked' (`c6b61a451`) updated callers of osi_UFSOpen to verify whether or not the open was successful. This meant osi_UFSOpen (and routines it calls) could pass back an error indication rather than panic when an error is encountered. Update afs_linux_raw_open to return a failure instead of panic if unable to obtain a dentry. Update osi_UFSOpen to return a NULL instead of panic if unable to obtain memory or fails to open the file. All callers of osi_UFSOpen handle a fail return, though some will still issue a panic. Update afs_linux_readpage_fastpath and afs_linux_readpages to not panic if afs_linux_raw_open fails. Instead of panic, return an error. For testing, an error can be forced by removing a file from the cache directory. Note this work is based on a commit by pruiter@sinenomine.net Change-Id: Ic47e4868b4f81d99fbe3b2e4958778508ae4851f Reviewed-on: https://gerrit.openafs.org/14242 Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com>	2020-07-03 18:16:36 -04:00
Cheyenne Wills	d2d27f975d	afs: Avoid panics on failed return from afs_CFileOpen afs_CFileOpen is a macro that invokes the open "method" of the afs_cacheOps structure, and for disk caches the osi_UFSOpen function is used. Currently osi_UFSOpen will panic if there is an error encountered while opening a file. Prepare to handle osi_UFSOpen function returning a NULL instead of issuing a panic (future commit). Update callers of afs_CFileOpen to test for an error and to return an error instead of issuing a panic. While this commit eliminates some panics, it does not address some of the more complex cases associated with errors from afs_CFileOpen. Change-Id: I2bdd525633dd44ebf8e26fcfd7059dfdfffb6142 Reviewed-on: https://gerrit.openafs.org/14241 Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-07-03 11:50:42 -04:00
Cheyenne Wills	7d85ce221d	LINUX 5.8: use lru_cache_add With Linux-5.8-rc1 commit 'mm: fold and remove lru_cache_add_anon() and lru_cache_add_file()' (6058eaec), the lru_cache_add_file function is removed since it was functionally equivalent to lru_cache_add. Replace lru_cache_add_file with lru_cache_add. Introduce a new autoconf test to determine if lru_cache_add is present. For reference, the Linux changes associated with the lru caches: __pagevec_lru_add introduced before v2.6.12-rc2 lru_cache_add_file introduced in v2.6.28-rc1 __pagevec_lru_add_file replaces __pagevec_lru_add in v2.6.28-rc1 vmscan: split LRU lists into anon & file sets (4f98a2fee) __pagevec_lru_add removed in v5.7 with a note to use lru_cache_add_file mm/swap.c: not necessary to export __pagevec_lru_add() (bde07cfc6) lru_cache_add_file removed in v5.8 mm: fold and remove lru_cache_add_anon() and lru_cache_add_file() (6058eaec) lru_cache_add exported mm: fold and remove lru_cache_add_anon() and lru_cache_add_file() (6058eaec) OpenAFS will use: lru_cache_add on 5.8 kernels lru_cache_add_file from 2.6.28 through 5.7 kernels __pagevec_lru_add/__pagevec_lru_add_file on pre 2.6.28 kernels Change-Id: I79ebe4a81425bf8a8a327ddf2d3474aff9df039d Reviewed-on: https://gerrit.openafs.org/14249 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Yadavendra Yadav <yadayada@in.ibm.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-07-03 00:21:49 -04:00
Benjamin Kaduk	ae9ea8da69	Recode a couple files from ISO 8859-1 to UTF-8 Reported by Debian's lintian(1). The CellServDB, as an externally maintained file, is left unchanged. Change-Id: I3bf241b924cb8cd7799a4c3e799f6acd375b2e8a Reviewed-on: https://gerrit.openafs.org/14265 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-07-02 23:34:24 -04:00
Andrew Deason	ba8b92401b	afs: Bound afs_DoBulkStat dir scan Currently, afs_DoBulkStat will scan the entire directory blob, looking for entries to stat. If all or almost all entries are already stat'd, we'll scan through the entire directory, doing nontrivial work on each entry (we grab afs_xvcache, at least). All of this work is pretty pointless, since the entries are already cached and so we won't do anything. If many processes are trying to acquire afs_xvcache, this can contribute to performance issues. To avoid this, provide a constant bound on the number of entries we'll search through: nentries * 4. The current arbitrary limits cap nentries at 30, so this means we're capping the afs_DoBulkStat search to 120 entries. Change-Id: I66e9af5b27844ddf6cf37c8286fcc65f8e0d3f96 Reviewed-on: https://gerrit.openafs.org/13253 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-07-02 21:56:30 -04:00
Andrew Deason	6c808e05ad	afs: Avoid needless W-locks for afs_FindVCache The callers of afs_FindVCache must hold at least a read lock on afs_xvcache; some hold a shared or write lock (and set IS_SLOCK or IS_WLOCK in the given flags). Two callers (afs_EvalFakeStat_int and afs_DoBulkStat) currently hold a write lock, but neither of them need to. In the optimal case, where afs_FindVCache finds the given vcache, this means that we unnecessarily hold a write lock on afs_xvcache. This can impact performance, since afs_xvcache can be a very frequently accessed lock (a simple operation like afs_PutVCache briefly holds a read lock, for example). To avoid this, have afs_DoBulkStat hold a shared lock on afs_xvcache, upgrading to a write lock when needed. afs_EvalFakeStat_int doesn't ever need a write lock at all, so just convert it to a read lock. Change-Id: I5bd58b9e3a577c9e1ebf1bc3719e65a6c0af5cb8 Reviewed-on: https://gerrit.openafs.org/12656 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-07-02 21:46:45 -04:00
Kailas Zadbuke	e44d6441c8	util: Handle serverLogMutex lock across forks If a process forks when another thread has serverLogMutex locked, the child process inherits the locked serverLogMutex. This causes a deadlock when code in the child process tries to lock serverLogMutex, since we can never unlock serverLogMutex because the locking thread no longer exists. This can happen in the salvageserver, since the salvageserver locks serverLogMutex in different threads, and forks to handle salvage jobs. To avoid this deadlock, we register handlers using pthread_atfork() so that the serverLogMutex will be held during the fork. The fork will be blocked until the worker thread releases the serverLogMutex. Hence the serverLogMutex will be held until the fork is complete and it will be released in the parent and child threads. Thanks to Yadavendra Yadav(yadayada@in.ibm.com) for working with me on this issue. Change-Id: I191c8272825c1667bb2150146e04b1dfe36a54e4 Reviewed-on: https://gerrit.openafs.org/14239 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com>	2020-06-30 00:49:21 -04:00
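The pthread_atfork() pattern described in the util commit above can be sketched as follows. This is a minimal illustration, not the OpenAFS implementation: `demo_log_mutex` and the `demo_*` functions are hypothetical names standing in for serverLogMutex and its handlers.

```c
#include <pthread.h>

/* Hypothetical stand-in for serverLogMutex. */
static pthread_mutex_t demo_log_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Runs in the forking thread before fork(): waits for any holder, then locks. */
static void demo_atfork_prepare(void)
{
    pthread_mutex_lock(&demo_log_mutex);
}

/* Runs in the parent after fork(): release the lock taken in prepare. */
static void demo_atfork_parent(void)
{
    pthread_mutex_unlock(&demo_log_mutex);
}

/* Runs in the child after fork(): the child's only thread owns the lock. */
static void demo_atfork_child(void)
{
    pthread_mutex_unlock(&demo_log_mutex);
}

/* Register the three handlers once at startup; returns 0 on success. */
int demo_install_log_fork_handlers(void)
{
    return pthread_atfork(demo_atfork_prepare,
                          demo_atfork_parent,
                          demo_atfork_child);
}
```

Because the prepare handler blocks until the mutex is free, the child can never inherit a mutex locked by a thread that does not exist in the child, which is the deadlock the commit describes.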
Andrew Deason	19cd454f11	afs: Split out bulkstat conditions into a function Our current if() statement for determining whether we should run afs_DoBulkStat to prefetch dir entries is a bit large, and grows over time. Split this logic out into a separate function to make it easier to maintain, and add some comments to help explain each condition. This commit should have no visible effects; it's just code reorganization. Change-Id: I0086189308d2f5e4b321c63f24110d74cda6433c Reviewed-on: https://gerrit.openafs.org/13254 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-06-25 23:37:15 -04:00
Andrew Deason	a05d5b7503	afs: Change VerifyVCache2 calls to VerifyVCache afs_VerifyVCache is a macro that (on most platforms) effectively expands to: if ((avc->f.states & CStatd)) { return 0; } else { return afs_VerifyVCache2(...); } Some callers call afs_VerifyVCache2 directly, since they already check for CStatd for other reasons. A few callers currently call afs_VerifyVCache2, but without guaranteeing that CStatd is not set. Specifically, in afs_getattr and afs_linux_VerifyVCache, CStatd could be set while afs_CreateReq drops GLOCK. And in afs_linux_readdir, CStatd could be cleared at multiple different points before the VerifyVCache call. This can result in afs_VerifyVCache2 acquiring a write-lock on the vcache, even when CStatd is already set, which is an unnecessary performance hit. To avoid this, change these call sites to use afs_VerifyVCache instead of calling afs_VerifyVCache2 directly, which skips the write lock when CStatd is already set. Change-Id: I7b75c9755af147b42a48160fa90c9849f2f03ddb Reviewed-on: https://gerrit.openafs.org/12655 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-06-22 22:37:44 -04:00
Mark Vitale	7c9fb44557	LINUX: replace BUG() call with osi_Panic() in osi_linux_free If osi_linux_free fails, it printf's an error message, then calls BUG(). This is the sole open-coded call to BUG() in OpenAFS; all other calls to BUG() are indirect via osi_Panic(). For consistency, eliminate this direct BUG() call by replacing the printf and BUG() with an equivalent osi_Panic(). This also ensures that the error message is logged as critical, and prefixed with "openafs:". Change-Id: Id319dffa859308528a66991bbbc522ca49552d51 Reviewed-on: https://gerrit.openafs.org/14250 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Benjamin Kaduk <kaduk@mit.edu>	2020-06-19 12:55:49 -04:00
Cheyenne Wills	d8ec294534	LINUX 5.8: do not set name field in backing_dev_info Linux-5.8-rc1 commit 'bdi: remove the name field in struct backing_dev_info' (1cd925d5838) Do not set the name field in the backing_dev_info structure if it is not available. Uses an existing config test 'STRUCT_BACKING_DEV_INFO_HAS_NAME' Note the name field in the backing_dev_info structure was added in Linux-2.6.32 Change-Id: I20b80e49e8a15a2949003101f24d9ce39f63b59b Reviewed-on: https://gerrit.openafs.org/14248 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-06-19 12:00:57 -04:00
Cheyenne Wills	c48072b980	LINUX 5.8: Replace kernel_setsockopt with new funcs Linux 5.8-rc1 commit 'net: remove kernel_setsockopt' (5a892ff2facb) retires the kernel_setsockopt function. In prior kernel commits new functions (ip_sock_set_) were added to replace the specific functions performed by kernel_setsockopt. Define new config test 'HAVE_IP_SOCK_SET' if the 'ip_sock_set' functions are available. The config define 'HAVE_KERNEL_SETSOCKOPT' is no longer set in Linux 5.8. Create wrapper functions that replace the kernel_setsockopt calls with calls to the appropriate Linux kernel function(s) (depending on what functions the kernel supports). Remove the unused 'kernel_getsockopt' function (used for building with pre 2.6.19 kernels). For reference Linux 2.6.19 introduced kernel_setsockopt Linux 5.8 removed kernel_setsockopt and replaced the functionality with a set of new functions (ip_sock_set_) Change-Id: I517b674303c5decc19313d9de51d04ddef36b421 Reviewed-on: https://gerrit.openafs.org/14247 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-06-19 12:00:28 -04:00
Andrew Deason	cbc5c4b51f	tests: Modernize writekeyfile.c tests/auth/writekeyfile.c contains some code used to generate tests/auth/KeyFile, which is used to test code interpreting the old-style KeyFile format. This code currently has a few problems: - We don't check the results of afstest_mkdtemp, which could allow symlink attacks from other users on the system. - We duplicate some logic from afstest_BuildTestConfig, in order to build a temporary config dir. - writekeyfile isn't built or run by default (it only exists to generate KeyFile, so it's almost never run), so eventual bitrot is quite likely, and the existing code already generates warnings. To avoid this, change writekeyfile.c to use the existing afstest_BuildTestConfig to generate a local config dir. To ensure we avoid bitrot, build writekeyfile by default, and create a test to run it, to make sure it can generate a KeyFile as expected. Note that the KeyFile.short we test against is different than the KeyFile currently in the tree. The existing KeyFile was generated from an older OpenAFS release, which always generated 100-byte KeyFiles, even if we only have a few keys. The current codebase only writes out as much key data as needed, so the generated KeyFiles are shorter (but still understandable by older OpenAFS releases). Keep the old 100-byte KeyFile around, since that's what older OpenAFS would generate, and create a new KeyFile.short to test against, to make sure our code for generating KeyFiles doesn't change any further. Change-Id: Ibe9246c6dd808ed2b2225dd7be2b27bbdee072fd Reviewed-on: https://gerrit.openafs.org/14246 Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com>	2020-06-19 11:48:57 -04:00
Cheyenne Wills	22a66e7b7e	tests: Use usleep instead of nanosleep Commit "Build tests by default" `68f406436c` changes the build so tests are always built. On Solaris 10 the build fails because nanosleep is in librt, which we do not link against. Replace nanosleep with usleep. This avoids introducing extra configure tests just for Solaris 10. Note that with Solaris 11 nanosleep was moved from librt to libc, the standard C library. Change-Id: I6639f32bb8c8ace438e0092a866f06561dad54f1 Reviewed-on: https://gerrit.openafs.org/14244 Reviewed-by: Andrew Deason <adeason@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-06-19 10:54:13 -04:00
Cheyenne Wills	5f4a681eeb	tests: Emulate mkdtemp when not available Commit "Build tests by default" `68f406436c` changes the build so tests are always built. On Solaris 10 Update 10 and earlier the build fails because the mkdtemp function is not available. Introduce a wrapper 'afstest_mkdtemp' that uses mkdtemp if available, otherwise uses mktemp/mkdir. Change-Id: I0118f838ed9a89927e2ddac4cad822574601558a Reviewed-on: https://gerrit.openafs.org/14243 Reviewed-by: Andrew Deason <adeason@sinenomine.net> Tested-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-06-19 10:54:07 -04:00
Michael Meffie	188ca8bf52	make-release: Run git describe once Run git describe once at the beginning of make-release to find the version information used to derive the tarball file names and saved in the .version file. This is a cleanup and refactoring change to prepare for a future commit. Change-Id: I0debeeffa5d2c63ab1498588766cb36424d15cd5 Reviewed-on: https://gerrit.openafs.org/14150 Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-06-18 21:15:15 -04:00
Michael Meffie	d0753c0ace	make-release: Create output directory if needed Automatically create the --dir directory if it does not already exist, which makes this script slightly easier to use. Remove the now unneeded mkdir from the top-level makefile. Change-Id: I1f4561120a70263b0b2b194e65fec55fb5666f40 Reviewed-on: https://gerrit.openafs.org/14115 Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-06-18 20:57:54 -04:00

13352 Commits