Switch to using the m4 macros from autoconf-archive in our
src/external mechanism, instead of manually-copied versions in src/cf.
The src/external copy of ax_gcc_func_attribute.m4 is identical to the
existing copy in src/cf, so that should incur no changes. There are
also a few new macros pulled in, but they are currently unused.
Increase our AC_PREREQ in configure.ac to 2.64, to match the AC_PREREQ
in some of the new files.
Change-Id: I8acfe4df7b9a22d9b9e69004c3438034a2dacadb
Reviewed-on: https://gerrit.openafs.org/14135
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Add autoconf-archive to the src/external mechanism, so we can more
easily import and update the AX_* m4 macros we pull in from
autoconf-archive. Commits are imported from
<git://git.savannah.gnu.org/autoconf-archive.git>.
We already have a copy of ax_gcc_func_attribute.m4 in the tree, so
include that in the list of files. While we're here, also include a
few more macros for checking compiler flags, which will be used in
subsequent commits.
Change-Id: I8c6288fc1d48a47837ca08f8b9207e0ada921af8
Reviewed-on: https://gerrit.openafs.org/14133
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Add change descriptions for commits not in a stable release.
Change-Id: Ib1d5ce9f558279660abb2473ce8a9fac4fcefa8d
Reviewed-on: https://gerrit.openafs.org/13673
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: Benjamin Kaduk <kaduk@mit.edu>
Pull in all the updates to NEWS that occurred on the 1.8.x branch
in preparation for adding entries for 1.9.0.
Change-Id: I713d1576ef96793f24824f909b26da802b21ec23
Reviewed-on: https://gerrit.openafs.org/14103
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Ever since commits 170dbb3c (rx: Use opr queues) and d9fc4890 (rx: Fix
test for end of call queue for LWP), rx_GetCall checks if the current
call is the last one on rx_incomingCallQueue by doing this:
opr_queue_IsEnd(&rx_incomingCallQueue, cursor)
But opr_queue_IsEnd checks if the given pointer is the _end_ of the
last; that is, if it's the end-of-list sentinel, not an item on the
actual list. Testing for the last item in a list is what
opr_queue_IsLast is for. This is the same convention that the old Rx
queues used, but 170dbb3c just accidentally replaced queue_IsLast with
opr_queue_IsEnd (instead of opr_queue_IsLast), and d9fc4890 copied the
mistake.
So because this is inside an opr_queue_Scan loop, opr_queue_IsEnd will
never be true, so we'll never enter this block of code (unless we are
the "fcfs" thread). This means that an incoming Rx call can get stuck
in the incoming call queue, if all of the following are true:
- The incoming call consists of more than 1 packet of incoming data.
- The incoming call "waits" when it comes in (that is, there are no
free threads or the service is over quota).
- The "fcfs" thread doesn't scan the incoming call queue (because it
is idle when the call comes in, but the relevant service is over
quota).
To fix this, just use opr_queue_IsLast here instead of
opr_queue_IsEnd.
Change-Id: I04b90b1279f81dc518eb61e7bd450e3c0be37a77
Reviewed-on: https://gerrit.openafs.org/14158
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Currently, the rx/event-t tests schedule a bunch of events up to 3
seconds in the future, and then we sleep for 3 seconds to give them a
chance to run. Since we're cutting it so close, this can rarely result
in a few events not being run (observed occasionally on FreeBSD 12.1,
where we failed to run about 3 events out of 10000).
To avoid this, just sleep for 4 seconds instead of 3. Also print out a
little more info regarding the number of fired/cancelled events, so we
can see the event count when it's wrong.
Change-Id: I6269bea2c245aeed00c129ff638423d0fa81ad23
Reviewed-on: https://gerrit.openafs.org/14160
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
The format string for CM_TRACE_GMAP takes 4 substitutions, but
afs_linux_mmap only supplies 3. This results in malformed output from
fstrace:
Type mismatch, using raw print.
Gn_map vp 0x%lx addr 0x%lx len 0x%x off 0x%x (afs / zcm)raw op
701087775, time 715.322573, pid 9644
p0:0xc0a66ec0 p1:0x8b81a000 p2:131072
Repair the recording of CM_TRACE_GMAP.
Change-Id: I2b7592e68cb42f5ae490ee8771558e5cc5a2181e
Reviewed-on: https://gerrit.openafs.org/14168
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Currently, 'softsig-helper -buserror' causes a SIGBUS on most
platforms, but can result in SIGSEGV on FreeBSD by default (at least
on 11.3-RELEASE). Skip the test on FreeBSD, until we can provide a
more reliable way to generate SIGBUS.
Note that when the sysctl machdep.prot_fault_translation is set to 1,
'softsig-helper -buserror' generates a SIGBUS instead of SIGSEGV,
suggesting that generating a SIGBUS here is the old 'compat' behavior.
When machdep.prot_fault_translation is 0 (the default), the code path
in the FreeBSD kernel that dictates whether to send a SIGBUS or
SIGSEGV in this situation depends on some autodetection heuristics,
and so may produce different results depending on FreeBSD releases or
even compiler settings (due to detection of ABI based on some ELF
notes in the relevant binary).
For some details on this sysctl, see
<https://www.freebsd.org/news/status/report-2019-07-2019-09.html#Signals-delivered-on-unhandled-Page-Faults>
or the FreeBSD source code. In 11.3-RELEASE, the decision to issue a
SIGBUS or SIGSEGV can be found around sys/amd64/amd64/trap.c:355.
Change-Id: Ib75b43cc12302532ee87a3744fc364424f2a3ca6
Reviewed-on: https://gerrit.openafs.org/14145
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Currently we call vinvalbuf(9) in a few places while holding
AFS_GLOCK, but AFS_GLOCK is a non-sleepable lock (struct mtx), and
vinvalbuf can sleep. This can trigger a panic in some rare conditions,
with the message:
Sleeping thread (tid 100179, pid 95481) owns a non-sleepable lock
To avoid this, drop AFS_GLOCK around a few places that call
vinvalbuf().
Change-Id: I58acb144b6ffa007675402e7639b63ff3745dec5
Reviewed-on: https://gerrit.openafs.org/13970
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
In FBSD/osi_vnops.c, we have a few abstractions (e.g. MA_VOP_UNLOCK)
that used to expand to different things for older FreeBSD versions.
Currently, they always expand to the same thing, so just remove the
abstractions.
While we are changing these calls, also change one instance of
MA_VOP_LOCK to vn_lock (instead of VOP_LOCK), since we're not usually
supposed to call VOP_LOCK directly, according to the VOP_LOCK(9)
manpage. The MA_VOP_LOCK call was added in commit bd707fb7
(freebsd-almost-working-client-20020216), seemingly by mistake.
Change-Id: Ia0f28fe658057e87d9103a72296ab899dc762fb6
Reviewed-on: https://gerrit.openafs.org/13843
Reviewed-by: Tim Creech <tcreech@tcreech.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Currently, if we are building with -j2 or higher, we can easily fail
to build some libafs objects because vnode_if.h does not exist yet.
vnode_if.h is generated by the FreeBSD build, but none of our objects
depend on it, so during parallel builds it may not be available by the
time we build, for example, src/external/heimdal/hcrypto/sha256.c.
This results in build errors that can look like this:
--- sha256-kernel.o ---
cc -I. -I.. -I../nfs [...]/src/external/heimdal/hcrypto/sha256.c
In file included from [...]/src/external/heimdal/hcrypto/sha256.c:34:
In file included from [...]/src/crypto/hcrypto/kernel/config.h:30:
In file included from [...]/src/afs/sysincludes.h:354:
/usr/src/sys/sys/vnode.h:588:10: fatal error: 'vnode_if.h' file not found
#include "vnode_if.h"
^~~~~~~~~~~~
1 error generated.
*** [sha256-kernel.o] Error code 1
make[4]: stopped in [...]/src/libafs/MODLOAD
1 error
To avoid this, make all of our libafs objects depends on vnode_if.h.
[adeason@dson.org: Expanded commit message.]
Change-Id: I5a7a6ece8d5fbe6cf1a5b94451c8e8ae93fdc55f
Reviewed-on: https://gerrit.openafs.org/13983
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
The 'perl' binary may not be /usr/bin/perl, depending on the system.
For example, on modern FreeBSD it tends to be /usr/local/bin/perl
instead.
To avoid relying on perl to be in a specific location, just run via
/usr/bin/env instead, so we pick up perl from $PATH instead.
Change-Id: Ic8dc247c82342ff79dfa80426c489ccb8e3e1450
Reviewed-on: https://gerrit.openafs.org/14144
Tested-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Currently, our afs_vop_lookup on FBSD tries to only lock 'dvp' for
ISDOTDOT requests when LOCKPARENT and ISLASTCN are set. There are a
couple of problems with this:
- The conditional locking logic involving LOCKPARENT/ISLASTCN is only
relevant in very old FreeBSD releases (per-fs checking of these
flags for parent locking went away around the FreeBSD 6 era).
- Our current logic here is wrong anyway, since we try to lock 'dvp'
twice when those flags are set. This was mostly introduced by commit
2f6be821 (FBSD: band-aid vnode locking in lookup), which added a
lock/unlock pair for 'dvp' around the lock for 'vp', even though
'dvp' was unlocked several lines earlier.
This means that if we hit the relevant code path, we will deadlock,
since we try to lock 'dvp' twice. To avoid this, just remove the
relevant logic for LOCKPARENT/ISLASTCN, since it is only relevant for
old FreeBSD releases that are not supported by us or FreeBSD.
Add and rearrange some comments around here to try to more explicitly
explain the relevant locking rules.
[adeason@dson.org: Commit message rewrite, adding comments, removing
old FreeBSD code.]
Change-Id: Iaa2c55d82c50d5a8ab42c67b0996a2b4fb6e09e6
Reviewed-on: https://gerrit.openafs.org/12578
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
In afs_vop_lookup, the 'wantparent' variable doesn't actually change
any logic in the function. In the if() clause that it's used, the
value of 'wantparent' is only ever used if cnp->cn_nameiop is RENAME
and ISLASTCN is set. But if both of those are true, then the second
half of the if() conditional will always be true, so the value of
'wantparent' doesn't matter.
So to remove this confusing unused logic, remove the 'wantparent'
local var, and all its associated logic.
Issue spotted by kaduk@mit.edu.
Change-Id: Ia63b88d67d21cc2b81a0c25aa31ea60ab202b0a7
Reviewed-on: https://gerrit.openafs.org/14143
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Commit b61eac78 (Linux: setpag() may replace credentials) changed
PSetTokens2 to call crref() after _settok_setParentPag(), since
changing the parent PAG may change our credentials structure. But that
commit did not update the old pioctl PSetTokens, so -setpag
functionality remained broken on Linux for utilities that called the
old pioctl ('klog' is one such utility).
To fix this, we could copy the same code from PSetTokens2 into
PSetTokens. But instead just move this code into _settok_setParentPag
itself, to avoid code duplication. This commit also refactors
_settok_setParentPag a little to make the platform-specific ifdefs a
little easier to read through.
Change-Id: I65a165ebb1d823e690926de31b28a7728d2561b9
Reviewed-on: https://gerrit.openafs.org/14147
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Yadavendra Yadav <yadayada@in.ibm.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Commit 48589b5d (Linux: Restore aklog -setpag functionality for kernel
2.6.32+) added code to SetToken() to copy our session keyring to the
parent process, in order to implement -setpag functionality. But this
was removed from SetToken() in commit 1a6d4c16 (Linux: fix aklog
-setpag to work with ktc_SetTokenEx), when the same code was moved to
ktc_SetTokenEx().
Add this code back to SetTokens(), so -setpag functionality can work
again with utilities that use older functions like ktc_SetToken, like
'klog'.
Change-Id: I68c9bf2e19783ea6f84b4c5ebf2ef188d1d8d6ad
Reviewed-on: https://gerrit.openafs.org/14146
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
`make` is not necessarily installed, even if when all the other build
requirements are installed.
Add `make` to the list build requirements to complete the build
requirements. With this change it is possible to build the packages
after running the `yum-builddep` to install all of the needed build
requirements.
Change-Id: I032ba1f23d08468c5e21edc5662b20cc9498d1c9
Reviewed-on: https://gerrit.openafs.org/14119
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
For our old-style "O" RPCs (e.g. VL_CreateEntry, instead of
VL_CreateEntryN), vlserver calls vldbentry_to_vlentry to convert to
the internal 'struct nvlentry' format. After all of the sites have
been copied to the internal format, we fill the remaining sites by
setting the serverNumber to BADSERVERID. For nvldbentry_to_vlentry, we
do this for NMAXNSERVERS sites, but for vldbentry_to_vlentry, we do
this for OMAXNSERVERS.
The thing is, both functions are filling in entries for a 'struct
nvlentry', which has NMAXNSERVERS 'serverNumber' entries. So for
vldbentry_to_vlentry, we are skipping setting the last few sites
(specifically, NMAXNSERVERS-OMAXNSERVERS = 13-8 = 5).
This can easily cause our O-style RPCs to write out entries to disk
that have uninitialized sites at the end of the array. For example, an
entry with one site should have server numbers that look like this:
serverNumber = {1, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255}
That is, one real serverid (a '1' here), followed by twelve
BADSERVERIDs.
But for a VL_CreateEntry call, the 'struct nvlentry' is zeroed out
before vldbentry_to_vlentry is called, and so the server numbers in
the written entry look like this:
serverNumber = {1, 255, 255, 255, 255, 255, 255, 255, 0, 0, 0, 0, 0}
That is, one real serverid (a '1' here), followed by seven
BADSERVERIDs, followed by five '0's.
Most of the time, this is not noticeable, since our code that reads in
entries from disk stops processing sites when we encounter the first
BADSERVERID site (see vlentry_to_nvldbentry). However, if the entry
has 8 sites, then none of the entries will contain BADSERVERID, and so
we will actually process the trailing 5 bogus sites. This would appear
as 5 extra volume sites for a volume, most likely all for the same
server.
For VL_CreateEntry, the vlentry struct is always zeroed before we use
it, so the trailing sites will always be filled with 0. For
VL_ReplaceEntry, the trailing sites will be unchanged from whatever
was read in from the existing disk entry.
To fix this, just change the relevant loop to go through NMAXNSERVERS
entries, so we actually go to the end of the serverNumber (et al)
array.
This may appear similar to commit ddf7d2a7 (vlserver: initialize
nvlentry elements after read). However, that commit fixed a case
involving the old vldb database format (which hopefully is not being
used). This commit fixes a case where we are using the new vldb
database format, but with the old RPCs, which may still be used by old
tools.
Change-Id: Ic6882d1452963ca93403748917c313068acfdaab
Reviewed-on: https://gerrit.openafs.org/14139
Tested-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Fix warnings issued by recent versions of rpmbuild:
warning: Macro expanded in comment on line 110: %{afsvers}/...
warning: extra tokens at the end of %endif directive in line 1469:
%endif # build_userspace
warning: line 331: It's not recommended to have unversioned Obsoletes:
Obsoletes: openafs-client-compat
The first two warnings are just issues with comments, which apparently
are not completely ignored by rpmbuild. The third issue is a warning
about an unversioned "Obsoletes" directive. Remove the old Obsoletes for
openafs-client-compat, which was obsoleted no later than the 1.4.x
series (more than 10 years ago).
While here clean up the spec by removing the old cvs $Revsion$ keyword
from the comments at the top of the file, and removing an old commented
out setup directive.
Change-Id: I8d7a050ea6a0cc7a2d9a6af9a91d25ce545586e7
Reviewed-on: https://gerrit.openafs.org/14118
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Currently, opr_cache_init requires that opts->n_buckets is a power of
2 (since our underlying opr_dict requires this). However, callers may
want to pick a number of buckets based on some other value. Requiring
each caller to calculate the nearest power-of-2 is annoying, so
instead just have opr_cache_init itself calculate a nearby power of 2.
That is, with this commit, opts->n_buckets is allowed to not be a
power of 2; when it's not a power of 2, opr_cache_init will calculate
the next highest power of 2 and use that as the number of buckets.
Change-Id: Icd3c56c1fe0733e3dac964ea9a98ff7b436254e6
Reviewed-on: https://gerrit.openafs.org/14122
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Marcio Brito Barbosa <mbarbosa@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Our libafs build logic involves a few targets that 'cd' into a
per-kernel subdir: notably INSTDIRS and DESTDIRS (the targets to 'make
install' or 'make dest' our kernel modules) and COMPDIRS (the target
to setup/build the kernel module).
Both of these potentially 'cd' into a subdirectory (e.g. MODLOAD64),
and run some make rules. Since INSTDIRS and COMPDIRS are different
targets and don't depend on each other for many platforms, running
those rules can happen in parallel. After they 'cd' into the relevant
dir, they run a new 'make' in a subshell, and so underlying rules for
building e.g. AFS_component_version_number.c are not serialized.
So for a parallel build on, say, Solaris, we can encounter errors when
two sub-makes try to make AFS_component_version_number.c at the same
time, which looks something like this (with various lines output from
other sub-processes mixed in):
cd src && cd sys && gmake install
gmake[3]: Leaving directory '/[...]/src/libuafs'
rm -f AFS_component_version_number.c.NEW
/opt/developerstudio12.6/bin/cc [...] -D_KERNEL -DSYSV -dn -m64 -xmodel=kernel -xvector=%none -xregs=no%float -Wu,-save_args -o AFS_component_version_number.o -c AFS_component_version_number.c
mv: cannot access AFS_component_version_number.c.NEW
gmake[4]: *** [/[...]/src/config/Makefile.version:13: AFS_component_version_number.c] Error 2
gmake[4]: Leaving directory '/[...]/src/libafs/MODLOAD64'
gmake[3]: *** [Makefile:85: solaris_instdirs] Error 2
gmake[3]: *** Waiting for unfinished jobs....
To avoid this, just make INSTDIRS and DESTDIRS depend on COMPDIRS, so
we can make sure they don't run at the same time.
Change-Id: I2510e1894c44dd0864cf2eab5613b805342b6718
Reviewed-on: https://gerrit.openafs.org/14137
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
The local variable tapeblocks in GetConfigParams matches a global
variable. Rename the local variable to avoid confusion with the global
name.
Change-Id: I1c30433696a35a74978ef0c23881c82054b416c5
Reviewed-on: https://gerrit.openafs.org/14128
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
This change removes the unused LINUX_PKGREL definition from the
configure.ac file.
Commit 6a27e228ba converted the setting of
the RPM package version and release values in the openafs.spec file from
autoconf to the makesrpm.pl script. That commit left LINUX_PKGREL in
configure.ac because it was still referenced by the Debian packaging,
which was still in-tree at that time.
Commit ada9dba075 removed the last trace
of the Debian packaging, but missed the removal of the LINUX_PKGREL.
Change-Id: I17aeccdb38078faa413f2cd3a935b43238982606
Reviewed-on: https://gerrit.openafs.org/14117
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: Benjamin Kaduk <kaduk@mit.edu>
Currently, 'vos remsite' always prints the message "Deleting the
replication site for volume %lu ...", and then calls VDONE if the
operation is successful. VDONE prints the trailing "done", but only if
-verbose is turned on, and so if -verbose is not specified, the output
of 'vos remsite' looks broken:
$ vos remsite fs1 vicepa vol.foo
Deleting the replication site for volume 1234 ...Removed replication site fs1 /vicepa for volume vol.foo
To fix this, unconditionally print the trailing "done", instead of
going through VDONE, so 'vos remsite' output now looks like this:
$ vos remsite fs1 vicepa vol.foo
Deleting the replication site for volume 1234 ... done
Removed replication site fs1 /vicepa for volume vol.foo
Change-Id: I0b42f4cb9b695331bf047243bf6ae4a1cdbb89c4
Reviewed-on: https://gerrit.openafs.org/14127
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
GCC 10 changed a default flag from -fcommon to -fno-common. See
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85678 for some background.
The change in gcc 10 results in build link-time errors. For example:
../../src/xstat/.libs/liboafs_xstat_cm.a(xstat_cm.o):(.bss+0x2050):
multiple definition of `numCollections';
Ensure that only one definition for global data objects exist and change
references to use "extern" as needed.
To ensure that future changes do not introduce duplicated global
definitions, add the -fno-common flag to XCFLAGS when using the
configure --enable-checking setting.
Change-Id: I6780dd995fe6fb6c2102765ff3484c18e1e1cd58
Reviewed-on: https://gerrit.openafs.org/14106
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Currently, the code in 'vos status' treats the 'iflags' and 'vflags'
of a transaction like an enumerated type; that is, we only check if
'iflags' is equal to ITOffline or ITBusy, etc. But both of these flags
fields are bitfields; any combination of the relevant flags could
theoretically be set.
Practically speaking, we only ever set at most one of the flags in
'iflags', but if anything ever did set more than one flag, our output
would look broken (we'd print "attachFlags:" without any flags).
For 'vflags', multiple flags are often set at once: the most common
combination is VTDeleteOnSalvage|VTOutOfService. So currently, we
usually print "attachFlags:" without any actual flags, since the
'vflags' field isn't exactly equal to VTDeleteOnSalvage (instead it's
set to VTDeleteOnSalvage|VTOutOfService). And if we ever did see just
VTDeleteOnSalvage set by itself, the way the switch() cases fall
through to each other, we'd print out that _all_ flags are set.
To fix all of this, just test for the individual flag bits instead.
Change-Id: Ib4d207bc713f0ef8eb51b9dbeaf2af50395536ee
Reviewed-on: https://gerrit.openafs.org/14126
Tested-by: Andrew Deason <adeason@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Move our preprocessor logic around d_path into an osi_compat.h
wrapper, called afs_d_path. This just makes it a little easier to use
d_path, and moves a tiny bit of #ifdef cruft away from real code.
Change-Id: I2032eda3fef18be6e77e3bf362ec5ce641e1d76d
Reviewed-on: https://gerrit.openafs.org/13721
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Currently, afs_syscall_pioctl handles the VIOCPREFETCH pioctl as a
special case, calling into a different code path to handle
backgrounding the prefetch operation. However, we detect that we're
handling a VIOCPREFETCH operation just by looking at the lower 8 bits
of the given opcode. This means that any pioctl that ends in 0x0F will
trigger this codepath, such as if we add a 'C' or 'O' pioctl that uses
code 0x0F.
We only want to catch VIOCPREFETCH requests for this code path, so fix
the check to also check if we're processing a 'V' pioctl.
Change-Id: Ica8c2364f96aa3c8b4d2213bebd9a1e4cb6fa730
Reviewed-on: https://gerrit.openafs.org/13301
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
The auth/superuser-t test runs an Rx server and client in two child
processes. If the client process tries to contact the server before
the server has started listening on its port, some tests involving
RPCs can fail (notably test 39, "Can run a simple RPC").
Normally if we try to contact a server that's not there, Rx will try
resending its packets a few times, but on Linux with AFS_RXERRQ_ENV,
if the port isn't open at all, we can get an ICMP_PORT_UNREACH error,
which causes the relevant Rx call to die immediately with
RX_CALL_DEAD.
This means that if the auth/superuser-t client is only just a bit
faster than the server starting up, tests can fail, since the server's
port is not open yet.
To avoid this, we can wait until the server's port is open before
starting the client process. To do this, have the server process send
a SIGUSR1 to the parent after rx_Init() is called, and have the parent
process wait for the SIGUSR1 (waiting for a max of 5 seconds before
failing). This should guarantee that the server's port will be open by
the time the client starts running.
Note that before commit 086d1858 (LINUX: Include linux/time.h for
linux/errqueue.h), AFS_RXERRQ_ENV was mistakenly disabled on Linux
3.17+, so this issue was probably not possible on recent Linux before
that commit.
Change-Id: I0032a640b83c24f72c03e7bea100df5bc3d9ed4c
Reviewed-on: https://gerrit.openafs.org/14109
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Currently, when we release a lock, we set the e.g. pid_writer field to
0, to clear out any previous pid that was set. On Linux, the
pid_writer field is a pointer, and sparse(1) complains about using a
plain integer 0 in this way:
CHECK [...]/afs_axscache.c
[...]/afs_axscache.c:24:19: warning: Using plain integer as NULL pointer
[...]/afs_axscache.c:68:9: warning: Using plain integer as NULL pointer
[...]/afs_axscache.c:88:5: warning: Using plain integer as NULL pointer
[...]/afs_axscache.c:111:13: warning: Using plain integer as NULL pointer
[...]/afs_axscache.c:121:17: warning: Using plain integer as NULL pointer
[...]/afs_axscache.c:126:17: warning: Using plain integer as NULL pointer
[...]/afs_axscache.c:154:13: warning: Using plain integer as NULL pointer
[...]/afs_axscache.c:165:9: warning: Using plain integer as NULL pointer
This doesn't break anything, but it spews out quite a lot of warnings
when building with sparse(1) available. To just reduce this noise a
bit, assign these fields to actual NULL.
Since some other platforms do use a plain integer in these fields
(they are an actual pid), define 'MyPid_NULL' to use '0' or 'NULL'
depending on the platform. Define MyPid_NULL to NULL only on Linux;
this causes us to still assign 0 to a pointer on some platforms, but
Linux is the only one that complains, so only bother using NULL on
Linux for now.
Change-Id: I35fcb896ceaa346c330622cfc2913b2975295836
Reviewed-on: https://gerrit.openafs.org/14108
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Commit 13ae3de3 (Add "brief" option to rxgen) added the -b option to
rxgen, which (among other things) makes rxgen stop including the name
of an RPC-L union type within its fields. That is, instead of this:
struct foo_type {
afs_int32 foo_tag;
union {
/* ... */
} foo_type_u;
};
rxgen -b generates this:
struct foo_type {
afs_int32 foo_tag;
union {
/* ... */
} u;
};
And all of the autogenerated XDR code is altered to use the 'u' field
instead of foo_type_u. However, if a 'default:' arm is defined in the
definition for the RPC-L union, the autogenerated XDR code still tries
to reference the non-brief name (e.g. foo_type_u). This causes a build
failure when actually trying to compile the generated .xdr.c, like so:
foo.xdr.c:809:39: error: 'foo_type' has no member named 'foo_type_u'
if (!xdr_bytes(xdrs, (char **)&objp->foo_type_u.xxx, &__len, FOO_MAX)) {
^
foo.xdr.c:812:11: error: 'foo_type' has no member named 'foo_type_u'
*(&objp->foo_type_u.xxx) = __len;
This happens because the portion of emit_union() that generates the
XDR code for the default arm wasn't updated to use a different
formatting string when 'brief_flag' is set, like the rest of
emit_union.
To fix this, just check for brief_flag and use 'briefformat'
accordingly, like the other code that checks for brief_flag.
Currently nothing in the tree uses the default arm of RPC-L unions
with 'rxgen -b', but external callers could, or our future code may do
so.
Change-Id: Ifcebfc48a3a64c68fee12ba0d177ae19b0956c58
Reviewed-on: https://gerrit.openafs.org/14107
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
The SVOTE_GetSyncSite RPC was intended to provide the IP address of the
current sync-site. Unfortunately, the RPC-L incorrectly defined ahost as
an input argument instead of an output argument. As a result, the IP
address in question is not returned to the callers of SVOTE_GetSyncSite.
Moreover, calls to this RPC must be made through connections associated
with the VOTE_SERVICE_ID. Sadly, the ubik_Call* functions call
SVOTE_GetSyncSite using connections associated with the USER_SERVICE_ID.
Consequently, the server getting this request returns RXGEN_OPCODE,
meaning that this RPC is not implemented by the service in question.
Since RPC arguments cannot be changed without causing compatibility
issues between different client / server versions and the RPC in
question is being called through the wrong service id, remove
SVOTE_GetSyncSite and its callers. Considering that in all versions of
OpenAFS calls to this RPC always return RXGEN_OPCODE, no behavior
change is introduced by this commit.
Also, remove the "chaseCount logic" from the ubik_Call* functions.
This logic prevents the loop counter from being moved backwards
indefinitely, resulting in an infinite loop. Fortunately, without the
VOTE_GetSyncSite() calls this counter cannot be moved backwards more
than once.
Change-Id: Idd071583e8f67109e003f7a5675de02a235e5809
Reviewed-on: https://gerrit.openafs.org/14043
Reviewed-by: Marcio Brito Barbosa <mbarbosa@sinenomine.net>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Commit 48fbb45 (opr: Introduce opr_cache) added a new test (cache-t),
but did not update the .gitignore file for it.
Change-Id: I6de6130257a62f495ac942c05937eb109ce84a75
Reviewed-on: https://gerrit.openafs.org/14102
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
opr/softsig-t can produce a core file as part of its test.
Change-Id: I3bc7e587151e5915038e31887018889a7ffa6993
Reviewed-on: https://gerrit.openafs.org/14101
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
The vos convertROtoRW command converts a RO volume into a RW volume.
Unfortunately, the RO volume in question is not set as "out of service"
during this process. As a result, accesses to the volume being converted
can leave volume objects in an inconsistent state.
Consider the following scenario:
1. Create a volume on host_b and add replicas on host_a and host_b.
$ vos create host_b a vol_1
$ vos addsite host_b a vol_1
$ vos addiste host_a a vol_1
2. Mount the volume:
$ fs mkmount /afs/.mycell/vol_1 vol_1
$ vos release vol_1
$ vos release root.cell
3. Shutdown dafs on host_b:
$ bos shutdown host_b dafs
4. Remove RO reference to host_b from the vldb:
$ vos remsite host_b a vol_1
5. Attach the RO copy by touching it:
$ fs flushall
$ ls /afs/mycell/vol_1
6. Convert RO copy to RW:
$ vos convertROtoRW host_a a vol_1
Notice that FSYNC_com_VolDone fails silently (FSYNC_BAD_STATE), leaving
the volume object for the RO copy set as VOL_STATE_ATTACHED (on success,
this volume should be set as VOL_STATE_DELETED).
7. Add replica on host_a:
$ vos addsite host_a a vol_1
8. Wait until the "inUse" flag of the RO entry is cleared (or force this
to happen by attaching multiple volumes).
9. Release the volume:
$ vos release vol_1
Failed to start transaction on volume 536870922
Volume not attached, does not exist, or not on line
Error in vos release command.
Volume not attached, does not exist, or not on line
To fix this problem, take the RO volume offline during the vos
convertROtoRW operation.
Change-Id: I1e417a026ed819fab4435e8992311fcd4f339341
Reviewed-on: https://gerrit.openafs.org/14066
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Commit 8632f23d67 introduced checks for
the return value of snprintf calls in namei_ops. On success, the value
returned by this function represents the number of written characters.
Unfortunately, the variable used to store this value is the same
variable that represents the status code returned by
namei_ConvertROtoRWvolume. Consequently, a successful execution of
namei_ConvertROtoRWvolume results in a status code different the 0 (and
equal to the number of written characters).
To fix this problem, set the status code in question back to 0 after a
successful execution of namei_ConvertROtoRWvolume.
Change-Id: Ic6fd6483f8d94fd64587f8bae249b9d911d846b4
Reviewed-on: https://gerrit.openafs.org/14065
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
In osi_probe.c, the macro 'check_result' casts a pointer to an int which
on older Linux kernels (e.g. 2.6.18) produces several lines with the C
warning:
... warning: cast from pointer to integer of different size
Change the cast from int to long int.
Linux 2.6.18 doesn't provide intptr_t or uintptr_t, and stdint.h is not
available to kernel modules. But the size of a pointer is the size of a
long (see uintptr_t in linux/types.h - Linux 2.6.24+), so
change the cast from int to long.
Note that the this code by default only gets pulled in for older Linux
kernels (e.g. 2.6.18). For newer kernels, ENABLE_LINUX_SYSCALL_PROBING
is not defined, and so most of osi_probe.c is not built.
Change-Id: If1b41e11c46f4a14ff5127ed4d602485645ddf2a
Reviewed-on: https://gerrit.openafs.org/14092
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Commit cd3221d3 (Linux: use override_creds when available) caused us
to force the current process's creds to the creds of afsd during
osi_file.c file ops, to avoid access errors in some cases.
However, in osi_UFSTruncate, one code path was missed to revert our
creds back to the original user's creds: when the afs_osi_Stat call
fails or deems the truncate unnecessary. In this case, the calling
process keeps the creds for afsd after osi_UFSTruncate returns,
causing our subsequent access-checking code to think that the current
process is in the same context as afsd (typically uid 0 without a
pag).
This can cause the calling process to appear to transiently have the
same access as non-pag uid 0; typically this will be unauthenticated
access, but could be authenticated if uid 0 has tokens.
To fix this, modify the early return in osi_UFSTruncate to go through
a 'goto done' destructor instead, and make sure we revert our creds in
that destructor.
Thanks to cwills@sinenomine.net for finding and helping reproduce the
issue.
Change-Id: I6820af675edcb7aa00542ba40fc52430d68c05e8
Reviewed-on: https://gerrit.openafs.org/14098
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Reviewed-by: Jeffrey Hutzelman <jhutz@cmu.edu>
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Tested-by: Cheyenne Wills <cwills@sinenomine.net>
Ever since commit f0774acd (Introduce TAP tests of man pages for
command_subcommand), we've had tests to check that we have man pages
for every subcommand in a command suite. This was done for several
command suites, including 'bos', and 'fs', but the bos and fs tests were
never added to the TESTS file.
Add them, so the tests run by default in a 'make check'. Fortunately,
the tests still pass today.
Change-Id: I90c006845d054fa3e795203bb1deff675e558622
Reviewed-on: https://gerrit.openafs.org/14073
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Rename ubik_dbase->flags to ubik_dbase->dbFlags, to make it easier to
distinguish between other fields and variables just called 'flags'.
Change-Id: I17258f9a65e989943d066307e332550d66ca7500
Reviewed-on: https://gerrit.openafs.org/13864
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Marcio Brito Barbosa <mbarbosa@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Commit e4ac552a (ubik: Introduce version lock) added UBIK_VERSION_LOCK
and version_data. The commit message mentions that holding either
UBIK_VERSION_LOCK or DBHOLD is enough to be able to read the protected
items and both locks must be held to modify them, but this isn't
mentioned in the actual code.
Add a comment explaining these locking rules, to make these rules
clearer to readers.
Change-Id: I715f89695add6d94e13d6ee1dc6addd1e748d3fd
Reviewed-on: https://gerrit.openafs.org/13863
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Marcio Brito Barbosa <mbarbosa@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
The configuration test for errqueue.h fails with an undefined structure
error on a Linux 3.17 (or higher) system. This prevents setting
HAVE_LINUX_ERRQUEUE_H, which is used to define AFS_RXERRQ_ENV.
Linux commit f24b9be5957b38bb420b838115040dc2031b7d0c (net-timestamp:
extend SCM_TIMESTAMPING ancillary data struct) - which was picked up in
linux 3.17 added a structure that uses the timespec structure. After
this commit, we need to include linux/time.h to pull in the definition
of the timespec struct.
Change-Id: Ifab79f8454c771276d5fdf443c4d68400b70134a
Reviewed-on: https://gerrit.openafs.org/13950
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Tested-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Log when urecovery_CheckTid aborts/ends a running remote transaction.
This is usually a rare event, occurring when some ubik sites get
"stuck" or confused about the state of the quorum. Logging some
details when this happens can be useful when investigating issues
post-mortem, or just to see why a transaction failed.
Change-Id: If0a7cd134aaac3722fe7214a1d8f0efab550ad11
Reviewed-on: https://gerrit.openafs.org/13862
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Marcio Brito Barbosa <mbarbosa@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
In OpenAFS 1.0, the way we made dbserver RPC calls was to pass the
relevant RPC and arguments to ubik_Call()/ubik_Call_New(), which
coerced all of the RPC arguments into 'long's. To make this more
typesafe, in commit 4478d3a9 (ubik-call-sucks-20060703) most callers
were converted to use ubik_RPC_name()-style calls, which used
functions autogenerated by rxgen.
This latter approach, however, only lets us use the ubik_Call-style
site selection code with RPCs processed by rxgen; we can't insert
additional code to run before or after the relevant RPC.
To make our dbserver calls more flexible, but avoid coercing all of
our arguments into 'long's again, move back to the ubik_Call()-style
approach, but use actual typed arguments with a callback function and
a rock. Call it ubik_CallRock().
With this commit rxgen still generates the ubik_RPC_name()-style
stubs, but the stubs just call ubik_CallRock with a generated callback
function, instead of spitting out the equivalent of ubik_Call() in the
generated code itself.
To try to ensure that this commit doesn't incur any unintended extra
changes, make ubik_CallRock consist of the generated code that was
inside rxgen before this commit. This is almost identical to
ubik_Call, but not quite; consolidating these two functions can happen
in a future commit if desired.
Change-Id: I0c3936e67a40e311bff32110b2c80696414b52d4
Reviewed-on: https://gerrit.openafs.org/13987
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
The time_t type and the structure timeval were removed for use in kernel
space code in Linux commits:
412c53a680a97cb1ae2c0ab60230e193bee86387
y2038: remove unused time32 interfaces
c766d1472c70d25ad475cf56042af1652e792b23
y2038: hide timeval/timespec/itimerval/itimerspec types
Add an autoconf test for the time_t type.
If time_t is missing, define the time_t type when building the kernel
module.
Change the vattr structure in LINUX/osi_vfs.h to use timespec/timespec64
instead of the timeval structure.
Conditionalize the definition of gettimeofday (needed by rand-fortuna.c) in
crypto/hcrypto/kernel/config.h. It is unused by the Linux kernel module
and the function uses struct timeval that is no longer available.
Change-Id: Idc9a1ded748f833d804164d29c49c9aee26ae8f5
Reviewed-on: https://gerrit.openafs.org/14083
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Currently, we build rand-fortuna-kernel.o for libafs on all platforms,
even though we only use the fortuna RNG on AIX, DragonFlyBSD, HP-UX,
and Irix. Everywhere else, our RAND_bytes() in
src/crypto/hcrypto/kernel/rand.c uses osi_readRandom() instead of
going through heimdal.
Building rand-fortuna.c causes occasional build headaches for the
kernel on Linux (see cc7f942, "LINUX: Disable kernel fortuna large
frame errors"). The most recent instance of this is that Linux 5.6
removes the definition for struct timeval, which is referenced in
rand-fortuna.c.
The Linux kernel is constantly changing, and so trying to keep
rand-fortuna.c building on Linux seems like a waste of ongoing effort.
So, just stop building rand-fortuna-kernel.o on Linux. The original
intent of building this file on all platforms was to avoid bitrot, so
still keep building rand-fortuna-kernel.o on all other platforms even
when it's not used; just avoid it on Linux specifically, the platform
that requires the most effort.
To accomplish this, move rand-fortuna-kernel.o from AFSAOBJS to
AFS_OS_OBJS, and remove it from the Linux-only AFSPAGOBJS.
Also remove our configure tests for -Wno-error=frame-larger-than=,
since they're no longer used by anything.
Change-Id: I0d5f14f9f6ba2bdd7391391180d32383b4da89ed
Reviewed-on: https://gerrit.openafs.org/14084
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Add a simple general-purpose in-memory cache implementation, called
opr_cache. Keys and values are simple flat opaque buffers (no complex
nested structures allowed), hashing is done with jhash, and cache
eviction is mostly random with some LRU bias.
Partly based off a different implementation by
mbarbosa@sinenomine.net.
Change-Id: I16b5988947ff603dfe31613cd7be3908a69264e5
Reviewed-on: https://gerrit.openafs.org/13884
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>