Currently, 'vos remsite' always prints the message "Deleting the
replication site for volume %lu ...", and then calls VDONE if the
operation is successful. VDONE prints the trailing "done", but only if
-verbose is turned on, and so if -verbose is not specified, the output
of 'vos remsite' looks broken:
$ vos remsite fs1 vicepa vol.foo
Deleting the replication site for volume 1234 ...Removed replication site fs1 /vicepa for volume vol.foo
To fix this, unconditionally print the trailing "done", instead of
going through VDONE, so 'vos remsite' output now looks like this:
$ vos remsite fs1 vicepa vol.foo
Deleting the replication site for volume 1234 ... done
Removed replication site fs1 /vicepa for volume vol.foo
Change-Id: I0b42f4cb9b695331bf047243bf6ae4a1cdbb89c4
Reviewed-on: https://gerrit.openafs.org/14127
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
GCC 10 changed a default flag from -fcommon to -fno-common. See
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85678 for some background.
The change in GCC 10 results in link-time build errors. For example:
../../src/xstat/.libs/liboafs_xstat_cm.a(xstat_cm.o):(.bss+0x2050):
multiple definition of `numCollections';
Ensure that only one definition exists for each global data object, and
change references to use "extern" as needed.
To ensure that future changes do not introduce duplicated global
definitions, add the -fno-common flag to XCFLAGS when using the
configure --enable-checking setting.
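As a minimal sketch of the pattern (the file layout in the comments is hypothetical; only the name numCollections comes from the error above), the header carries an "extern" declaration and exactly one source file carries the definition:

```c
/* --- would live in a shared header (hypothetical layout) --- */
extern int numCollections;      /* declaration only: allocates no storage */

/* --- would live in exactly one .c file --- */
int numCollections = 0;         /* the single definition */

/* Any other file that includes the header simply uses the object. */
int
add_collections_sketch(int n)
{
    numCollections += n;
    return numCollections;
}
```

With -fno-common, a second tentative definition of numCollections in another object file would be reported as a "multiple definition" error at link time instead of being silently merged.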
Change-Id: I6780dd995fe6fb6c2102765ff3484c18e1e1cd58
Reviewed-on: https://gerrit.openafs.org/14106
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Currently, the code in 'vos status' treats the 'iflags' and 'vflags'
of a transaction like an enumerated type; that is, we only check if
'iflags' is equal to ITOffline or ITBusy, etc. But both of these flags
fields are bitfields; any combination of the relevant flags could
theoretically be set.
Practically speaking, we only ever set at most one of the flags in
'iflags', but if anything ever did set more than one flag, our output
would look broken (we'd print "attachFlags:" without any flags).
For 'vflags', multiple flags are often set at once: the most common
combination is VTDeleteOnSalvage|VTOutOfService. So currently, we
usually print "attachFlags:" without any actual flags, since the
'vflags' field isn't exactly equal to VTDeleteOnSalvage (instead it's
set to VTDeleteOnSalvage|VTOutOfService). And if we ever did see just
VTDeleteOnSalvage set by itself, the way the switch() cases fall
through to each other, we'd print out that _all_ flags are set.
To fix all of this, just test for the individual flag bits instead.
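A minimal sketch of testing individual bits follows. The SKETCH_* values and the printed names are illustrative assumptions, not copied from the volser headers or from the real 'vos status' output:

```c
#include <stddef.h>
#include <string.h>

/* Illustrative bit values for this sketch only. */
#define SKETCH_VT_DELETE_ON_SALVAGE 0x1u
#define SKETCH_VT_OUT_OF_SERVICE    0x2u
#define SKETCH_VT_DELETED           0x4u

/* Append one flag name to buf, never writing past len bytes. */
static void
append_name_sketch(char *buf, size_t len, const char *name)
{
    size_t used = strlen(buf);
    if (used + 1 < len)
        strncat(buf, name, len - used - 1);
}

/* Build a list of every flag set in vflags, one independent test per bit. */
void
format_vflags_sketch(unsigned int vflags, char *buf, size_t len)
{
    if (len == 0)
        return;
    buf[0] = '\0';
    if (vflags & SKETCH_VT_DELETE_ON_SALVAGE)
        append_name_sketch(buf, len, "deleteOnSalvage ");
    if (vflags & SKETCH_VT_OUT_OF_SERVICE)
        append_name_sketch(buf, len, "outOfService ");
    if (vflags & SKETCH_VT_DELETED)
        append_name_sketch(buf, len, "deleted ");
}
```

Unlike a switch() on the whole field, a combined value such as deleteOnSalvage|outOfService prints both names, and a single flag prints only itself.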
Change-Id: Ib4d207bc713f0ef8eb51b9dbeaf2af50395536ee
Reviewed-on: https://gerrit.openafs.org/14126
Tested-by: Andrew Deason <adeason@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Move our preprocessor logic around d_path into an osi_compat.h
wrapper, called afs_d_path. This just makes it a little easier to use
d_path, and moves a tiny bit of #ifdef cruft away from real code.
Change-Id: I2032eda3fef18be6e77e3bf362ec5ce641e1d76d
Reviewed-on: https://gerrit.openafs.org/13721
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Currently, afs_syscall_pioctl handles the VIOCPREFETCH pioctl as a
special case, calling into a different code path to handle
backgrounding the prefetch operation. However, we detect that we're
handling a VIOCPREFETCH operation just by looking at the lower 8 bits
of the given opcode. This means that any pioctl that ends in 0x0F will
trigger this codepath, such as if we add a 'C' or 'O' pioctl that uses
code 0x0F.
We only want to catch VIOCPREFETCH requests for this code path, so fix
the check to also check if we're processing a 'V' pioctl.
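The difference between the two checks can be sketched as below. The opcode layout (low 8 bits for the code, next 8 bits for the group letter) and all SKETCH_* names are assumptions made for illustration, not the actual pioctl macros:

```c
/* Assumed layout for this sketch: bits 0-7 = code, bits 8-15 = group letter. */
#define SKETCH_OPCODE(group, code) \
    ((((unsigned int)(group)) << 8) | ((unsigned int)(code) & 0xffu))
#define SKETCH_PREFETCH_CODE 0x0Fu

/* Old check: looks only at the low 8 bits, so any group matches. */
int
is_prefetch_old_sketch(unsigned int com)
{
    return (com & 0xffu) == SKETCH_PREFETCH_CODE;
}

/* Fixed check: the code must match and the group must be 'V'. */
int
is_prefetch_fixed_sketch(unsigned int com)
{
    return ((com >> 8) & 0xffu) == (unsigned int)'V'
        && (com & 0xffu) == SKETCH_PREFETCH_CODE;
}
```

Under these assumptions, a hypothetical 'C' pioctl with code 0x0F is caught by the old check but not by the fixed one.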
Change-Id: Ica8c2364f96aa3c8b4d2213bebd9a1e4cb6fa730
Reviewed-on: https://gerrit.openafs.org/13301
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
The auth/superuser-t test runs an Rx server and client in two child
processes. If the client process tries to contact the server before
the server has started listening on its port, some tests involving
RPCs can fail (notably test 39, "Can run a simple RPC").
Normally if we try to contact a server that's not there, Rx will try
resending its packets a few times, but on Linux with AFS_RXERRQ_ENV,
if the port isn't open at all, we can get an ICMP_PORT_UNREACH error,
which causes the relevant Rx call to die immediately with
RX_CALL_DEAD.
This means that if the auth/superuser-t client is only just a bit
faster than the server starting up, tests can fail, since the server's
port is not open yet.
To avoid this, we can wait until the server's port is open before
starting the client process. To do this, have the server process send
a SIGUSR1 to the parent after rx_Init() is called, and have the parent
process wait for the SIGUSR1 (waiting for a max of 5 seconds before
failing). This should guarantee that the server's port will be open by
the time the client starts running.
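The handshake can be sketched roughly as follows. This is not the test's actual code: the function name is hypothetical, the child only stands in for the server, and the use of sigtimedwait() (available on Linux, not on every platform) is an assumption about one way to implement the bounded wait:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Fork a stand-in "server" child and wait up to timeout_secs for it to
 * report readiness with SIGUSR1. Returns 0 if the signal arrived in
 * time, -1 on timeout or error.
 */
int
start_server_and_wait_sketch(int timeout_secs)
{
    sigset_t set, oldset;
    struct timespec timeout;
    pid_t pid;
    int ready;

    /* Block SIGUSR1 before forking, so an early signal stays pending. */
    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &set, &oldset) != 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        sigprocmask(SIG_SETMASK, &oldset, NULL);
        return -1;
    }
    if (pid == 0) {
        /* Child: the real server would call rx_Init() before this point. */
        kill(getppid(), SIGUSR1);
        _exit(0);
    }

    /* Parent: wait for the notification, but give up after the timeout. */
    timeout.tv_sec = timeout_secs;
    timeout.tv_nsec = 0;
    ready = (sigtimedwait(&set, NULL, &timeout) == SIGUSR1);

    sigprocmask(SIG_SETMASK, &oldset, NULL);
    waitpid(pid, NULL, 0);
    return ready ? 0 : -1;
}
```

Blocking the signal before fork() matters: otherwise a fast child could deliver SIGUSR1 before the parent starts waiting, and the default action would terminate the parent.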
Note that before commit 086d1858 (LINUX: Include linux/time.h for
linux/errqueue.h), AFS_RXERRQ_ENV was mistakenly disabled on Linux
3.17+, so this issue was probably not possible on recent Linux before
that commit.
Change-Id: I0032a640b83c24f72c03e7bea100df5bc3d9ed4c
Reviewed-on: https://gerrit.openafs.org/14109
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Currently, when we release a lock, we set fields such as pid_writer to
0, to clear out any previous pid that was set. On Linux, the
pid_writer field is a pointer, and sparse(1) complains about using a
plain integer 0 in this way:
CHECK [...]/afs_axscache.c
[...]/afs_axscache.c:24:19: warning: Using plain integer as NULL pointer
[...]/afs_axscache.c:68:9: warning: Using plain integer as NULL pointer
[...]/afs_axscache.c:88:5: warning: Using plain integer as NULL pointer
[...]/afs_axscache.c:111:13: warning: Using plain integer as NULL pointer
[...]/afs_axscache.c:121:17: warning: Using plain integer as NULL pointer
[...]/afs_axscache.c:126:17: warning: Using plain integer as NULL pointer
[...]/afs_axscache.c:154:13: warning: Using plain integer as NULL pointer
[...]/afs_axscache.c:165:9: warning: Using plain integer as NULL pointer
This doesn't break anything, but it spews out quite a lot of warnings
when building with sparse(1) available. To just reduce this noise a
bit, assign these fields to actual NULL.
Since some other platforms do use a plain integer in these fields
(they are an actual pid), define 'MyPid_NULL' to use '0' or 'NULL'
depending on the platform. Define MyPid_NULL to NULL only on Linux;
this causes us to still assign 0 to a pointer on some platforms, but
Linux is the only one that complains, so only bother using NULL on
Linux for now.
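A minimal sketch of the idea, where SKETCH_LINUX, sketch_pid_t and the lock structure are hypothetical stand-ins for the real platform macros and lock types (only MyPid_NULL is named in this commit):

```c
#include <stddef.h>

/* SKETCH_LINUX stands in for whatever macro identifies the Linux build. */
#ifdef SKETCH_LINUX
struct task_sketch;
typedef struct task_sketch *sketch_pid_t;  /* pointer-typed "pid" */
# define MyPid_NULL NULL
#else
typedef unsigned int sketch_pid_t;         /* plain integer pid */
# define MyPid_NULL 0
#endif

struct lock_sketch {
    sketch_pid_t pid_writer;
};

/* Clear the recorded writer using the platform-appropriate "no pid" value. */
void
release_write_lock_sketch(struct lock_sketch *lock)
{
    lock->pid_writer = MyPid_NULL;
}
```

Callers never spell out 0 or NULL themselves, so the same release code stays warning-free whichever representation the platform uses.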
Change-Id: I35fcb896ceaa346c330622cfc2913b2975295836
Reviewed-on: https://gerrit.openafs.org/14108
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Commit 13ae3de3 (Add "brief" option to rxgen) added the -b option to
rxgen, which (among other things) makes rxgen stop including the name
of an RPC-L union type within its fields. That is, instead of this:
struct foo_type {
afs_int32 foo_tag;
union {
/* ... */
} foo_type_u;
};
rxgen -b generates this:
struct foo_type {
afs_int32 foo_tag;
union {
/* ... */
} u;
};
And all of the autogenerated XDR code is altered to use the 'u' field
instead of foo_type_u. However, if a 'default:' arm is defined in the
definition for the RPC-L union, the autogenerated XDR code still tries
to reference the non-brief name (e.g. foo_type_u). This causes a build
failure when actually trying to compile the generated .xdr.c, like so:
foo.xdr.c:809:39: error: 'foo_type' has no member named 'foo_type_u'
if (!xdr_bytes(xdrs, (char **)&objp->foo_type_u.xxx, &__len, FOO_MAX)) {
^
foo.xdr.c:812:11: error: 'foo_type' has no member named 'foo_type_u'
*(&objp->foo_type_u.xxx) = __len;
This happens because the portion of emit_union() that generates the
XDR code for the default arm wasn't updated to use a different
formatting string when 'brief_flag' is set, like the rest of
emit_union.
To fix this, just check for brief_flag and use 'briefformat'
accordingly, like the other code that checks for brief_flag.
Currently nothing in the tree uses the default arm of RPC-L unions
with 'rxgen -b', but external callers could, or our future code may do
so.
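The shape of the fix can be sketched as below. The function name and the exact format strings are assumptions modeled on the generated code shown above, not the literal contents of emit_union():

```c
#include <stdio.h>

/*
 * Build the expression used to reference a union member in generated
 * XDR code. With brief_flag set, the union field is just "u"; otherwise
 * it carries the type name ("<type>_u").
 */
int
format_union_member_sketch(char *buf, size_t len, const char *type_name,
                           const char *member, int brief_flag)
{
    if (brief_flag)
        return snprintf(buf, len, "objp->u.%s", member);
    return snprintf(buf, len, "objp->%s_u.%s", type_name, member);
}
```

The bug amounted to the default arm always taking the second branch, regardless of brief_flag.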
Change-Id: Ifcebfc48a3a64c68fee12ba0d177ae19b0956c58
Reviewed-on: https://gerrit.openafs.org/14107
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
The SVOTE_GetSyncSite RPC was intended to provide the IP address of the
current sync-site. Unfortunately, the RPC-L incorrectly defined ahost as
an input argument instead of an output argument. As a result, the IP
address in question is not returned to the callers of SVOTE_GetSyncSite.
Moreover, calls to this RPC must be made through connections associated
with the VOTE_SERVICE_ID. Sadly, the ubik_Call* functions call
SVOTE_GetSyncSite using connections associated with the USER_SERVICE_ID.
Consequently, the server getting this request returns RXGEN_OPCODE,
meaning that this RPC is not implemented by the service in question.
Since RPC arguments cannot be changed without causing compatibility
issues between different client / server versions and the RPC in
question is being called through the wrong service id, remove
SVOTE_GetSyncSite and its callers. Considering that in all versions of
OpenAFS calls to this RPC always return RXGEN_OPCODE, no behavior
change is introduced by this commit.
Also, remove the "chaseCount logic" from the ubik_Call* functions.
This logic prevents the loop counter from being moved backwards
indefinitely, which would result in an infinite loop. Without the
VOTE_GetSyncSite() calls, this counter cannot be moved backwards more
than once, so the logic is no longer needed.
Change-Id: Idd071583e8f67109e003f7a5675de02a235e5809
Reviewed-on: https://gerrit.openafs.org/14043
Reviewed-by: Marcio Brito Barbosa <mbarbosa@sinenomine.net>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Commit 48fbb45 (opr: Introduce opr_cache) added a new test (cache-t),
but did not update the .gitignore file for it.
Change-Id: I6de6130257a62f495ac942c05937eb109ce84a75
Reviewed-on: https://gerrit.openafs.org/14102
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
opr/softsig-t can produce a core file as part of its test.
Change-Id: I3bc7e587151e5915038e31887018889a7ffa6993
Reviewed-on: https://gerrit.openafs.org/14101
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
The vos convertROtoRW command converts a RO volume into a RW volume.
Unfortunately, the RO volume in question is not set as "out of service"
during this process. As a result, accesses to the volume being converted
can leave volume objects in an inconsistent state.
Consider the following scenario:
1. Create a volume on host_b and add replicas on host_a and host_b.
$ vos create host_b a vol_1
$ vos addsite host_b a vol_1
$ vos addsite host_a a vol_1
2. Mount the volume:
$ fs mkmount /afs/.mycell/vol_1 vol_1
$ vos release vol_1
$ vos release root.cell
3. Shut down dafs on host_b:
$ bos shutdown host_b dafs
4. Remove RO reference to host_b from the vldb:
$ vos remsite host_b a vol_1
5. Attach the RO copy by touching it:
$ fs flushall
$ ls /afs/mycell/vol_1
6. Convert RO copy to RW:
$ vos convertROtoRW host_a a vol_1
Notice that FSYNC_com_VolDone fails silently (FSYNC_BAD_STATE), leaving
the volume object for the RO copy set as VOL_STATE_ATTACHED (on success,
this volume should be set as VOL_STATE_DELETED).
7. Add replica on host_a:
$ vos addsite host_a a vol_1
8. Wait until the "inUse" flag of the RO entry is cleared (or force this
to happen by attaching multiple volumes).
9. Release the volume:
$ vos release vol_1
Failed to start transaction on volume 536870922
Volume not attached, does not exist, or not on line
Error in vos release command.
Volume not attached, does not exist, or not on line
To fix this problem, take the RO volume offline during the vos
convertROtoRW operation.
Change-Id: I1e417a026ed819fab4435e8992311fcd4f339341
Reviewed-on: https://gerrit.openafs.org/14066
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Commit 8632f23d67 introduced checks for
the return value of snprintf calls in namei_ops. On success, the value
returned by this function represents the number of written characters.
Unfortunately, the variable used to store this value is the same
variable that represents the status code returned by
namei_ConvertROtoRWvolume. Consequently, a successful execution of
namei_ConvertROtoRWvolume results in a status code different from 0 (and
equal to the number of written characters).
To fix this problem, set the status code in question back to 0 after a
successful execution of namei_ConvertROtoRWvolume.
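The pitfall and the fix can be sketched in a few lines. The function, the path format and the -1 error value are hypothetical; only the pattern (one variable shared between the snprintf result and the status code) comes from the description above:

```c
#include <stdio.h>

/*
 * Format a path and "convert" a volume. Returns 0 on success, -1 if the
 * path could not be formatted or did not fit.
 */
int
convert_sketch(char *path, size_t len, const char *dir, unsigned int volid)
{
    int code;

    /* On success, snprintf returns the number of characters written. */
    code = snprintf(path, len, "%s/%u", dir, volid);
    if (code < 0 || (size_t)code >= len)
        return -1;

    /* ... the conversion work would happen here ... */

    /* Without this reset, the character count would be returned as the status. */
    code = 0;
    return code;
}
```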
Change-Id: Ic6fd6483f8d94fd64587f8bae249b9d911d846b4
Reviewed-on: https://gerrit.openafs.org/14065
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
In osi_probe.c, the macro 'check_result' casts a pointer to an int which
on older Linux kernels (e.g. 2.6.18) produces several lines with the C
warning:
... warning: cast from pointer to integer of different size
Change the cast from int to long int.
Linux 2.6.18 doesn't provide intptr_t or uintptr_t, and stdint.h is not
available to kernel modules. But the size of a pointer is the size of a
long (see uintptr_t in linux/types.h - Linux 2.6.24+), so
change the cast from int to long.
Note that this code by default only gets pulled in for older Linux
kernels (e.g. 2.6.18). For newer kernels, ENABLE_LINUX_SYSCALL_PROBING
is not defined, and so most of osi_probe.c is not built.
Change-Id: If1b41e11c46f4a14ff5127ed4d602485645ddf2a
Reviewed-on: https://gerrit.openafs.org/14092
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Commit cd3221d3 (Linux: use override_creds when available) caused us
to force the current process's creds to the creds of afsd during
osi_file.c file ops, to avoid access errors in some cases.
However, in osi_UFSTruncate, one code path was missed to revert our
creds back to the original user's creds: when the afs_osi_Stat call
fails or deems the truncate unnecessary. In this case, the calling
process keeps the creds for afsd after osi_UFSTruncate returns,
causing our subsequent access-checking code to think that the current
process is in the same context as afsd (typically uid 0 without a
pag).
This can cause the calling process to appear to transiently have the
same access as non-pag uid 0; typically this will be unauthenticated
access, but could be authenticated if uid 0 has tokens.
To fix this, modify the early return in osi_UFSTruncate to go through
a 'goto done' destructor instead, and make sure we revert our creds in
that destructor.
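A userspace sketch of the 'goto done' pattern follows. The credential state and every *_sketch name are hypothetical stand-ins; the real code uses the kernel's override_creds()/revert_creds() and afs_osi_Stat:

```c
/* Hypothetical stand-in for the current task's credentials. */
static int current_creds_id = 1000;     /* the calling user */
enum { SKETCH_AFSD_CREDS = 0 };

static int
override_creds_sketch(void)
{
    int old = current_creds_id;
    current_creds_id = SKETCH_AFSD_CREDS;
    return old;
}

static void
revert_creds_sketch(int old)
{
    current_creds_id = old;
}

int
current_creds_sketch(void)
{
    return current_creds_id;
}

/* Every exit path, including the early ones, goes through 'done'. */
int
truncate_sketch(int stat_fails, long cur_size, long new_size)
{
    int code = 0;
    int old = override_creds_sketch();

    if (stat_fails) {
        code = -1;
        goto done;              /* previously: a bare return, creds leaked */
    }
    if (cur_size <= new_size)
        goto done;              /* truncate unnecessary */

    /* ... the actual truncate would happen here ... */

 done:
    revert_creds_sketch(old);
    return code;
}
```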
Thanks to cwills@sinenomine.net for finding and helping reproduce the
issue.
Change-Id: I6820af675edcb7aa00542ba40fc52430d68c05e8
Reviewed-on: https://gerrit.openafs.org/14098
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Reviewed-by: Jeffrey Hutzelman <jhutz@cmu.edu>
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Tested-by: Cheyenne Wills <cwills@sinenomine.net>
Ever since commit f0774acd (Introduce TAP tests of man pages for
command_subcommand), we've had tests to check that we have man pages
for every subcommand in a command suite. This was done for several
command suites, including 'bos' and 'fs', but the bos and fs tests were
never added to the TESTS file.
Add them, so the tests run by default in a 'make check'. Fortunately,
the tests still pass today.
Change-Id: I90c006845d054fa3e795203bb1deff675e558622
Reviewed-on: https://gerrit.openafs.org/14073
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Rename ubik_dbase->flags to ubik_dbase->dbFlags, to make it easier to
distinguish between other fields and variables just called 'flags'.
Change-Id: I17258f9a65e989943d066307e332550d66ca7500
Reviewed-on: https://gerrit.openafs.org/13864
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Marcio Brito Barbosa <mbarbosa@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Commit e4ac552a (ubik: Introduce version lock) added UBIK_VERSION_LOCK
and version_data. The commit message mentions that holding either
UBIK_VERSION_LOCK or DBHOLD is enough to be able to read the protected
items and both locks must be held to modify them, but this isn't
mentioned in the actual code.
Add a comment explaining these locking rules, to make these rules
clearer to readers.
Change-Id: I715f89695add6d94e13d6ee1dc6addd1e748d3fd
Reviewed-on: https://gerrit.openafs.org/13863
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Marcio Brito Barbosa <mbarbosa@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
The configuration test for errqueue.h fails with an undefined structure
error on a Linux 3.17 (or higher) system. This prevents setting
HAVE_LINUX_ERRQUEUE_H, which is used to define AFS_RXERRQ_ENV.
Linux commit f24b9be5957b38bb420b838115040dc2031b7d0c (net-timestamp:
extend SCM_TIMESTAMPING ancillary data struct), which was picked up in
Linux 3.17, added a structure that uses the timespec structure. After
this commit, we need to include linux/time.h to pull in the definition
of the timespec struct.
Change-Id: Ifab79f8454c771276d5fdf443c4d68400b70134a
Reviewed-on: https://gerrit.openafs.org/13950
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Tested-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Log when urecovery_CheckTid aborts/ends a running remote transaction.
This is usually a rare event, occurring when some ubik sites get
"stuck" or confused about the state of the quorum. Logging some
details when this happens can be useful when investigating issues
post-mortem, or just to see why a transaction failed.
Change-Id: If0a7cd134aaac3722fe7214a1d8f0efab550ad11
Reviewed-on: https://gerrit.openafs.org/13862
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Marcio Brito Barbosa <mbarbosa@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
In OpenAFS 1.0, the way we made dbserver RPC calls was to pass the
relevant RPC and arguments to ubik_Call()/ubik_Call_New(), which
coerced all of the RPC arguments into 'long's. To make this more
typesafe, in commit 4478d3a9 (ubik-call-sucks-20060703) most callers
were converted to use ubik_RPC_name()-style calls, which used
functions autogenerated by rxgen.
This latter approach, however, only lets us use the ubik_Call-style
site selection code with RPCs processed by rxgen; we can't insert
additional code to run before or after the relevant RPC.
To make our dbserver calls more flexible, but avoid coercing all of
our arguments into 'long's again, move back to the ubik_Call()-style
approach, but use actual typed arguments with a callback function and
a rock. Call it ubik_CallRock().
With this commit rxgen still generates the ubik_RPC_name()-style
stubs, but the stubs just call ubik_CallRock with a generated callback
function, instead of spitting out the equivalent of ubik_Call() in the
generated code itself.
To try to ensure that this commit doesn't incur any unintended extra
changes, make ubik_CallRock consist of the generated code that was
inside rxgen before this commit. This is almost identical to
ubik_Call, but not quite; consolidating these two functions can happen
in a future commit if desired.
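The callback-and-rock approach can be sketched as below. None of these names or signatures are the real ubik or rxgen ones; servers are plain ints and the "RPC" is a stub, purely to show how typed arguments travel through an opaque rock:

```c
#include <stddef.h>

/* Hypothetical callback type: runs one RPC against one server. */
typedef int (*call_rock_func_sketch)(int server, void *rock);

/* Generic site selection: try each server until the callback succeeds. */
int
call_rock_sketch(const int *servers, size_t nservers,
                 call_rock_func_sketch proc, void *rock)
{
    int code = -1;
    size_t i;

    for (i = 0; i < nservers; i++) {
        code = proc(servers[i], rock);
        if (code == 0)
            break;
    }
    return code;
}

/* Typed arguments for one hypothetical RPC, bundled into the rock. */
struct get_entry_args_sketch {
    int id;         /* input */
    int result;     /* output */
};

/* What a generated callback might look like. */
int
get_entry_cb_sketch(int server, void *rock)
{
    struct get_entry_args_sketch *args = rock;

    if (server != 2)
        return -1;  /* pretend only server 2 answers */
    args->result = args->id * 10;
    return 0;
}

/* What a generated stub might look like: typed signature, no 'long' coercion. */
int
get_entry_stub_sketch(const int *servers, size_t nservers, int id, int *result)
{
    struct get_entry_args_sketch args = { id, 0 };
    int code = call_rock_sketch(servers, nservers, get_entry_cb_sketch, &args);

    if (code == 0)
        *result = args.result;
    return code;
}
```

Because the generic function only sees a function pointer and a void*, a caller could also pass a hand-written callback that runs extra code before or after the RPC.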
Change-Id: I0c3936e67a40e311bff32110b2c80696414b52d4
Reviewed-on: https://gerrit.openafs.org/13987
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
The time_t type and the timeval structure are no longer available to
kernel-space code as of the following Linux commits:
412c53a680a97cb1ae2c0ab60230e193bee86387
y2038: remove unused time32 interfaces
c766d1472c70d25ad475cf56042af1652e792b23
y2038: hide timeval/timespec/itimerval/itimerspec types
Add an autoconf test for the time_t type.
If time_t is missing, define the time_t type when building the kernel
module.
Change the vattr structure in LINUX/osi_vfs.h to use timespec/timespec64
instead of the timeval structure.
Conditionalize the definition of gettimeofday (needed by rand-fortuna.c) in
crypto/hcrypto/kernel/config.h. It is unused by the Linux kernel module
and the function uses struct timeval that is no longer available.
Change-Id: Idc9a1ded748f833d804164d29c49c9aee26ae8f5
Reviewed-on: https://gerrit.openafs.org/14083
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Currently, we build rand-fortuna-kernel.o for libafs on all platforms,
even though we only use the fortuna RNG on AIX, DragonFlyBSD, HP-UX,
and Irix. Everywhere else, our RAND_bytes() in
src/crypto/hcrypto/kernel/rand.c uses osi_readRandom() instead of
going through heimdal.
Building rand-fortuna.c causes occasional build headaches for the
kernel on Linux (see cc7f942, "LINUX: Disable kernel fortuna large
frame errors"). The most recent instance of this is that Linux 5.6
removes the definition for struct timeval, which is referenced in
rand-fortuna.c.
The Linux kernel is constantly changing, and so trying to keep
rand-fortuna.c building on Linux seems like a waste of ongoing effort.
So, just stop building rand-fortuna-kernel.o on Linux. The original
intent of building this file on all platforms was to avoid bitrot, so
still keep building rand-fortuna-kernel.o on all other platforms even
when it's not used; just avoid it on Linux specifically, the platform
that requires the most effort.
To accomplish this, move rand-fortuna-kernel.o from AFSAOBJS to
AFS_OS_OBJS, and remove it from the Linux-only AFSPAGOBJS.
Also remove our configure tests for -Wno-error=frame-larger-than=,
since they're no longer used by anything.
Change-Id: I0d5f14f9f6ba2bdd7391391180d32383b4da89ed
Reviewed-on: https://gerrit.openafs.org/14084
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Add a simple general-purpose in-memory cache implementation, called
opr_cache. Keys and values are simple flat opaque buffers (no complex
nested structures allowed), hashing is done with jhash, and cache
eviction is mostly random with some LRU bias.
Partly based off a different implementation by
mbarbosa@sinenomine.net.
Change-Id: I16b5988947ff603dfe31613cd7be3908a69264e5
Reviewed-on: https://gerrit.openafs.org/13884
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Currently, afs_osi_suser is declared with a void* argument, even
though its only argument is always effectively an afs_ucred_t*. This
allows us to call afs_osi_suser with any pointer type without the
compiler complaining. Currently, some callers call afs_osi_suser with
an incorrectly-typed afs_ucred_t** instead, like so:
func(afs_ucred_t **acredpp)
{
afs_ucred_t **acred = *acredpp; /* incorrect assignment */
if (afs_osi_suser(acred)) {
/* ... */
}
}
The actual code in the tree hides this to some degree behind various
function calls and layers of indirection (e.g. afs_suser()), but this
is effectively what we do. This causes compiler warnings because we
are doing incorrect pointer assignments, but the end result works
because afs_osi_suser actually uses an afs_ucred_t*.
The type confusion makes it very easy to accidentally give the wrong
type to afs_osi_suser. This only really matters on SOLARIS, since that
is the only platform that actually uses its argument to
afs_osi_suser().
To fix all of this, just declare afs_osi_suser as taking an
afs_ucred_t*, and fix all of the relevant functions to handle the
right type.
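For comparison, a correctly typed version of the example above might look like this sketch, where ucred_sketch_t and the *_sketch functions are hypothetical stand-ins for afs_ucred_t and the real callers:

```c
/* Hypothetical stand-in for afs_ucred_t. */
typedef struct ucred_sketch {
    int uid;
} ucred_sketch_t;

/* Declared with the real pointer type, so the compiler can check callers. */
int
suser_sketch(ucred_sketch_t *cred)
{
    return cred->uid == 0;
}

int
func_sketch(ucred_sketch_t **acredpp)
{
    ucred_sketch_t *acred = *acredpp;   /* correct: one level of indirection removed */

    return suser_sketch(acred);
}
```

With the typed declaration, passing a ucred_sketch_t** to suser_sketch() would draw a compiler diagnostic instead of compiling silently as it did with void*.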
Change-Id: I1366aedf0f3d7689735a9424c5272233931e3bf2
Reviewed-on: https://gerrit.openafs.org/14085
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
When the OpenAFS kernel module is loaded, it creates certain entries
in the "proc" filesystem. One of those entries is "CellServDB". If we
read "/proc/fs/openafs/CellServDB" without starting "afsd", the kernel
crashes with a NULL pointer dereference. The crash happens because
CellLRU has not been initialized yet: since "afsd" has not started,
afs_CellInit has not been called, so the "next" and "prev" pointers
are NULL. Inside "c_start()" we do not check for NULL pointers while
traversing CellLRU, and this causes the crash.
To avoid this, initialize CellLRU during module initialization.
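The underlying issue can be sketched with a generic circular queue. The structure and function names are hypothetical, not the afs_q or CellLRU definitions:

```c
#include <stddef.h>

struct queue_sketch {
    struct queue_sketch *next, *prev;
};

/* File-scope storage starts out zeroed: next and prev are NULL. */
struct queue_sketch cell_lru_sketch;

/* An empty circular queue points at itself. */
void
queue_init_sketch(struct queue_sketch *q)
{
    q->next = q->prev = q;
}

/*
 * Count the entries by walking until we are back at the head. This is
 * only safe once the head has been initialized; with a zeroed head the
 * loop would dereference NULL.
 */
int
queue_count_sketch(const struct queue_sketch *head)
{
    const struct queue_sketch *p;
    int n = 0;

    for (p = head->next; p != head; p = p->next)
        n++;
    return n;
}
```

Initializing the head at module load time means a traversal before afsd starts simply sees an empty list.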
Change-Id: I21cbc0e016b384f0ab456c05087384b6ed986b0d
Reviewed-on: https://gerrit.openafs.org/14093
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Remove traces of the old shlibrpc and shlibafsauthent build directories,
which are no longer needed since the conversion to libtool for building
shared libraries.
Change-Id: I8dbfdf9908b4a5527470b7cb4b969e7a160cdd51
Reviewed-on: https://gerrit.openafs.org/14045
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Replace the old and poorly maintained "SOURCE-MAP" file with a
Markdown-formatted README.md file. Try to organize the directories in sections
to hopefully make a more useful guide to the source code and build
directories.
Thanks to Cheyenne Wills and Benjamin Kaduk for suggestions.
Change-Id: I50f58aa99453bc3412b60a7591d6957cfa83b5b1
Reviewed-on: https://gerrit.openafs.org/14003
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Commit 93b26c6f55 added the cellservDB
field to the afsconf_dir structure to track the CellServDB pathname.
This commit also changed the afsconf_SetCellInfo() and
afsconf_SetExtendedCellInfo() functions to use the new cellservDB member
to open the CellServDB file.
Unfortunately, the bosserver intentionally calls afsconf_SetCellInfo()
with a NULL afsconf_dir pointer when attempting to create the default
CellServDB and ThisCell files (e.g., "localcell"), which causes the
bosserver to crash on startup when the cell configuration is not present.
Fix this by calling the static function to look up the CellServDB
pathname when an afsconf_dir data object is not given.
Change-Id: I8d36f7c8afe6b4e13bfd04c421bf1109d1eb4238
Reviewed-on: https://gerrit.openafs.org/14061
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Change the signature of the _afsconf_CellServDBPath() static function to
take just the base directory name of the CellServDB file instead of the
entire afsconf_dir data object. This makes it clear we do not need other
members of the afsconf_dir structure to compose the CellServDB path.
Change-Id: I57509b2ca09123e78df5533d63494c66b5b24cdf
Reviewed-on: https://gerrit.openafs.org/14076
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Move afsconf_SetCellInfo() and afsconf_SetExtendedCellInfo() to the
cellconfig.c file with the other afsconf_dir functions.
Retire the now empty writeconfig.c file. At one point in the distant
past afsconf_SetCellInfo() did not have an afsconf_dir argument, so it
probably made sense to have a separate file to write the configuration.
Later, the afsconf_dir argument was added to afsconf_SetCellInfo() and
afsconf_SetExtendedCellInfo() to reset the auth cache, so these functions
are now better placed in cellconfig.c.
Note the contents of writeconfig.c were moved verbatim (including
comments), so this commit should have no functional changes.
Change-Id: Idff76f0d2dfa2383a8617373f0e38235a94f20f1
Reviewed-on: https://gerrit.openafs.org/14075
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Like we do for opr_cv_t, define an opr_mutex_t to be a plain int, to
allow opr mutexes to be defined easily without ifdef guards.
Change-Id: Ib90017ac098ebc68ffd89890d448aabb2321f63e
Reviewed-on: https://gerrit.openafs.org/13886
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Linux may perform some access control checks at the time of an I/O
operation, rather than relying solely on checks done when the file is
opened. In some cases (e.g. AppArmor), these checks are done based on
the current task's creds at the time of the I/O operation, not those
used when the file was opened.
Because of this, we must use override_creds() / revert_creds() to make
sure we are using privileged credentials when performing I/O operations
on cache files. Otherwise, cache I/O operations done in the context of
a task with a restrictive AppArmor profile will fail.
Change-Id: Icbe60874c348d6cd92b0a186d426918b0db9b0f9
Reviewed-on: https://gerrit.openafs.org/13751
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
The server processes will happily start without keys and then fail all
authenticated access, including database synchronization and local
commands with -localauth. At least issue warnings to let admins know
the keys are missing and that akeyconvert or asetkey needs to be run.
The situation is not helped by the fact that the filenames of the key
files have changed between versions. In 1.6.x the (non-DES) keys were in the
rxkad.keytab file and in later versions they are in the KeyFile* files,
so if you are used to 1.6.x it is not obvious what is wrong.
Change-Id: Iff7fe9a5a5a0f5ea1f4e227d3f6129658f8eb598
Reviewed-on: https://gerrit.openafs.org/13911
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
The command-line help for several OpenAFS servers lists an inaccurate
description for the --enable_peer_stats option:
"enable RX transport statistics"
Improve the help description to be more clear and consistent with the
description for --enable-process-stats.
Introduced by the following commits:
cd3492d volser: Convert command line parsing to cmd
a5effd9 viced: Use libcmd for command line options
461603e vlserver: Use libcmd for command line parsing
0b9986c ptserver: Use libcmd for command line parsing
Change-Id: Ibe23c61d4b838f3a3185390b18d25494fffde2ca
Reviewed-on: https://gerrit.openafs.org/14072
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
The Linux commit d56c0d45f0e27f814e87a1676b6bdccccbc252e9
(proc: decouple proc from VFS with "struct proc_ops") was merged into
Linux 5.6-rc1. The commit replaces the 'file_operations' parameter for
proc_create with a new structure 'proc_ops'.
Conditionally initialize and use proc_ops structures instead of
file_operations structures for calls to proc_create.
Notes:
* proc_ops.proc_ioctl is equivalent to file_operations.unlocked_ioctl
* The macros HAVE_UNLOCKED_IOCTL and HAVE_COMPAT_IOCTL are both
hardcoded to 1 in Linux's fs.h
* proc_ops.compat_ioctl is conditional on Linux's CONFIG_COMPAT macro
which is a separate test from the HAVE_COMPAT_IOCTL macro
Change-Id: I8570ca499696b4c31b381543107453fbfe355376
Reviewed-on: https://gerrit.openafs.org/14063
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
The grep pattern that checks if /etc/synthetic.conf already has an entry
for afs is intended to check if this file holds a single column entry
named afs. Unfortunately, the current version does not completely
enforce this restriction. To fix this problem, add anchors to the grep
pattern in question.
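As a rough illustration of why the anchors matter, the sketch below
compares an unanchored and an anchored pattern using POSIX regcomp().
The patterns "afs" and "^afs$" and the helper name line_matches are
assumptions made for this example; they are not the exact pattern or
code used by this change.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Return 1 if 'line' matches the POSIX basic regular expression
 * 'pattern', 0 if it does not, and -1 if the pattern fails to compile.
 * This mimics how grep tests one line of a file at a time.
 */
static int
line_matches(const char *pattern, const char *line)
{
    regex_t re;
    int rc;

    if (regcomp(&re, pattern, REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, line, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

With these example patterns, line_matches("afs", "afs.backup") reports
a match even though the line is not a single-column "afs" entry, while
line_matches("^afs$", "afs.backup") does not.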
Change-Id: I15a1fa1c250027b7d3ab67e686cbfbae853251a2
Reviewed-on: https://gerrit.openafs.org/14062
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Yadavendra Yadav <yadayada@in.ibm.com>
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Commit 3be5880d1d 'afs: Avoid panics in
afs_InvalidateAllSegments' is correct, but at least one compiler (gcc
4.3.4 on SLES 11.3) is fooled into issuing a warning:
[...]/afs_segments.c: In function 'afs_InvalidateAllSegments_once':
[...]/afs_segments.c:506: error: 'dcListCount' may be used uninitialized in this function
To silence the bogus warning, initialize dcListCount when defined.
Change-Id: I5938c85c71d08ed61ec1f69a50afb19c9b31fa82
Reviewed-on: https://gerrit.openafs.org/14048
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
The UV_RenameVolume() function first updates the volume name in the
VLDB, then the read-write volume header and backup volume header, and
finally all of the read-only volume headers. If this function is
interrupted or a remote site is not reachable, the names in some of the
volume headers will be out of sync with the name in the VLDB entry.
The implementation of UV_RenameVolume() is idempotent, so it can be safely
called with the same name as in the volume's VLDB entry. This could be
used to bring all the names in the volume headers in sync with the name
in the VLDB.
Unfortunately, due to the check of the -newname parameter, vos
rename will not invoke UV_RenameVolume() when the name in the VLDB has
already been changed. The vos rename command attempts to verify the
desired name (-newname) is available before invoking UV_RenameVolume()
by simply checking if a VLDB entry exists with that name, and
incorrectly assumes when a VLDB entry exists with that name it is an
entry for a different volume.
Change the -newname check to allow vos rename to proceed when the name has
already been set in the VLDB entry of the volume being renamed. This
allows admins to run the vos rename command to complete a previously
incomplete rename operation and bring the names in the volume headers in
sync with the name in the VLDB entry.
Note: Before this commit, administrators could work around this vos
rename limitation by renaming the volume twice, first to an unused
volume name, then to the actual desired volume name.
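The relaxed -newname check can be sketched roughly as follows. The
struct demo_vldb_entry type and the rename_may_proceed helper are
simplified, hypothetical stand-ins for illustration only; they are not
the real VLDB entry structure or the code in vos.

```c
#include <stddef.h>

/* Simplified, hypothetical stand-in for a VLDB entry. */
struct demo_vldb_entry {
    unsigned int rw_id;   /* read-write volume id */
    const char *name;     /* volume name recorded in the VLDB */
};

/*
 * Decide whether a rename may proceed. 'old_entry' is the entry of the
 * volume being renamed. 'new_entry' is the result of looking up the
 * requested new name, or NULL if no entry has that name.
 * Returns 1 if the rename may proceed, 0 otherwise.
 */
static int
rename_may_proceed(const struct demo_vldb_entry *old_entry,
                   const struct demo_vldb_entry *new_entry)
{
    if (new_entry == NULL)
        return 1;   /* the new name is unused */
    if (new_entry->rw_id == old_entry->rw_id)
        return 1;   /* same volume: finish an earlier, incomplete rename */
    return 0;       /* the name belongs to a different volume */
}
```

The second branch is the case this change adds: finding an entry with
the new name is no longer treated as a conflict when that entry belongs
to the volume being renamed.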
Remove the useless checks of the code1 return code after exit in
the RenameVolume() function. These checks for code1 are never performed
since the function exits early when the first VLDB_GetEntryByName()
fails for any reason.
Update the vos rename man page to show that vos rename can be used to
fix a previously interrupted or failed rename. Also document that the
-oldname parameter accepts a numeric volume id to specify the volume to
be renamed.
Change-Id: Ibb5dbe3148e9b8295347925a59cd7bdbccbe8fe0
Reviewed-on: https://gerrit.openafs.org/13720
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
uss_procs_PickADir needs a larger buffer to avoid a truncation warning.
While here, replace some magic numbers with existing symbols.
Change-Id: If981dddfa50bdbc8c4730cf8038429f071b1d5be
Reviewed-on: https://gerrit.openafs.org/14049
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
The vos tests start a temporary vlserver process, which is problematic
when the local system already has an installed vlserver. Attempt to
temporarily bind a socket to the vlserver port, and if unable to bind
with an EADDRINUSE error, assume the vlserver is already running and
skip these tests.
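A minimal sketch of such a probe is shown below, assuming a UDP port
and IPv4. The function name udp_port_in_use is an assumption made for
this example and is not necessarily how the test code is organized.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Probe whether a UDP port is already taken on this host.
 * 'port' is in host byte order. Returns 1 if bind() fails with
 * EADDRINUSE, 0 if the bind succeeds, and -1 on any other error.
 */
static int
udp_port_in_use(unsigned short port)
{
    struct sockaddr_in addr;
    int sock;
    int result;

    sock = socket(AF_INET, SOCK_DGRAM, 0);
    if (sock < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(sock, (struct sockaddr *)&addr, sizeof(addr)) == 0)
        result = 0;
    else if (errno == EADDRINUSE)
        result = 1;
    else
        result = -1;

    /* This was only a probe, so release the port again. */
    close(sock);
    return result;
}
```

A test could call udp_port_in_use(7003), 7003 being the port
conventionally used by the vlserver, and skip itself when the result
is 1. The probe is inherently racy, since another process may bind the
port between the check and the start of the temporary server.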
Change-Id: I1dd3bc4c7ebcd2c7bffc8aca422222a50058090e
Reviewed-on: https://gerrit.openafs.org/14021
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
The function osi_VMDirty_p is mentioned in a few places in src/afs,
but it has always been ifdef'd or commented out, ever since OpenAFS
1.0. Remove the dead code.
Change-Id: Ia7cad718114d91adf9e403e29f9ac976c3f08bfd
Reviewed-on: https://gerrit.openafs.org/14023
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
This weird write() call exists to work around some old AIX-specific
bug. The ifdef looks like it is intended to restrict this to pre-5
AIX, but it also turns this on for all non-AIX platforms.
Make this area AIX-specific, to avoid this weird write on other
platforms that have nothing to do with the relevant workaround.
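The general shape of the problem can be sketched with two preprocessor
conditions. DEMO_AIX_ENV and DEMO_AIX5_ENV are hypothetical macros
standing in for "building on AIX" and "building on AIX 5 or later";
they are not the macros used in the actual source, and the second
condition is only one way the check could be narrowed.

```c
/*
 * Too broad: true on pre-5 AIX, but also true on every platform that
 * is not AIX at all, because DEMO_AIX5_ENV is not defined there either.
 */
#if !defined(DEMO_AIX5_ENV)
# define DEMO_WORKAROUND_BROAD 1
#else
# define DEMO_WORKAROUND_BROAD 0
#endif

/*
 * Narrowed: true only when building on AIX, and only for releases
 * before 5.
 */
#if defined(DEMO_AIX_ENV) && !defined(DEMO_AIX5_ENV)
# define DEMO_WORKAROUND_AIX_ONLY 1
#else
# define DEMO_WORKAROUND_AIX_ONLY 0
#endif
```

On a non-AIX build, where neither macro is defined, the first condition
enables the workaround and the second does not.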
Change-Id: I092bcadb4ecc6277ae01e44e6a957e6bacc0cf2d
Reviewed-on: https://gerrit.openafs.org/14022
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
The vos-t test adds a set of 10.* test addresses to a test vlserver and
runs vos to read them back. When the test is run in an environment
where hosts have been assigned in the 10.* internal network, vos will resolve
the addresses to hostnames and the test fails. Pass the -noresolve
option to vos for this test when checking for the expected list of
addresses.
Example test output before this commit:
./vos-t
...
# seen: 10.0.0.0
10.0.0.1
myhost.example.com
10.0.0.3
...
not ok 5 - vos output matches
Change-Id: Ief43fe180a0dfff211f28d5f47be6224270907a3
Reviewed-on: https://gerrit.openafs.org/14020
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Declare our vnode and vfs operations as static functions, since they
are not referenced outside of osi_vfsops.c/osi_vnodeops.c. Shuffle
around the definitions in osi_vnodeops.c so that we don't need forward
declarations for the functions.
Change-Id: Idbbe05a8b248ac29c2795c365be6a4e99da536dd
Reviewed-on: https://gerrit.openafs.org/13973
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
According to <https://www.freebsd.org/security/unsupported.html>,
FreeBSD 8.x EoL was on August 1, 2015, and FreeBSD 9.x EoL was on
December 31, 2016. Remove our support for these versions, since they
haven't been supported by FreeBSD itself for a while.
FreeBSD 10.x EoL was on October 31, 2018, which has passed, but was
less than a year ago. So keep 10.x in for now.
Adjust our preprocessor checks accordingly:
- In FBSD-specific dirs, assume AFS_FBSD100_ENV and lower is always
true. Assume __FreeBSD_version is always at least 1000000.
- In non-FBSD dirs, convert AFS_FBSD100_ENV and lower to AFS_FBSD_ENV.
Change-Id: I965e65d3b95573bb374661217b24b686c7b68ed2
Reviewed-on: https://gerrit.openafs.org/13842
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Commit 68f40643 (Build tests by default) added new targets in our
top-level Makefile that caused us to effectively run
'cd tests && make' as part of the default build. Since no explicit
target is provided, 'make' tries to build the first target in the
given Makefile. On some platforms (such as *BSD), 'make' finds the
first defined target as a pattern rule (%.c) from our included
makefiles, and tries to build the target %.c, which it cannot do. This
causes the build to fail with:
cd tests && make
make[3]: don't know how to make %.c. Stop
To fix this, just explicitly build the 'all' target when we build our
tests by default.
Change-Id: I319271482685ec35087c470d95fdcaec6e1d8c47
Reviewed-on: https://gerrit.openafs.org/13993
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Currently, if we encounter an error and 'goto out' after starting the
test vlserver, we'll exit without stopping the test vlserver. This can
confuse the test harness, causing 'runtests' to hang forever.
To avoid this, move the afstest_StopServer() call to also run when
we're bailing out, but only if the server has actually started of
course.
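The cleanup pattern described above looks roughly like this. The
demo_start and demo_stop functions are hypothetical stand-ins for
starting and stopping the test server, and run_test is an invented
driver used only to show the control flow.

```c
/* Hypothetical stand-ins for starting and stopping a test server. */
static int demo_running = 0;

static int
demo_start(void)
{
    demo_running = 1;
    return 0;
}

static void
demo_stop(void)
{
    demo_running = 0;
}

/*
 * Run a test that may fail before or after the server is started.
 * Returns 0 on success and a nonzero code on failure. The server is
 * stopped on every exit path, but only if it was actually started.
 */
static int
run_test(int fail_before_start, int fail_after_start)
{
    int started = 0;
    int code = 0;

    if (fail_before_start) {
        code = 1;
        goto out;
    }

    code = demo_start();
    if (code != 0)
        goto out;
    started = 1;

    if (fail_after_start) {
        code = 2;
        goto out;
    }

    /* ... the actual test steps would run here ... */

 out:
    if (started)
        demo_stop();
    return code;
}
```

Tracking 'started' separately keeps the bail-out path from trying to
stop a server that never came up.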
Change-Id: Ice5a56c20bc8d2eac85b3e760850c4d85e4601a8
Reviewed-on: https://gerrit.openafs.org/13992
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Currently, in tests/volser/vos-t.c we call afs_com_err as
"authname-t", which is clearly a mistake during some code refactoring
(introduced in commit 2ce3fdc5, "tests: Abstract out code to produce a
Ubik client").
We could just change this to "vos-t", but instead of specifying
constant strings everywhere, change this to figure out what the
current command is called, and just use that. Put this code into a new
function, afstest_GetProgname, and convert existing tests to use that
instead of hard-coding the program name given to afs_com_err.
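One plausible way to derive the program name is to take the last path
component of argv[0], as sketched below. The helper name
progname_from_argv and the "unknown" fallback are assumptions made for
this example; the real afstest_GetProgname may behave differently.

```c
#include <string.h>

/*
 * Return the final path component of argv[0], for use as the program
 * name in error messages. Falls back to "unknown" when argv[0] is not
 * available. The returned pointer refers into argv[0] and must not be
 * freed.
 */
static const char *
progname_from_argv(char **argv)
{
    const char *slash;

    if (argv == NULL || argv[0] == NULL)
        return "unknown";

    slash = strrchr(argv[0], '/');
    if (slash != NULL && slash[1] != '\0')
        return slash + 1;
    return argv[0];
}
```

A test invoked as ./tests/volser/vos-t would then report errors under
the name "vos-t" without that string being hard-coded in the test.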
Change-Id: I3ed02c89f93798568783c7d717e8fb2e39dcce14
Reviewed-on: https://gerrit.openafs.org/13991
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>