I'd like the source tree to stop deadnaming me, so, sharing this change to do it
Change-Id: Iee65d1c8e7e695ea939485db5b148615e052f953
Reviewed-on: https://gerrit.openafs.org/12394
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Some libc implementations will crash when NULL string arguments are given to
*printf. Avoid passing NULL string arguments in the make check tests that did
so, and pass the string "(null)" instead.
Change-Id: I65f11a3eef88d1c7b210c867ae0c40018160f55a
Reviewed-on: https://gerrit.openafs.org/12377
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
afsd crashes after the usage is displayed with the -help option.
$ afsd -help
Usage: ./afsd [-blocks <1024 byte blocks in cache>] [-files <files in cache>]
...
Segmentation fault (core dumped)
The backtrace shows the crash occurs when calling afsconf_Open() with an
invalid pointer argument, even though afsconf_Open() is not even needed
when -help is given.
(gdb) bt
#0 __strlen_sse2 () at ../sysdeps/x86_64/multiarch/../strlen.S:32
#1 0x00007ffff726fc36 in *__GI___strdup (s=0x0) at strdup.c:42
#2 0x0000000000408383 in afsconf_Open (adir=0x0) at cellconfig.c:444
#3 0x00000000004054d5 in afsd_run () at afsd.c:1926
#4 0x0000000000407dc5 in main (argc=2, argv=0x7fffffffe348) at afsd_kernel.c:577
afsconf_Open() is called with an uninitialized pointer because commit
d72df5a18e changed the libcmd
cmd_Dispatch() to return 0 after displaying the command usage when the
-help option is specified. (That fix was needed for scripts which use
the -help option to inspect command options with the -help option.)
The afsd_kernel main function then incorrectly calls the afsd_run()
function, even though mainproc() was not called, which sets up the afsd
option variables. The afsconf_Open() is the first function we call in
afsd_run().
Commit f77c078a29 split afsd into afsd.c
and afsd_kernel.c to support libuafs (and fuse). This split the parsing
of the command line arguments and the running of the afsd command into
two functions. The mainproc(), which originally did both, was split
into two functions; one (still called mainproc) to check the option
values given and setup/auto-tune values, and another (called afsd_run)
to do the actual running of the afsd command. The afsd_parse() function
was introduced as a wrapper around cmd_Dispatch() which "dispatches"
mainproc.
With this fix, take the opportunity to rename mainproc() to the now more
accurately named CheckOptions() and change afsd_parse() to parse the
command line options with cmd_Parse(), instead of abusing
cmd_Dispatch().
Change the main fuction to avoid running afsd_run() when afsd_parse()
returns the CMD_HELP code which indicates the -help option was given.
afsd.fuse splits the command line arguments into afsd recognized options
and fuse options (everything else), so only afsd recognized arguments
are passed to afsd_parse(), via uafs_ParseArgs(). The -help argument is
processed as part of that splitting of arguments, so afsd.fuse never
passes -help as an argument to afsd_parse(). This means we to not need
to check for CMD_HELP as a return value from uafs_ParseArgs(). But
since this is all a bit confusing, at least check the return value in
uafs_ParseArgs().
Change-Id: If510f8dc337e441c19b5e28685e2e818ff57ef5a
Reviewed-on: https://gerrit.openafs.org/12360
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Commit fd23587a5d was done to fix an oops
when parent_vcache_dv() was called without the GLOCK held. Since the
lockless code paths have been removed, and parent_vcache_dv() is always
called with the GLOCK held, revert the extra locked flag argument and
the calls obtain and release the GLOCK within parent_vcache_dv().
Change-Id: I21c3272ec4ed5d4fa1a746a0f783cccfc14e0c22
Reviewed-on: https://gerrit.openafs.org/12354
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
This reverts commit 3ecd65d337.
This commit made it possible to execute afs_linux_dentry_revalidate
without taking the GLOCK under some circumstances. However, it
achieved this by examining structure members outside of the GLOCK that
were previously only examined under the GLOCK (such as vcp->f.states
and vcp->f.m.DataVersion).
While that does of course improve performance, it is not known to be
completely safe. Revert this commit so we may implement a fastpath
through afs_linux_dentry_revalidate using more trusted lockless
techniques (atomics, RCU, etc).
Change-Id: Ia3ca2cf53f97244e4e548db7c1caf218c16aca5c
Reviewed-on: https://gerrit.openafs.org/11793
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Add a static assert macro, for asserting that certain build-time
expressions are true.
Change-Id: I33b0e7168f041e8e8406710d05689e044af45fad
Reviewed-on: https://gerrit.openafs.org/11792
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Several different places in the codebase change avc->f.m.DataVersion
for a particular vcache, when we've noticed that the DV for the vcache
has changed. Consolidate all of these occurrences into a single
afs_SetDataVersion function, to make it easier to change what happens
when we notice a change in DV number.
This should incur no behavior change; it is just simple code
reorganization.
Change-Id: I5dbf2678d3c4b5a2fbef6ef045a0b5bfa8a49242
Reviewed-on: https://gerrit.openafs.org/11791
Reviewed-by: Marc Dionne <marc.c.dionne@gmail.com>
Reviewed-by: Daria Phoebe Brashear <shadow@your-file-system.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Reviewed-by: Thomas Keiser <tkeiser@gmail.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Currently, when we need to contact all other servers in the ubik
quorum (to create a write transaction, and send db changes, etc), we
call the ContactQuorum_* family of functions. To contact each server,
those functions follow an algorithm like the following pseudocode:
{
int rcode = 0;
int code;
int okcalls = 0;
for (ts = ubik_servers; ts; ts = ts->next) {
if (ts->up) {
code = contact_server(ts);
if (code) {
rcode = code;
} else {
okcalls++;
}
}
}
if (okcalls + 1 >= ubik_quorum) {
return 0;
} else {
return rcode;
}
}
This means that if we successfully contact a majority of ubik sites,
we return success, even if some sites returned an error. If most sites
fail, then we return an error (we arbitrarily pick the last error we
got).
This means that in most situations, a successful write transaction is
guaranteed to have been transmitted to a majority of ubik sites, so
the written data cannot be lost (at least one of the sites that got
the new data will be in a future elected quorum).
However, if a site is already known to be down (ts->up is 0), then we
skip trying to contact that site, but we also don't set any errors.
This means that if a majority of sites are already known to be down
(ts->up is 0), then we can indicate success for a write transaction,
even though the relevant data has not been written to a majority of
sites. In that situation, it is possible to lose data.
Most of the time this is not possible, since a majority of sites must
be 'up' for the sync site to be elected and to allow write
transactions at all. There are a few ways, though, in which we can get
into a situation where most other sites are 'down', but we still let a
write transaction go through.
An example scenario:
Say we have sites A, B, and C. All 3 sites come up at the same time,
and A is the lowest IP so it starts an election (after around BIGTIME
seconds). Right after A is elected the sync site, sites B and C will
have 'lastYesState' set to 0, since site A hasn't yet sent out a
beacon as the sync site.
A client can then start a write to the ubik database on site A, which
site A will allow since it's the sync site (and presumably all the
relevant recovery flags are set). Site A will try to contact sites B
and C for a DISK_Begin call, but lastYesState is set to 0 on those
sites. This will cause DISK_Begin to return UNOQUORUM
(urecovery_AllBetter will return 0, because uvote_HaveSyncAndVersion
will return 0, because lastYesState is not set).
So site A will get a UNOQUORUM error from sites B and C, and so site A
will set 'ts->up' to 0 for sites B and C, and will return UNOQUORUM to
the client. The client may then try to retry the call (because
UNOQUORUM is not treated as a 'global' error in ubikclient.c's
ubik_Call_New), or another client write request could come in. Now
that 'ts->up' is unset for both sites B and C, we skip trying to
contact any remote sites, and the ContactQuorum functions will return
success. So the ubik write will go through successfully, but the new
data will only be on site A.
At this point, if site A crashes, then sites B and C will elect a
quorum, and will not have the modifications that were written to site
A (so the data written to site A is lost). If site A stays up, then it
will go through database recovery, sending the entire database file to
sites B and C.
In addition, it's very possible in this scenario for a client to write
to the database, and then try to read back data and confusingly get a
different result. For example, if someone issues the following two
commands while triggering the above scenario:
$ pts createuser testuser
$ pts examine testuser
If the second command contacts site B or C, then it will always fail,
saying that the user doesn't exist (even though the first command
succeeded). This is because sites B and C don't have the new data
written to site A, at least temporarily. While this confusing behavior
is not completely avoidable in ubik (this can always happen
'sometimes' due to network errors and such), with the scenario
described here, it happens 100% of the time.
The general scenario described above can also happen if sites B and C
are suddenly legitimately unreachable from site A, instead of throwing
the UNOQUORUM error. All of the steps are pretty much the same, but
there is a bit of a delay while we wait for the DISK_Begin call to
fail.
To fix this, do not let 0 be returned if a quorum has not been
reached. In some sense, UNOQUORUM could *always* be returned in
that case, but it is more in keeping with historical behavior to
return a "real" error if there is one available.
It is somewhat questionable whether we should even be propagating
errors received from calls like DISK_Begin/DISK_Commit to the ubik
client (e.g. if we get a -1 from trying to contact a remote site, we
return -1 to the client, so the client may think it couldn't reach the
site at all). But this commit does not change any of that logic, and
should only change behavior when a majority of sites have 'ts->up'
unset. A later commit might effect the change to always return
UNOQUORUM and ignore the actual error values from the DISK_ calls,
but that is not needed to fix the immediate issue.
An important note:
Before this commit, there was a window of about 15 seconds after a
sync site is elected where a write to the ubik db would appear to be
successful, but would only modify the ubik db on the sync site.
(Details described above.) With this commit, writes during that
15-second window will instead fail, because we cannot guarantee that
we won't lose that data. If someone relies on 'udebug' data from the
sync site to let them know when writes will go through successfully,
this commit could appear to cause new errors.
[kaduk@mit.edu: transfer long commit message describing the issue
from an alternative fix, and tidy up accordingly]
Change-Id: If6842d7122ed4d137f298f0f8b7f20350b1e9de6
Reviewed-on: https://gerrit.openafs.org/12289
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
In numerous different places in the code, we do something like this to
mark a vcache as stale:
ObtainWriteLock(&afs_xcbhash, somenumber);
avc->f.states &= ~CStatd;
afs_DequeueCallback(avc);
ReleaseWriteLock(&afs_xcbhash);
if (avc->f.fid.Fid.Vnode & 1 || (vType(avc) == VDIR))
osi_dnlc_purgedp(avc);
There are some variations here and there, but all locations usually
involve at least some code like that. But they all do the same general
thing: invalidate a vcache so we hit the net the next time we need
that vcache.
In order to make it easier to modify what happens when we invalidate a
vcache, and just to improve the code, take all of these instances and
put the functionality in a single function, called afs_StaleVCache,
which marks the vcache as 'stale'.
To handle a few different situations that must be handled, we have
some flags that can also be passed to the new function. These are
primarily necessary to handle variations in the circumstances under
which we hit this code path; for instance, we may already have
afs_xcbhash locked, or we may be invalidating the entire osidnlc (if
we're invalidating vcaches in bulk, for example).
This should result in the same general behavior in all cases. The only
slight differences in a few cases is that we hold locks for a few more
operations than we used to; for example, we may clear an osidnlc entry
while holding the vcache lock. But these are minor and shouldn't
result in any actual differences in behavior.
So, this commit should just be code reorganization and should incur no
behavior change. However, this reorganization is complex, and should
not be considered a simple risk-free refactoring.
[kaduk@mit.edu: implement Tom Keiser's suggestion of a third argument
to afs_StaleVCacheFlags, add AFS_STALEVC_CLEARCB and
AFS_STALEVC_SKIP_DNLC_FOR_INIT_FLUSHED]
Change-Id: I2b2f606c56d5b22826eeb98471187165260c7b91
Reviewed-on: https://gerrit.openafs.org/11790
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Update the style guide with a declaration of the prevailing and
preferred brace style for one-line if statements and loops. Provide an
example and counter-example.
Change-Id: Iafeea977203b76c0e67385779fb4ed57f3c6699a
Reviewed-on: https://gerrit.openafs.org/12370
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
The vldb is rechecked when the fileserver returns certain error codes,
such as VMOVED. When the vldb is rechecked, update the volume
setupTime to reflect the most recent time the volume vldb information
is known to be correct.
Be sure the VRecheck flag is cleared after checking the vldb, since
the volume write lock was dropped after finding the volume.
Change-Id: I0ba389ee408de602e0059fbe8013012501c337d3
Reviewed-on: https://gerrit.openafs.org/11897
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
The functionality in AFS_LINUX26_ONEGROUP_ENV does not really need to
be Linux-specific (it's just only implemented for Linux right now).
Rename it to AFS_PAG_ONEGROUP_ENV, and remove some Linux-specific
checks when checking for "onegroup" PAG GIDs.
[mmeffie@sinenomine.net: Move AFS_PAG_ONEGROUP_ENV to param.h]
Change-Id: I01d29fff309337ae95b9b6c65db3d2212cf4bf89
Reviewed-on: https://gerrit.openafs.org/11978
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Define the number of groups per PAG in one place. Prefix the define
with AFS_ to avoid name conflicts in the future (unlikely as it may be).
Fix the misnamed AFSPAGGGROUPS symbol in linux implementation of two
groups per PAG.
Change-Id: I78bb42913f2a5d84c9f323f17dc36d800d8acb84
Reviewed-on: https://gerrit.openafs.org/12382
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
This commit adds the afsd -inumcalc command line switch to specify the
inode number calculation method in a platform neutral way.
Inode numbers reported for files within the AFS filesystem are generated
by the cache manager using a calculation which derives a number from a
FID. Long ago, a new type of calculation was added which generates inode
numbers using a MD5 message digest of the FID. The MD5 inode number
calculation variant is computationally more expensive but greatly
reduces the chances for inode number collisions.
The MD5 calculation can be enabled on the Linux cache manager using the
Linux sysctl interface. Other than the sysctl method of selecting the
inode calculation type, the MD5 inode number calculation method is not
specific to Linux.
This change introduces a command-line option which accepts a value to
indicate the calculation method, instead of a simple flag to enable MD5
inode numbers. This should allow for new inode calculation methods
in the future without the need for additional afsd command-line flags.
Two values are currently accepted for -inumcalc. The value of 'compat'
specifies the legacy inode number calculation. The value 'md5' indicates
that the new MD5 calculation is to be used.
Change-Id: I0257c68ca1a32a7a4c55ca8174a4926ff78ddea4
Reviewed-on: https://gerrit.openafs.org/11855
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Document the preferred style for multiple line comment blocks and give an
example.
Change-Id: I73d6183da9014a943316e5aea1d43be2acc81ad7
Reviewed-on: https://gerrit.openafs.org/12361
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: Benjamin Kaduk <kaduk@mit.edu>
Pick some fairly old versions of clang and gcc and document them
as the minimum supported version. This will let us make assumptions
about compiler features that are available when using those compilers.
Change-Id: Ibb8df72c9b12cc7adff39ece9708a428975ba703
Reviewed-on: https://gerrit.openafs.org/12331
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Linux v4.7-rc1~124^2~2^2^2~9 adds an eighth optional argument
restrict_link. The same commit adds a KEY_ALLOC_BYPASS_RESTRICTION
macro, which we test so we can avoid adding another configure test.
Change-Id: I83e27b54ba5711124dccaa41de7155be77054f47
Reviewed-on: https://gerrit.openafs.org/12345
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Anders Kaseorg <andersk@mit.edu>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Many Solaris programs and utilities (notably mdb and cp) use mmap() in
their implementation. When AFS files exceeding 4GiB are mmap'd, the
contents of the file will be incorrectly mapped into memory. Starting at
4GiB + 1, the first 4GiB will be repeated for the remainder of the file.
If the mmap'd file is written back to storage (AFS or otherwise), the
newly created file will also be corrupted.
This is due to a bug in the afs_map() routine that supports mmap() of
AFS files on Solaris. The segvn_crarg.offset passed to the Solaris
virtual memory APIs is incorrectly cast to u_int, causing it to wrap at
4GiB.
Although Solaris passes the offset from fop_map() to afs_map() as type
offset_t, the destination segvn_crargs.offset is actually type
u_offset_t. Existing examples of other Solaris filesystems (e.g.
zfs_map() ) cast the offset from offset_t to u_offset_t when assigning to
segvn_crargs.offset. If it's good enough for ZFS, it's good enough for
AFS.
Correctly cast the offset to u_offset_t.
Thanks to Robert Milkowski for the report and diagnosis.
Change-Id: Id25363255ec011f2ad7e003ca3e4a1385bebff7e
Reviewed-on: https://gerrit.openafs.org/12292
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
When mmap() is issued for exactly 4GiB of a large AFS-resident file,
mmap() fails with ENOMEM. This is because the AFS code is handling the
requested length as u_int instead of size_t, resulting in a 0 being
passed back to the caller.
When mmap() is issued for non-multiples of 4GiB, the subsequent mapping
will not contain all the requested pages, and for the same reason - the
mapped size has been truncated to 32 bits. This results in SIGSEGV when
accessing the non-mapped page(s).
Fix the signature of afs_map() to specify the correct type for the length.
Thanks to Robert Milkowski for the report and diagnosis.
Change-Id: I8a9f0cb04ff9b80de5516e14d0679b06ef0b3f9a
Reviewed-on: https://gerrit.openafs.org/12291
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Tested-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
The automatically generated pkgbuild.sh file should not be tracked by
git. To fix this problem, add the name of this file to the proper
.gitignore file.
Change-Id: I9bdbad8e7cc02926de61e337ccb94d8a2c27ae43
Reviewed-on: https://gerrit.openafs.org/12343
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: Benjamin Kaduk <kaduk@mit.edu>
The brief description was identical to the one for afs_Analyze.
Update it to accurately describe afs_ClearStatus.
Change-Id: I70ceca41342c1b47950c35f567f8ae5a2566f925
Reviewed-on: https://gerrit.openafs.org/12005
Reviewed-by: Perry Ruiter <pruiter@sinenomine.net>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Currently, the ubik recovery code will always set UBIK_RECFOUNDDB
during recovery, after asking all other sites for their dbversions.
This happens regardless of how many sites we were actually able to
successfully contact, even if we couldn't contact any of them.
This can cause problems when we are unable to contact a majority of
sites with DISK_GetVersion. Since, if we haven't contacted a majority
of sites, we cannot say with confidence that we know what the best db
version available is (which is what UBIK_RECFOUNDDB represents; that
we've found which database is the one we should be using). This can
also result in UBIK_RECHAVEDB in a similar situation, indicating that
we have the best db version locally, even though we never actually
asked anyone else what their db version was.
For example, say site A is the sync site going through recovery, and
DISK_GetVersion fails for the only other sites B and C. Site A will
then set UBIK_RECFOUNDDB, and will claim that site A has the best db
version available (UBIK_RECHAVEDB). This allows site A to process ubik
write transactions (causing the db to be labelled with a new epoch),
or possibly to send the db to the other sites via DISK_SendFile, if
they quickly become available during recovery. Ubik write transactions
can succeed in this situation, because our ContactQuorum_* calls will
succeed if we never try to contact a remote site ('rcode' defaults to
0).
This situation should be rather rare, because normally a majority of
sites must be reachable by site A for site A to be voted the sync site
in the first place. However, it is possible for site A to lose
connectivity to all other sites immediately after sync site election.
It is also possible for site A to proceed far enough in the recovery
process to set UBIK_RECHAVEDB before it loses its sync site status.
As a result of all of this, if a site with an old database comes
online and there are network connectivity problems between the other
sites and a ubik write request comes in, it's possible for the "old"
database to overwrite the "new" database. This makes it look as if the
database has "rolled back" to an earlier version.
This should be possible with any ubik database, though how to actually
trigger this bug can change due to different ubik servers setting
different network timeouts. It is probably the most likely with the
VLDB, because the VLDB is typically the most frequently written
database.
If a VLDB reverts to an earlier version, it can result in existing
volumes to appear to not exist in the VLDB, and can result in new
volumes re-using volume IDs from existing volumes. This can result in
rather confusing errors.
To fix this, ensure that we have contacted a majority of sites with
DISK_GetVersion before indicating that we have located the best db
version. If we've contacted a majority of sites, then we are
guaranteed (under ubik assumptions) that we've found the best version,
since previous writes to the database should be guaranteed to hit a
majority of sites (otherwise they wouldn't be successful).
If we cannot reach a majority of sites, we just don't set
UBIK_RECFOUNDDB, and the recovery process restarts. Presumably on the
next iteration we'll be able to contact them, or we'll lose sync site
status if we can't reach the other sites for long enough.
Change-Id: I84f745b5e017bb62d93b538dbc9c7de845bee1bd
Reviewed-on: https://gerrit.openafs.org/12281
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Currently, vlserver calls rx_SetRxDeadTime to set the default rx
deadtime to 50 seconds, but it does so after calling
ubik_ServerInitByInfo. ubik_ServerInitByInfo creates several rx
connections before it returns, and so these connections get the
default rx deadtime (12 seconds), instead of the 50 seconds vlserver
tries to set.
When ubik detects that a remote site is down, ubik recreates the rx
connections for that site, and this new connection gets the new
deadtime of 50 seconds.
This means that ubik behavior can have different timings in the
vlserver, depending on if any remote sites have ever been detected as
being 'down' or not. This can result in seemingly-inconsistent or
confusing behavior, since some sequences of operations that appear
identical can produce different results, depending on if the 12-second
timeout or the 50-second timeout is being used.
This behavior is not directly to blame for any problems, but it can be
very confusing, especially when trying to diagnose or reproduce bugs.
So to make things more consistent, just call rx_SetRxDeadTime earlier,
so all conns always get the 50-second timeout.
In order to do this, though, we must also ensure that rx_Init is
called before rx_SetRxDeadTime (otherwise, rx_Init will overwrite our
configured deadtime). So also call rx_Init earlier; rx_Init is
idempotent, so it's okay that it may be called again after or before
this.
Note that vlserver is currently the only ubik server that sets a
deadtime of 50 seconds, and it's not clear why. Another way to solve
this is to just remove the call to rx_SetRxDeadTime, to make vlserver
behave more similar to ptserver. But this commit takes a conservative
approach to result in a deadtime that is probably the most common in
current use. Since, most long-running vlservers will probably
eventually lose contact with remote sites at one time or another, and
so will eventually use a deadtime of 50 seconds.
Change-Id: I49430144d9a62eb8cad1509c1aeafc9fcc927f8e
Reviewed-on: https://gerrit.openafs.org/12285
Tested-by: Andrew Deason <adeason@dson.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
PackageMaker is no longer part of OS X. As a result, it
is not possible to build the package on OS X 10.10 and
OS X 10.11 using the existing code.
To solve this problem, a new script, along with a couple
of new files, are provided.
- pkgbuild.sh
This script uses the command line tools pkgbuild and
productbuild to build the package on OS X 10.10 and
OS X 10.11. By default, the package built by this
script will not be signed. Optionally, the package
might be signed.
- Distribution.xml
This file is nothing more than an XML file used by
productbuild. It is mainly used to configure how the
installer will look and behave.
- conclusion.txt
Contains the text that is displayed by Installer at
the end of the installation process. Only used by
El Capitan and further.
- Uninstall.14.15
This script can be used by OS X 10.10/10.11 users
to uninstall OpenAFS.
Notes:
- This work is based on a patch made by Brandon Allbery
<ballbery@sinenomine.net> with fixes and updates from
Andrew Deason <adeason@dson.org>.
- El Capitan and further prevent us from touching
/usr/bin directly. As a result, /opt is used.
- If the package is not signed, the user will have
to disable the OS X security protections. Otherwise,
the client will not work.
- Now we have two different scripts to build the
package on OS X. For OS X 10.10 and newer versions,
pkgbuild.sh will be used. For older versions,
the existing buildpkg.sh will be used.
Change-Id: If8320666c553b82af450c0263f5e80a00c33e3b8
Reviewed-on: https://gerrit.openafs.org/12239
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
In order to avoid some warning messages, do not
ignore the code returned by some functions.
Change-Id: Ie01fa98b54010d566fb5b980b001d58989ef9a67
Reviewed-on: https://gerrit.openafs.org/12298
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
The prname type is represented in XDR as a vector[PR_MAXNAMELEN]
of char, not as a string, which means that the XDR (de)serializer
will not guarantee null-termination. Guarantee that all buffers
used in the public protection server API are in fact valid strings
by disallowing any names that are exactly PR_MAXNAMELEN (64)
characters long. DO NOT silently truncate names that are even
longer than this. Consistently use the prname typedef in
declarations to reinforce the length limitation to those reading
the header file. Introduces a new protection error code,
PRNAMETOOLONG, which will be returned if either IN or OUT parameters
would exceed the limit.
[kaduk@mit.edu convert macro to static_inline function and expand
at call sites; add string_ wrapper to add checking to viced and libadmin;
export the string_ wrapper from libafsauthent for the windows build]
Change-Id: I65f850afcfea2fd2bc0110ca7b7f6ecca247dd58
Reviewed-on: https://gerrit.openafs.org/7896
Reviewed-by: Chas Williams <3chas3@gmail.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
This is an automatic patch generated by Coccinelle (spatch) from the commit message of the linked commit:
09cbfeaf1a
We will not add an autoconfig test because the PAGE_{...} macros should exist
where the PAGE_CACHE_{...} were previously.
The spatch used:
@@
expression E;
@@
- E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E
@@
expression E;
@@
- E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E
@@
@@
- PAGE_CACHE_SHIFT
+ PAGE_SHIFT
@@
@@
- PAGE_CACHE_SIZE
+ PAGE_SIZE
@@
@@
- PAGE_CACHE_MASK
+ PAGE_MASK
@@
expression E;
@@
- PAGE_CACHE_ALIGN(E)
+ PAGE_ALIGN(E)
@@
expression E;
@@
- page_cache_get(E)
+ get_page(E)
@@
expression E;
@@
- page_cache_release(E)
+ put_page(E)
Change-Id: Iabe29b1349ab44282c66c86eced9e5b2056c9efb
Reviewed-on: https://gerrit.openafs.org/12297
Reviewed-by: Michael Laß <lass@mail.uni-paderborn.de>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Tested-by: Stephan Wiesand <stephan.wiesand@desy.de>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
By default, makesrpm.pl will use wget to retrieve the CellServDB
as specified in the spec file. Even though the script need not and
thus should not be run by a privileged UID, make this a bit more
secure by specifying an https URL.
Change-Id: I0f14bbac35e7dc30a6e194f8706f7f3674d15a3f
Reviewed-on: https://gerrit.openafs.org/12329
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: Benjamin Kaduk <kaduk@mit.edu>
The value assigned to HAVE_PAM should not be capitalized.
If so, the PAM source files will not be compiled.
To fix this problem, convert to lowercase one of the values
assigned to HAVE_PAM.
Change-Id: I4973394f8d398bbea0f578fadb04aedee6fd1fc0
Reviewed-on: https://gerrit.openafs.org/12296
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Since OpenAFS 1.0, the struct volume accessTime member has been the time
time the volume structure is setup, not the last time the volume was
used (as indicated by the comments). This time stamp is only used to
find the oldest available volume slot in the disked backed volume cache.
(Perhaps in pre-OpenAFS this was updated each time the volume was
referenced.)
Rename this structure member and update the comments for it.
Change-Id: I33a6371e8800b2d0f7b2700db0785fc365a8649e
Reviewed-on: https://gerrit.openafs.org/11896
Reviewed-by: Perry Ruiter <pruiter@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Commit a0f416e350 unconditionally turned
on the new ubik_BeginTransReadAnyWrite functionality for the vlserver,
which allows us to read data from ubik during a conflicting ubik write
lock.
This feature is not ready for production use. Make it a build time
option, marked as experimental, until more testing can be done.
Change-Id: If64702e7a7ed2340066df5faf82ce8b0875fc610
Reviewed-on: https://gerrit.openafs.org/12240
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Don't mention it in the man pages.
Change-Id: I8a6d706f055545642116af5a98fa8c04f533b990
Reviewed-on: https://gerrit.openafs.org/11529
Reviewed-by: Marcio Brito Barbosa <mbarbosa@sinenomine.net>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
The dynroot volumes are synthetic, so do not need to be reset every time
the background daemon checks the volumes.
The results of osi_Time() is a signed 32-bit integer, and the volume
expireTime is an signed 32-bit integer, so use signed 32-bit integers
for the expiry check.
Change-Id: Ib92157686c1d8b84a63d409cb148155705953b6d
Reviewed-on: https://gerrit.openafs.org/11895
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
rxconn was missing from the comments; add it.
Change-Id: I8c0cf212ca2952d3a23c3bb5db1857dfd9a8f41e
Reviewed-on: https://gerrit.openafs.org/12004
Reviewed-by: Perry Ruiter <pruiter@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
While here, de-conflict the numbers for 10.0/10.1 and 7.2/7.3
Change-Id: I87697587359a26258298f4710c7232bea417f807
Reviewed-on: https://gerrit.openafs.org/12321
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
The ability to set the size of the volume hash table was added
at the same time that DAFS was introduced, and got caught up
in the same preprocessor conditional. However, -vhashsize can
be useful for the traditional fileserver as well (even though
we recommend DAFS over the traditional fileserver), so let it
be used in that case.
Update the man pages accordingly and fix some grammar while here.
Noted by Mark Vitale.
Change-Id: Ic3282c9d661d60cf36f9ffb197e723a3f71da167
Reviewed-on: https://gerrit.openafs.org/12287
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
The fs getserverprefs command displays preference
ranks for file / volume location server machine
interfaces. In order to get the complete set of
preference ranks, the VIOC_GETSPREFS system call
might have to be called several times. If so, the
memory previously allocated should be released.
Change-Id: I8491117ead626e70aac40343923d52284f274efd
Reviewed-on: https://gerrit.openafs.org/12315
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Linux commit 5955102c, in preparation for future work, introduced
wrapper functions to lock/unlock inode mutexes. This is to
prepare for converting it to a read-write semaphore, so that
lookup can be done with only the shared lock held.
Adopt the afs_linux_*lock_inode() functions accordingly, and
convert afs_linux_fsync() to using those wrappers, since the
FOP_FSYNC_TAKES_RANGE case appears to be the current case.
Amusingly, afs_linux_*lock_inode() already have a branch to
handle the case when inode serialization is protected by a
semaphore; it seems that this is going to come full-circle.
Change-Id: Ia5a194acc559de21808655ef066151a0a3826364
Reviewed-on: https://gerrit.openafs.org/12268
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Joe Gorse <jhgorse@gmail.com>
Tested-by: Joe Gorse <jhgorse@gmail.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
In linux commit 6b255391, the follow_link inode operation was
replaced by the get_link operation, which is basically the same
but takes the inode and dentry separately, allowing for the
possibility of staying in RCU mode.
For now, only support this if page_get_link is available and we are
using the USABLE_KERNEL_PAGE_SYMLINK_CACHE
The previous test for USABLE_KERNEL_PAGE_SYMLINK_CACHE used a bogus,
undefined configure variable (ac_cv_linux_kernel_page_follow_link).
Remove it, as it was not needed
Change-Id: I2d7851d31dd4b1b944b16fad611addb804930eca
Reviewed-on: https://gerrit.openafs.org/12265
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Joe Gorse <jhgorse@gmail.com>
Tested-by: Joe Gorse <jhgorse@gmail.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Symlink bodies in the pagecache should not be in highmem, as
upstream converted in commit 21fc61c73.
Change-Id: I1e4c3c51308df096cdfa4d5e7b16279e275e7f41
Reviewed-on: https://gerrit.openafs.org/12264
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Joe Gorse <jhgorse@gmail.com>
Tested-by: Joe Gorse <jhgorse@gmail.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Add a -s2scrypt option to the volume server, with possible options:
* never -- the existing behavior
* always -- switch to using afsconf_ClientAuthSecure, which uses
rxkad_crypt, for ForwardVolume calls.
* inherit -- encrypt inter-server traffic if the causal client
connection is encrypted. This has the effect of "inheriting" the
"-encrypt" flag given to "vos release", for example.
Thanks to Jeffrey Altman for pointers and to Andrew Deason for noting
the existence of rxkad_GetServerInfo.
[mmeffie@sinenomine.net fix assertion and style update.]
Change-Id: Ia295ba3f29a8494c8250a480fb26594468d2116a
Reviewed-on: https://gerrit.openafs.org/11349
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Thomas Keiser <tkeiser@gmail.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Though it's very unlikely that someone would actually want to
set up a new kaserver installation, if we have documentation for
it, it ought to at least do what it claims to do.
Thus, change kinit to klog where it was intended.
Reported by Karl-Philipp Richter.
FIXES 133043
Change-Id: I478a42931fa863c11b4acca7624bcabc14e561b1
Reviewed-on: https://gerrit.openafs.org/12286
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Changes to salvageserver logging in commit
24fed351fd
introduced a new bug in SalvageLogCleanup; the test for calloc() failure
was inadvertently inverted.
Fix the sense of the test.
Change-Id: Id0ee4ac3e60d7285163a9ab0b32bd7d48e570ac0
Reviewed-on: https://gerrit.openafs.org/12284
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
A typo in the recent logging changes for salvageserver
ad455347bc
caused a bad address to be passed to memset.
Correctly memset the log options as intended.
Change-Id: Ifef46defcc6da56df4e58f8ed9029717a77c0b39
Reviewed-on: https://gerrit.openafs.org/12282
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
In uvote_Init, we set ubik_lastYesTime to the current time just a few
lines before. It is important to set ubik_lastYesTime to the current
time, since that prevents us from voting for anyone in an ubik
election for at least BIGTIME seconds.
If we clear ubik_lastYesTime to 0, that means restarting a ubik server
could cause it to immediately start voting for a different site than
it was voting for before it started. This violates one of the ubik
invariants; as mentioned in the comments in SVOTE_Beacon, we cannot
promise sync site support to more than one site within BIGTIME
seconds. So initializing ubik_lastYesTime to 0 could cause two
different sites to be voted sync site simultaneously, if our restart
caused a premature change in vote.
Change-Id: I410fbefa8d699aac1c900d1fdd4e355b87917ad7
Reviewed-on: https://gerrit.openafs.org/12279
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Reviewed-by: Jeffrey Altman <jaltman@auristor.com>
Reviewed-by: Jeffrey Hutzelman <jhutz@cmu.edu>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Add the ability to specify a range of addresses in both NetInfo and
NetRestrict.
Change-Id: Iecdcca8587aa2e6e7cd56cbbebb63eb41b5d6f40
Reviewed-on: https://gerrit.openafs.org/11313
Reviewed-by: Daria Phoebe Brashear <shadow@your-file-system.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
These functions are used, so they should be in the library's
export list.
Even though no one should be using kauth anymore.
Change-Id: I3ad936c5b898f38194a461c7147792e2fe6f36b2
Reviewed-on: https://gerrit.openafs.org/11139
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Tested-by: Michael Meffie <mmeffie@sinenomine.net>
Now that support for linux 2.4 has been sunset, as of commit
ccf353ede6, it is no longer necessary to
put conditional compilation checks around the linux wait-for-completion
functions, which were introduced sometime during the linux 2.4 series
and have been available since.
Also, remove the remnant LINUX_COMPLETION_H_EXISTS autoconf macro, which
was removed from use in commit ef8bd5a29b.
Change-Id: Iea974236f73eef8c567a897d6a473254edf95379
Reviewed-on: https://gerrit.openafs.org/12278
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
The cell info setup was moved to the beginning of the startup sequence
and an unnecessary sleep commented out in the syscall in which the cell
info was set in commit 3fa5f389b2.
Clean up afs_call.c a bit by removing this commented out code.
Change-Id: I8ef0ddce4e1d327032b54ecebb48e9fdfe7767b4
Reviewed-on: https://gerrit.openafs.org/12277
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>