If an rx call has the RX_CALL_PEER_BUSY flag set, but the call's
conn->lastBusy is not set, we can easily cause an rx caller to loop
infinitely. rx_NewCall will see that lastBusy for a call channel is
not set, and will use that call channel, but rxi_CheckBusy will note
that the call appears busy and that there are non-busy call channels
on the same conn, and so will return RX_CALL_BUSY.
This can currently happen in rxi_ResetCall, since we set
RX_CALL_PEER_BUSY on the call again if the call had that flag set when
rxi_ResetCall was called. If we are calling rxi_ResetCall with
'newcall' set, the passed in call is unrelated to the new call, since
it was obtained from the free list. Thus, the busy-ness of the call
should be ignored. Fix this by only paying attention to the incoming
RX_CALL_PEER_BUSY flag if 'newcall' is not set.
Also prevent this from happening by clearing RX_CALL_PEER_BUSY in
rx_NewCall when we select a call and clear lastBusy for that call.
Reviewed-on: http://gerrit.openafs.org/6707
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 2a4c6c3b9e1dc30d5599e67e02237a1aeef8a0f0)
Change-Id: I60d76469bc3dcf764e67524f39b3c55894e7ce99
Reviewed-on: http://gerrit.openafs.org/6763
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
allow writing of data where it's not user data we're changing
(e.g. allow a vnode to be marked cloned in the vnode index)
Reviewed-on: http://gerrit.openafs.org/6251
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 4b93c42513785d1094c5336b5c9cc4add1b89c5e)
Change-Id: I9849897ae69a426026f6d030ca4e50e8cd7066b2
Reviewed-on: http://gerrit.openafs.org/6762
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
effectively the same functionality that reclone already uses, but
for some reason we artificially limit it out of clone despite
the interface being there for it. it used to be there. put it back.
Reviewed-on: http://gerrit.openafs.org/6250
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 641c67473615e80cfb8cf1e67636a82e42e5c899)
Change-Id: I31df948a21639bd93c573c77207f0f6c9e43deed
Reviewed-on: http://gerrit.openafs.org/6761
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Currently, the openafs-client RPM init script ignores any error
reported by rmmod. If 'umount /afs' succeeds but rmmod does not, the
client may panic the machine if the client is started again (from e.g.
running the 'restart' init script method), since afsd will try to
initialize AFS with a libafs that has been shut down.
So, do not ignore errors from 'rmmod', and instead fail the 'stop'
method from the init script if we get an error.
Reviewed-on: http://gerrit.openafs.org/6709
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 12e2a3abe7ca640a7cef2630039c06964f779f17)
Change-Id: I31256abac839c9011754445efa09960f061fdbb0
Reviewed-on: http://gerrit.openafs.org/6757
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Currently in h_Alloc_r, we h_Lock_r the host, so we have it locked on
return. However, h_Lock_r drops the host glock, which is bad in this
situation since we have already added the host to the global hash
table, so other threads may see it. This can mean that by the time
h_Alloc_r returns, the returned host may have HOSTDELETED set, and/or
the addresses associated with the host may be completely different.
h_Alloc_r's caller, h_GetHost_r, seems to assume that the host is
still associated with the address of the passed-in connection. When
this is not true, this can result in the host structure getting into a
strange state, such as the primary addr/port may not be hashed. The
host may also have HOSTDELETED set, in which case we're not supposed
to be dealing with it at all.
To avoid these problems, lock host->lock directly in h_Alloc_r,
without going through h_Lock_r and dropping H_LOCK. Also do it as one
of the first things we do to initialize the host, just to make sure
that if anybody else happens to see the host, it is locked by us when
they do.
Reviewed-on: http://gerrit.openafs.org/6389
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit d6f977830c164ee079c68101595c28ff1847f88f)
Change-Id: Ib0916f3a92c4a34555ee3fa2880dec10041bf047
Reviewed-on: http://gerrit.openafs.org/6754
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
For replication writes at the remote site, we will want to call
this without a host structure.
Reviewed-on: http://gerrit.openafs.org/6674
Reviewed-by: Simon Wilkinson <simonxwilkinson@gmail.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 01301d0a5323a836efaae30cac325c25f6a7577a)
Change-Id: I1fb0dff655515fedd7dfb41139f1fb6c85599377
Reviewed-on: http://gerrit.openafs.org/6753
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
On machines lacking a libintl, _intlize() currently fails to initialize
the output error string--leading to tools (e.g., translate_et) returning
a null string; make afs_com_err fall back to returning the en/US canonical
error text when we don't have any i18n support...
Reviewed-on: http://gerrit.openafs.org/6638
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit ef63547e955edc60e2d074ef825b091e1c43882e)
Change-Id: Id138e48826aa855bd87e47f201ed6840399aa640
Reviewed-on: http://gerrit.openafs.org/6752
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Commit 267934d0e6910c8d8166a6e78f93c1bab40857b8 introduced
probing code to deal with the renameing of simple_fsync
inside the linux-kernel.
This test does not take different parameter-lists
for noop_fsync or simple_fsync resp. into account.
Fix this.
Reviewed-on: http://gerrit.openafs.org/6628
Reviewed-by: Marc Dionne <marc.c.dionne@gmail.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 20e82cecd9008f9b3467c9a323c5c3abf27f3021)
Change-Id: I478a1ea15150ca120c8f85e9696d8bdedfc974d1
Reviewed-on: http://gerrit.openafs.org/6751
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
With newer Solaris Studio (sometime in the 12.* series), cc started
adding SSE instructions to optimized x86 code, which is invalid for
kernel code and can generate panics. There appears to be no way to
turn this off currently (-xvector=%none is non-functional), so default
to not optimizing kernel code.
Reviewed-on: http://gerrit.openafs.org/6671
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 80592c53cbb0bce782eb39a5e64860786654be9f)
Change-Id: If1539dd88d4d28771a7eafcdaff30a75cb230917
Reviewed-on: http://gerrit.openafs.org/6683
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
For many vfs ops to the cache, we currently pass &afs_osi_cred for our
credentials, which is a mostly zeroed-out credential structure. In
some modern versions of Solaris (Solaris 11), at least some parts of
this structure need to not be NULL (cr_zone), or we will panic.
The Solaris kernel provides a 'kcred' credentials structure for the
purpose of using "kernel" credentials for i/o. So just use that
instead for Solaris 8 and beyond, since kcred has existed at least
since Solaris 8.
Reviewed-on: http://gerrit.openafs.org/6669
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit dc6beb3ea29a64bcf59807fd451a573aa54e1122)
Change-Id: I6fd0ce4a890c2e6d9377cad39f47303aa1687a6b
Reviewed-on: http://gerrit.openafs.org/6682
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
An undercounted afs_conn can easily cause a panic and/or memory
corruption later on, since we put an rx_connection reference with each
afs_conn reference. Panic as soon as we detect this, as this indicates
a serious bug.
Reviewed-on: http://gerrit.openafs.org/6413
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 8a574ba16a80fc2b8b703ddcfc99486b977e6071)
Change-Id: Ibd60dafdf1a800349b73754dae18666fa0edd300
Reviewed-on: http://gerrit.openafs.org/6642
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Reset black-listed servers on a request when retrying due to a
hard-mount retry. When hard-mounts are in effect, a request may
retry indefinitely. If all the servers have been black-listed
due to a transient error, the request may never complete.
Reviewed-on: http://gerrit.openafs.org/6330
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit faa58c9f60a158481bdfee27e175a37c5fcd64aa)
Change-Id: I1ecc3fa78c064c46849dec47c77f2fc405f2ee7f
Reviewed-on: http://gerrit.openafs.org/6641
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Tested-by: Derrick Brashear <shadow@dementix.org>
Pull up some more of 3f7d8ec219e1aa04b6c0417ecf5e730d40b4f149 to
handle changes that have made it into 1.6 since the last pullup:
* Exclude the aklog_dynamic_auth man page, since it is AIX-only
* Add new files that have appeared in the distribution, such as the
'afsio' binary.
Reviewed-on: http://gerrit.openafs.org/4814
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
(cherry-picked from 3f7d8ec219e1aa04b6c0417ecf5e730d40b4f149)
Change-Id: Ib702f39d930057d92eca4d157fddb633cccf9fae
Reviewed-on: http://gerrit.openafs.org/6640
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Oracle Solaris 11 no longer supports x86 (amd64 is required). If we
try to build the x86 module, /usr/include/sys/kobj.h complains that
the ISA is unsupported, and refuses to go on. So, just remove
MODLOAD32 from the libafs directories to build on sunx86_511.
Reviewed-on: http://gerrit.openafs.org/5835
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit c6a22d67ff9787ace2249d528eb9db99c5b19427)
Change-Id: I00f9f19653a2f98276c236d7e2331bc81f7c4f13
Reviewed-on: http://gerrit.openafs.org/6643
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
when we are going to hit the backend storage, disable keepalives.
the net effect of this is that no idle dead time is needed; instead,
the normal dead time will result in a connection with no activity
simply dying naturally if i/o blocks forever.
it's important that keepalives be enabled during callback breaks,
so that is done.
(cherry picked from commit 05f3a0d1e0359f604cc6162708f3f381eabcd1d7)
Reviewed-on: http://gerrit.openafs.org/6515
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Change-Id: If2ee7f3ad7f2dc835dd350bb9558fde0aa179240
Reviewed-on: http://gerrit.openafs.org/6613
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Allocate new Rx error codes for Idle and Busy calls but do not
send these errors on the wire. They are only intended for local
use.
RX_CALL_IDLE is an indication to an application that requests it
that the rx peer is maintaining an open call channel but has not
sent any actual data for the length of the registered idle dead
timeout.
RX_CALL_BUSY is an indication to an application that requests it
that the rx peer believes the selected call channel is in use by
a pre-existing call.
When either RX_CALL_IDLE or RX_CALL_BUSY are assigned as the call
error and an abort must be sent to the rx peer, the errors are
translated to RX_CALL_TIMEOUT. This is necessary because it is
not possible to add new Rx error values in a method that is safe
for peers that are not expecting them.
This patchset also documents which Rx errors defined in rx.h are
used on the wire and which are not.
The Unix and Windows cache managers are updated to build with
these new error codes.
eviewed-on: http://gerrit.openafs.org/6128
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
(cherry picked from commit c7673f4fad8e8b9390564e3cbfa11d5f1b52ba2f)
Change-Id: I4c7d6733ddae03bda5a31fe4486ada090dcfd0b3
Reviewed-on: http://gerrit.openafs.org/6612
Reviewed-by: Derrick Brashear <shadow@dementix.org>
When we encounter a "busy" call channel (indicated by receiving
RX_PACKET_TYPE_BUSY packets), we can error out a call with
RX_CALL_TIMEOUT to try and get the application code to retry the call.
However, many RX applications are not aware of this, and will just
fail with an error upon receiving a single busy packet.
So instead, make this behavior optional, and only do it if the
application tells us what specific error it expects to receive when a
busy call channel is detected. Enable this behavior for the Unix cache
manager, as it can cope with receiving an RX_CALL_TIMEOUT error in
this scenario.
(cherry picked from commit eddcee3ad518dff9fbfda790640c5bfd2e97ef5a)
Reviewed-on: http://gerrit.openafs.org/4159
Reviewed-by: Jeffrey Altman <jaltman@openafs.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
Change-Id: I3938e79ab009f14f5421a4a45e2a099276c49f24
Reviewed-on: http://gerrit.openafs.org/6611
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
keep pool of connections to use for replicated volumes,
so we can have a separate idle time setting
(cherry picked from commit cd1f72649650404581cfcdcf3beeeaf2bb960bd6)
Reviewed-on: http://gerrit.openafs.org/6546
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Change-Id: I056ba28d11313c9925df63869e0c55a1a4f132da
Reviewed-on: http://gerrit.openafs.org/6610
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Currently SYNC clients will "disable" themselves on certain error
patterns. For example, if the server end closes its file descriptor
too many times, or takes too long and then closes the fd, the SYNC
client will return an error and set fatal_error. On any subsequent
SYNC requests, the request will immediately fail without contacting
the server, often making SYNC client programs effectively useless
until they are restarted.
There isn't really any reason to cause future requests to fail.
Transient problems in the fileserver can easily make this situation
possible (e.g. a fileserver can crash but still take several minutes
to close the SYNC fd while the core is written to disk), and so while
we may return an error for a specific problematic request, future
requests may be fine.
So, just remove everything related to fatal_error, so future SYNC
requests can continue to be attempted. Adjust some log messages to
reflect the new behavior.
Reviewed-on: http://gerrit.openafs.org/6548
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 40bf6dee2409197f7494c3d09bf2dea7c248d185)
Change-Id: I0f7a1792afd1ace3beabe238107d0a5069ccbb44
Reviewed-on: http://gerrit.openafs.org/6609
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
The intention of this condition is to check if the current call
being considered is the last one on the queue, but the test is
incorrect. A null next pointer indicates a removed item, not
the end of the queue.
Use the queue_IsLast macro instead to correctly determine that
this is the last item in the queue and that a call has to be
selected, either the current one or a previously seen good choice.
This can cause calls to get permanently stuck in the call queue
and never get assigned to a thread, even when all threads are
idle.
Reviewed-on: http://gerrit.openafs.org/6564
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 6ad3d646e62801cb81a3c9efeac320daa44936e1)
Change-Id: Ic9d0ff51c79115960ebb4634fc35a5e9da21c380
Reviewed-on: http://gerrit.openafs.org/6570
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
A generic macro exists to test for functions in the kernel, use
it for set_nlink.
Reviewed-on: http://gerrit.openafs.org/6566
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit 64bd0b728ca95ba7bb4f1fdd909386fde3ce81e1)
Change-Id: I93d169bec8f476d5e692f7f5a7fe31002af7ce1e
Reviewed-on: http://gerrit.openafs.org/6569
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
In order to dec the relevant special inodes, we need to know the
parent vol id in addition to the vol id itself. Use the appropriate
volume IDs when IH_DEC'ing special inodes after we fail to create the
volume, so we don't leave behind special inodes.
(cherry picked from commit 627cfb1d4e7b32b4342c59b162a36ba9beb8a066)
Reviewed-on: http://gerrit.openafs.org/6529
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Change-Id: I9f40f170cd6a0fffe2e17fc199af99e087066902
Reviewed-on: http://gerrit.openafs.org/6550
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
we get a conn, check it for eligibility, and if not,
just abandon it. "oops"
(cherry picked from commit 26fc0cda94c24a1c5f0bef109bca920456c25265)
Change-Id: I8e4f762b5170f07d6abc3508e88f001ca147c3a7
Reviewed-on: http://gerrit.openafs.org/6521
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
libjafs is surprisingly close to being buildable. Fix a few misc
things which have bitrotted over the years so it is possible to
actually build:
- Add -I$SRC/config to the cflags, so we can include afsconfig.h
- Remove references to the nonexistant rxkstats.o
- Do not link with UAFS' AFS_component_version_number.o, since this
gives us duplicate version number symbols
- Include afs_vosAdmin.h in Group.c, to satisfy some missing symbols
Reviewed-on: http://gerrit.openafs.org/6524
Reviewed-by: Steven Jenkins <steven@synaptian.com>
Tested-by: Steven Jenkins <steven@synaptian.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 967d7201ee5c27db6d75d5efafcad9458e2b5167)
Change-Id: I0cb510e3f115c2c35f06cf9cbddaf31835704eea
Reviewed-on: http://gerrit.openafs.org/6527
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
A few changes to allow a "make all ; sudo make install ; make all..."
workflow to work without manually removing files in between.
Make the rebuilding of the h directory dependent on the source
files scanned to build it. This prevents it from being rebuilt
for every "make install".
While we're here, use -f when removing linktest for the clean target.
This allows "make clean" to remove it without prompting when the user
doesn't have write access to the file, as is the case when make install
rebuilds it as root.
Reviewed-on: http://gerrit.openafs.org/6519
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 2caf0778ddeb6eeb854360cac20c6b3f0894f3eb)
Change-Id: Id4ccad953669538072b834a6aa49b8beaeeeed35
Reviewed-on: http://gerrit.openafs.org/6526
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
in the event we got a network error, we don't know if the server
completed (or will complete) our operation. we can assume nothing.
a more complicated version of this could attempt to verify that the
state is what we expect it to be, but in extended callbacks universe
this is potentially easier to solve anyway. for now, return the
error to the caller, and mark the vcache unstat'd.
(cherry picked from commit c2fc7e0f66621fc97f5b4dc389d379260638315c)
Change-Id: Ic38cf16e47664e6f36ad614735b42d3f4e5a6ce2
Reviewed-on: http://gerrit.openafs.org/6520
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
VGetFreeVnode_r pulls a vnode off of the vnode LRU, and removes the
vnode from the vnode hash table. In DAFS, we may drop the volume glock
immediately afterwards in order to close the ihandle for the old vnode
structure.
While we have the glock dropped, another thread may try to
VLookupVnode for the new vnode we are creating, find that it is not
hashed, and call VGetFreeVnode_r itself. This can result in two
threads having two separate copies of the same vnode, which bypasses
any mutual exclusion ensured by per-vnode locks, since they will lock
their own version of the vnode. This can result in a variety of
different problems where two threads try to write to the same vnode at
the same time. One example is calling CopyOnWrite on the same file in
parallel, which can cause link undercounts, writes to the wrong vnode
tag, and other CoW-related errors.
To prevent all this, make VGetFreeVnode_r atomically remove the old
vnode structure from the relevant hashes, and add it to the new hashes
before dropping the glock. This ensures that any other thread trying
to load the same vnode will see the new vnode in the hash table,
though it will not yet be valid until the vnode is loaded.
Note that this only solves this race for DAFS. For non-DAFS, the vol
glock is held over the ihandle close, so this race does not exist.
The comments around the callers of VGetFreeVnode_r indicate that
similar extant races exist here for non-DAFS, but they are unsolvable
without significant DAFS-like changes to the vnode package.
Reviewed-on: http://gerrit.openafs.org/6385
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 8e15e16c9e6a5768f31976cc21b48d5bb10617b7)
Change-Id: I915d18c4252b698f39fdf65793311a39381096b4
Reviewed-on: http://gerrit.openafs.org/6495
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
We can drop GLOCK in several places in afs_icl_Event4 and the
afs_icl_AppendRecord callee. To ensure that the given afs_icl_set does
not get freed while we have GLOCK dropped, grab a reference to the
set.
Thanks to Ryan C. Underwood for reporting an issue triggered by this.
Reviewed-on: http://gerrit.openafs.org/6431
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit 7461fa11939556d3b6f3ea38da7ff65607805579)
Change-Id: I7a33cf96d2031dd1798f7598918396eb8fbde611
Reviewed-on: http://gerrit.openafs.org/6494
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
The kernel space cm xstat time structures are implemented as 32
bit values in memory and on the wire. Define the client side
xstat userspace structures as 32 bit time values as well to avoid
size mismatches on systems with native 64 bit time values.
Reviewed-on: http://gerrit.openafs.org/5237
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@openafs.org>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 130144850c6d05bc69e06257a5d7219eb98697d8)
Change-Id: I8726efdd7123e9a1e0e4536bf2766c441964475d
Reviewed-on: http://gerrit.openafs.org/6386
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
it's actually important this be more than the rx call dead time
so timing out server callbacks to clients don't result in us idle deading
a call to the server when callbacks need to be broken
FIXES 130327
Reviewed-on: http://gerrit.openafs.org/6497
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 0f4da13137612a9b0c0c3b57aa939d6661fb67f8)
Change-Id: I181d89c36175be93ed59226b401d48903fb5f584
Reviewed-on: http://gerrit.openafs.org/6518
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Directory writes are synchronous, so this is fine. There's a
mostly-convenient function in fs/libfs.c that returns 0 that we can use
to do what we want ("mostly" because it was renamed in 2.6.35).
FIXES 130425
Reviewed-on: http://gerrit.openafs.org/6491
Reviewed-by: Simon Wilkinson <sxw@inf.ed.ac.uk>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 267934d0e6910c8d8166a6e78f93c1bab40857b8)
Change-Id: Iaeb8a699673b6144c186b470f6d877fb54f1e319
Reviewed-on: http://gerrit.openafs.org/6493
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
systemd is actually rather capable of leaving the OpenAFS client in an
incredibly broken state, thanks to its willingness to track services and
kill their processes. We should not attempt to restart the client on
upgrade, whether a normal upgrade or a migration from SysV initscripts.
In the former case, it's fine (and correct) for the old AFS to keep
running; in the latter case, the unit file is capable of correctly
shutting down an initscript-launched client. The same is true for the
OpenAFS server.
This brings the packaging in line with the SysV initscript code in the
specfile, which does not attempt to restart the service, as well as with
e.g. Debian's packaging, which uses --no-restart-on-upgrade.
While we're here, clean up a redundant BuildRequires on systemd-units.
Reviewed-on: http://gerrit.openafs.org/6247
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit dee93ff1d114da711df345e06b5e1a682c877315)
Change-Id: I4ecf3b2f307a81549e0bd568ab5e4585a2ef1f2d
Reviewed-on: http://gerrit.openafs.org/6492
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Clear dirHeader->hashTable via memset instead of via a loop. This is
more efficient, and avoids the loop getting optimized into an unusable
_memset call on recent versions of Solaris Studio when building for
the kernel.
Thanks to Jeff Blaine for reporting the issue with Solaris Studio.
Reviewed-on: http://gerrit.openafs.org/4829
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
(cherry picked from commit f091ace32e3045da396d577055dafd67888ff7ea)
Change-Id: Ia098730c3e83429ce4f886b1427159d13eff4c4e
Reviewed-on: http://gerrit.openafs.org/6414
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
afsconfig.h can define various preprocessor symbols that can affect
how system headers behave. For example, the presence of the
_POSIX_PTHREAD_SEMANTICS symbol changes the number of arguments to
getpwnam_r on at least Solaris 8. So, we must include afsconfig.h
before including anything else, to ensure consistency.
FIXES 130413
Reviewed-on: http://gerrit.openafs.org/6387
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
(cherry picked from commit 37f537a21db6d560dd16a53ff5e0d2f0456d4c48)
Change-Id: I64970fd06af9a13d91acaf03b80a2baf224754ff
Reviewed-on: http://gerrit.openafs.org/6388
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
aklog makes use of the setenv and unsetenv functions, which do not
exist (at least) on HP-UX earlier than around 11i v3, and do not exist
on Solaris earlier than Solaris 10. Add replacement functions for
setenv and unsetenv when they are not present. Note that these
implementations are copied from libroken, and setenv was modified to
not use asprintf.
This is 1.6-specific. On the master branch, libroken takes care of
these for us. On the master branch, setenv and unsetenv from libroken
were added in 70e8451acd0426024c152073e53bc6606e0189e1.
Change-Id: I35546f1add7f4f87c6ffc484059057825887499f
Reviewed-on: http://gerrit.openafs.org/6376
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Tested-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
afs_Analyze sets VHardMount on a volume struct when a hard-mount
scenario is encountered, and clears it after sleeping. However, if the
volume struct has VRecheck set, or if it's not in memory, afs_Analyze
cannot retrieve the volume struct in order to clear VHardMount again.
For the VRecheck case, this can results in VHardMount never getting
cleared, and so hard-mount messages for the volume seem to disappear.
So, clear VHardMount when we set VRecheck so this does not occur.
For the case where the volume struct is not in memory, this is not a
problem, since when we allocate a volume struct again, the VHardMount
state will not be retained.
Reviewed-on: http://gerrit.openafs.org/6335
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit f469be407789e696c0b9e9a431b4879798a00e2a)
Change-Id: If13769445f20336dfba9755f3af0a1499ce16a6d
Reviewed-on: http://gerrit.openafs.org/6348
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: Derrick Brashear <shadow@dementix.org>
Limit how often we log "hard-mount waiting for XXX" messages. Without
this, it is possible for a client with hard-mounts enabled to spam the
kernel log rather excessively (in extreme cases this can even panic
the machine on at least some Linux).
To keep things simple, just log approximately one message per volume
per hard-mount interval.
Reviewed-on: http://gerrit.openafs.org/5060
Tested-by: Derrick Brashear <shadow@dementia.org>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
(cherry picked from commit 530b5ecac51cc7ce61ccddd50868c632c4a47298)
Change-Id: I566aa3d411ff100ccc6afa9a5273fb84e6172dd0
Reviewed-on: http://gerrit.openafs.org/6347
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
VResort and VMoreReps are not referenced anywhere in the tree, so
remove their definitions.
Reviewed-on: http://gerrit.openafs.org/5059
Tested-by: Derrick Brashear <shadow@dementia.org>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
(cherry picked from commit 6cae7c554e917a26b197167e177bd3eb22bce71a)
Change-Id: I0a282dac3a9e31bff4ff37c61275cc7c08456cad
Reviewed-on: http://gerrit.openafs.org/6346
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>