8901 Commits

Author SHA1 Message Date
Jeffrey Altman
15f3a7ed4e Windows: afslogon network provider debug registry value
create a new TransarcAFSDaemon\NetworkProvider "Debug" value
to be used for activating the network provider debugging.
The overlapping use of TransarcAFSDaemon\Parameters "TraceOption"
is just too confusing.

Permit both methods to be used.

Reviewed-on: http://gerrit.openafs.org/5316
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 3d4e111dd6c4201476e7447fdfaa27ed630032c5)

Change-Id: Ibc8b56d64aa843076b191afa42c4a3e93cf7a26f
Reviewed-on: http://gerrit.openafs.org/6802
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-02-25 22:30:24 -08:00
Jeffrey Altman
91f98123ae Windows: afslogon.dll is not a file system interface
Do not return a file system network type that corresponds
to a real file system interface since afslogon is in fact not
associated with a file system interface.  We can't return
WNNC_NET_NONE (0) because that prevents NPLogonNotify()
from being executed.  However, if we return an in use
file system value that can confuse the system when the
actual file system's network provider is also installed.

Reviewed-on: http://gerrit.openafs.org/5313
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 9052974812e33d186613c31e318673f9268467c6)

Change-Id: I60bc66440b548c3901914df8193c3999c3388abc
Reviewed-on: http://gerrit.openafs.org/6801
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-02-25 22:30:14 -08:00
Jeffrey Altman
1f5f00810a Windows: torture error reporting
When LeaveThread() is called and GetLastError() has already
been called, pass the last error value to LeaveThread().  Otherwise,
the GetLastError() call in LeaveThread() may return an inaccurate
result.
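
As a portable analogue of the problem described above (not the Windows
torture-test code itself), the sketch below uses errno in place of
GetLastError(). The names failing_op, noisy_cleanup, leave_thread_reread
and leave_thread_with are hypothetical and exist only for illustration.

```c
#include <errno.h>

/* Hypothetical operation that fails and records the reason in errno. */
int failing_op(void)
{
    errno = ENOENT;
    return -1;
}

/* Hypothetical cleanup step that overwrites errno as a side effect. */
void noisy_cleanup(void)
{
    errno = EINTR;
}

/* Fragile variant: re-reads the error state after cleanup has run. */
int leave_thread_reread(void)
{
    noisy_cleanup();
    return errno;
}

/* Robust variant: the caller hands over the value it already captured. */
int leave_thread_with(int saved_error)
{
    noisy_cleanup();
    return saved_error;
}

/* Captures errno immediately after the failure and passes it along. */
int report_captured(void)
{
    int saved;

    if (failing_op() == 0)
        return 0;
    saved = errno;
    return leave_thread_with(saved);
}

/* Leaves it to the exit routine to look up the error later. */
int report_reread(void)
{
    if (failing_op() == 0)
        return 0;
    return leave_thread_reread();
}
```

In this sketch report_captured() reports the original failure code, while
report_reread() reports whatever the cleanup step left behind.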

Reviewed-on: http://gerrit.openafs.org/5312
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 209d59a61ba9614a8b9d231d828f74a3e9bdaa27)

Change-Id: I8f1b5b6431bad4413e7d81c95835ed852fbba16f
Reviewed-on: http://gerrit.openafs.org/6800
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-02-25 22:30:02 -08:00
Jeffrey Altman
d1630cabac Windows: change buf_Find*() signature to accept cm_fid_t
The buf_Find*() functions require a cm_fid_t to match with the
cm_buf_t objects not a cm_scache_t.  Change the signature so
that the cm_scache_t is not required.  It should be possible to
search for a buffer even if the cm_scache_t is not present in
the cache.

Reviewed-on: http://gerrit.openafs.org/5304
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@openafs.org>
Tested-by: Jeffrey Altman <jaltman@openafs.org>
(cherry picked from commit c23b27a69322f4c9963a532d5cbcb136b23bb20c)

Change-Id: Ie4546de582e0ffe9103f1bb01e05cf387265da49
Reviewed-on: http://gerrit.openafs.org/6799
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-02-25 22:29:08 -08:00
Jeffrey Altman
d3360cbbe6 Windows: do not drop lock unnecessarily
do not drop cm_serverLock for a cm_PutServer call since
it will only reacquire it.  use cm_PutServerNoLock() instead.

Reviewed-on: http://gerrit.openafs.org/5302
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@openafs.org>
Tested-by: Jeffrey Altman <jaltman@openafs.org>
(cherry picked from commit b804e027f1a9d8dfaad3d348390a83493b53a6c7)

Change-Id: Ic9c4f1550636555568e3c67b2bb5f9e772116e9f
Reviewed-on: http://gerrit.openafs.org/6798
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-02-25 22:28:47 -08:00
Jeffrey Altman
73ce89e84f Windows: cm_serverLock read required not write
Reviewed-on: http://gerrit.openafs.org/5301
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@openafs.org>
Tested-by: Jeffrey Altman <jaltman@openafs.org>
(cherry picked from commit bca64c70467afd00ca02290a4236bc295ec4633c)

Change-Id: I9c0c04ce619f2f85ae821621f9468715ba7deefe
Reviewed-on: http://gerrit.openafs.org/6797
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-02-25 22:28:34 -08:00
Jeffrey Altman
3abed6bdbb Windows: be explicit when mapping sharing violation
Only one lock acquisition failure should be mapping to
CM_ERROR_SHARING_VIOLATION.  That is CM_ERROR_LOCK_NOT_GRANTED.
Make it clear that is what we are doing.
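
A minimal sketch of an explicit mapping of this kind. The two error names
come from the message above; the numeric values, CM_ERROR_OTHER_FAILURE and
map_lock_error are placeholders invented for illustration and are not the
actual definitions.

```c
/* Placeholder values; only the first two names appear in the message above. */
enum cm_error_sketch {
    CM_ERROR_LOCK_NOT_GRANTED  = 1,
    CM_ERROR_SHARING_VIOLATION = 2,
    CM_ERROR_OTHER_FAILURE     = 3   /* hypothetical unrelated failure */
};

/* Map exactly one lock failure to a sharing violation; pass the rest through. */
int map_lock_error(int code)
{
    if (code == CM_ERROR_LOCK_NOT_GRANTED)
        return CM_ERROR_SHARING_VIOLATION;
    return code;
}
```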

Reviewed-on: http://gerrit.openafs.org/5299
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@openafs.org>
Tested-by: Jeffrey Altman <jaltman@openafs.org>
(cherry picked from commit a576ff1e53a539e88b9f3fa6b8132d4f161b0bd4)

Change-Id: I558c6989a2a8f4042129e2a60bcd340a7863222c
Reviewed-on: http://gerrit.openafs.org/6796
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-02-25 22:28:14 -08:00
Jeffrey Altman
4078a38995 Windows: avoid duplicate volume update queries
If multiple volume update queries have stacked up in
cm_UpdateVolumeLocation() and the active query failed,
do not re-issue the blocked queries.  Instead, prevent new
queries for 60 seconds and fail those that blocked during
the active query.
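
The hold-off described above could be sketched roughly as follows. This is
not the cm_UpdateVolumeLocation() code; struct vol_update_state,
record_result and may_query are hypothetical names, and only the 60 second
interval is taken from the message.

```c
#include <time.h>

#define RETRY_HOLDOFF_SECONDS 60    /* interval named in the message above */

/* Hypothetical per-volume bookkeeping. */
struct vol_update_state {
    int    last_failed;     /* nonzero if the most recent query failed */
    time_t last_failure;    /* when that failure was recorded */
};

/* Record the outcome of the active query. */
void record_result(struct vol_update_state *s, int ok, time_t now)
{
    s->last_failed = !ok;
    if (!ok)
        s->last_failure = now;
}

/* Returns 1 if a new query may be issued at `now`, 0 if it should fail fast. */
int may_query(const struct vol_update_state *s, time_t now)
{
    if (!s->last_failed)
        return 1;
    return (now - s->last_failure) >= RETRY_HOLDOFF_SECONDS;
}
```

Callers that were blocked behind a failed query would check may_query() on
wake-up and return the failure instead of repeating the same request.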

Reviewed-on: http://gerrit.openafs.org/5296
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@openafs.org>
Tested-by: Jeffrey Altman <jaltman@openafs.org>
(cherry picked from commit 21acdd92c8510a9f99243588388a2a1078547533)

Change-Id: I7f0bc97ca7c194624ac854558bbed6b93a13ce63
Reviewed-on: http://gerrit.openafs.org/6795
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-02-25 22:28:04 -08:00
Jeffrey Altman
36120d812b Windows: fix condition calls to osi_Log
The osi_Log macro is if(foo) osi_AddLog()

If osi_Log macros will be conditionally called, the conditional
needs to have bracing.
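
To illustrate why the bracing matters, here is a small sketch using a macro
of the same shape. LOG, add_log and log_enabled are hypothetical stand-ins,
not the real osi_Log definitions. A compiler may warn about the ambiguous
else in the unbraced variant, which is the point of the example.

```c
static int log_enabled = 0;    /* hypothetical "tracing is on" switch */
static int log_calls = 0;

void add_log(const char *msg)
{
    (void)msg;
    log_calls++;
}

/* Same shape as the macro described above: a bare `if` with no braces. */
#define LOG(msg) if (log_enabled) add_log(msg)

/* Unbraced caller: the `else` pairs with the `if` hidden inside LOG. */
int unbraced(int cond)
{
    int fallback_ran = 0;

    if (cond)
        LOG("cond is true");
    else
        fallback_ran = 1;
    return fallback_ran;
}

/* Braced caller: the `else` belongs to the outer `if`, as intended. */
int braced(int cond)
{
    int fallback_ran = 0;

    if (cond) {
        LOG("cond is true");
    } else {
        fallback_ran = 1;
    }
    return fallback_ran;
}
```

With logging disabled, unbraced(1) runs the fallback branch even though the
condition is true, and unbraced(0) skips it even though the condition is
false. The braced version behaves as written.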

Reviewed-on: http://gerrit.openafs.org/5160
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@openafs.org>
Tested-by: Jeffrey Altman <jaltman@openafs.org>
(cherry picked from commit 4e42d6fd18097d0c8d2e4b455d3c540743d7dbda)

Change-Id: Ic8063144a5069736c95a57965a28d6a101749b3e
Reviewed-on: http://gerrit.openafs.org/6794
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-02-25 22:27:55 -08:00
Andrew Deason
344f7fc608 Rewrite make_h_tree.pl in shell script
The current usage of make_h_tree.pl adds a build requirement of
/usr/bin/perl that we did not have prior to commit
1d6593e952ce82c778b1cd6e40c6e22ec756daf1. Do the same thing in a
bourne shell script instead, so we don't need perl.

Note that this is not as generalized as make_h_tree.pl, but it doesn't
need to be. Specifically, this does not strip a leading ../ from found
include directives (nothing in the tree that includes h/* files uses
this), and header filenames containing whitespace almost certainly do
not work correctly.

The h => sys mapping is also much more hardcoded, but that's all we
were using this for anyway.

Reviewed-on: http://gerrit.openafs.org/6790
Reviewed-by: Russ Allbery <rra@stanford.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit fb03b1380f82a6bdc8a78ad92069da38b4e98c26)

Change-Id: If6bfedea0b563dce6135fbf2f4554ee602ee822c
Reviewed-on: http://gerrit.openafs.org/6793
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-02-25 05:10:32 -08:00
Andrew Deason
bea2a94610 salvager: Do not abort on large volume IDs
We have already checked that 'vid' is valid; no need to check if it is
negative. Also change vid to be a VolumeId.
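
A sketch of the failure mode, assuming volume IDs are unsigned 32-bit
values. The typedef and both helper functions below are illustrative and are
not copied from the salvager; the out-of-range conversion to a signed type
is implementation-defined in C but wraps to a negative value on common
platforms.

```c
#include <stdint.h>

typedef uint32_t VolumeId;   /* assumption: an unsigned 32-bit ID type */

/* Old-style check: the ID is held in a signed int, so IDs of 2^31 and
 * above look negative and would be rejected. */
int signed_check_rejects(uint32_t raw_id)
{
    int32_t vid = (int32_t)raw_id;

    return vid < 0;
}

/* With an unsigned type there is no negative range to test for; this
 * hypothetical validity check only rejects zero. */
int unsigned_check_accepts(uint32_t raw_id)
{
    VolumeId vid = raw_id;

    return vid != 0;
}
```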

This is partially cherry-picked from
0884e9d0fddf2be81abf6468209048331efa8a1e. The commit
4d691ae10903e01db4d6b24a4eb02da536cadf7c is comprised of changes from
both ce5e263b488f8cb85662031ee08eea448dab2d27 and
0884e9d0fddf2be81abf6468209048331efa8a1e, but it missed a few things
from 0884e9d0fddf2be81abf6468209048331efa8a1e. This commit brings in
the rest of the changes from 0884e9d0fddf2be81abf6468209048331efa8a1e.

Change-Id: I8e001bfe81128b2e2214b3b2fa83e4797374022b
Reviewed-on: http://gerrit.openafs.org/6778
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-02-23 12:16:34 -08:00
Derrick Brashear
8a0bc72693 libafs: retry retriable RPCs instead of abandoning
if we get e.g. an idle dead error we should retry
retriable actions, namely data stores. in order
for writing files to work correctly given how
the writeback code is structured it's important that
this not interfere with analyze's shouldRetry decision
on those RPCs

Reviewed-on: http://gerrit.openafs.org/6749
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 75a3dabe66a9fbc232b05e2f744ad5b867e18262)

Change-Id: I9c611eeb9a71298e9725268392cdf94074324bf1
Reviewed-on: http://gerrit.openafs.org/6777
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-02-23 10:01:08 -08:00
Derrick Brashear
fcdd20e389 libafs: ensure one nat ping connection per srvAddr
track the natping conn with the srvAddr, ensuring exactly one.

Reviewed-on: http://gerrit.openafs.org/6706
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 2378895fc66a19a050f302711f2e18dbbf2e3d6f)

Change-Id: I5e74ec3f46f9af335653b6910d2c31c788181c5c
Reviewed-on: http://gerrit.openafs.org/6772
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-02-23 09:58:44 -08:00
Marc Dionne
3cc6da964e volser: Remove unused variable
tid is now unused - remove it to avoid a warning.

Reviewed-on: http://gerrit.openafs.org/6743
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit fd19b39b151e3dddd18b4280252ac3e0fdf3964d)

Change-Id: Ib402c84689d61baefed3b76138f7fac7c2b36de0
Reviewed-on: http://gerrit.openafs.org/6771
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-02-23 09:36:04 -08:00
Andrew Deason
a498e1ff26 viced: Relax "h_TossStuff_r failed" warnings
Currently, h_TossStuff_r bails out and logs a message if we detect
that somebody grabbed a reference or locked the host while we tried to
h_NBLock_r. The reasoning for this is that it is not legal for anyone
to h_Hold_r a host that has HOSTDELETED set (but the error is
detectable and recoverable); callers are supposed to check for
HOSTDELETED and not hold a host in that case.

However, HOSTDELETED may not be set when h_TossStuff_r is called,
since we call it if either HOSTDELETED _or_ CLIENTDELETED are set. If
CLIENTDELETED is set and HOSTDELETED is not, it's perfectly fine (and
necessary) for callers to grab a reference to the host. So, if that's
what is going on, don't log a message, since that's normal behavior.

Check for HOSTDELETED before we h_NBLock_r, since it is technically
possible (and legal) for someone to grab a reference to the host and
somehow set HOSTDELETED while we wait for h_NBLock_r to return. Also
log the flags when we see this message.

Reviewed-on: http://gerrit.openafs.org/6733
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit fe4e52655ce7e5a8e5f6c23cde678fc66c3db490)

Change-Id: Ic1b72c808aec158d99f088a3144e86adf969efcc
Reviewed-on: http://gerrit.openafs.org/6770
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-02-23 09:35:19 -08:00
Andrew Deason
d68f9d8342 viced: Remove extraneous h_AHTAHT_r in h_GetHost_r
We added this address to the host with an addInterfaceAddr_r call just
a few lines before, which adds the host to the address hash table.
Another call to h_AddHostToAddrHashTable_r is pure overhead and
confusing.

Reviewed-on: http://gerrit.openafs.org/6729
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit f52c33ea10de8d1d07a9c4805366283e6ca635dc)

Change-Id: Ib97718a42f9997a1fa257533296c62f3d618e1a7
Reviewed-on: http://gerrit.openafs.org/6769
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-02-23 09:35:08 -08:00
Andrew Deason
dbaaa4c8f9 viced: Set h_GetHost_r probefail if MPAA_r fails
Currently, in h_GetHost_r, if we get a connection whose address does
not match an extant host, but the reported uuid does, we ProbeUuid the
old host. If it fails, we call MultiProbeAlternateAddress_r and set
'probefail'. Later on, if 'probefail' is set, we always add the
connection address to the host, and remove the host->host,host->port
address from the host.

However, this is not always correct. Consider the following situation.

We have an existing host that has primary address 1.1.1.1, and also
has addresses 1.1.1.2 and 1.1.1.3 on the interface list but not on the
hash table. Say that host A stops responding on 1.1.1.1, and a
connection comes in from 1.1.1.2. We ProbeUuid 1.1.1.1 and get a
failure, so we call MultiProbeAlternateAddress_r.
MultiProbeAlternateAddress_r probes via rx_Multi the addresses 1.1.1.2
and 1.1.1.3. Say that 1.1.1.3 responds first, and responds
successfully, so MultiProbeAlternateAddress_r sets 1.1.1.3 to be the
primary address for the host.

After MultiProbeAlternateAddress_r returns, 'probefail' is set. A few
lines down, we see that oldHost->host does not match haddr, and
'probefail' is set, so we add 1.1.1.2 to the interface list, and
remove 1.1.1.3, and set 1.1.1.2 to be the primary address, even though
1.1.1.3 is the address we most recently 'know' is correct.

To fix this, only set 'probefail' if MultiProbeAlternateAddress_r also
fails after the failed ProbeUuid call. Conceptually this makes sense,
since if MultiProbeAlternateAddress_r succeeds, it found an address
that responds successfully to ProbeUuid, and it sets that address to
be the primary address. Therefore, after MultiProbeAlternateAddress_r
returns success, the situation is the same as if the 'good' address
was already the primary address, and the ProbeUuid call succeeded, so
'probefail' should be cleared.
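
The decision described above can be reduced to the following sketch. The
two functions and their integer result codes (0 meaning the probe
succeeded) are hypothetical simplifications, not the h_GetHost_r code.

```c
/* Old behaviour: any ProbeUuid failure sets probefail, whatever the
 * alternate-address probe reported. */
int decide_probefail_old(int probe_uuid_rc, int multi_probe_rc)
{
    (void)multi_probe_rc;
    return probe_uuid_rc != 0;
}

/* New behaviour: probefail only if the alternate-address probe failed too. */
int decide_probefail_new(int probe_uuid_rc, int multi_probe_rc)
{
    if (probe_uuid_rc == 0)
        return 0;                  /* primary address answered */
    return multi_probe_rc != 0;    /* no alternate address answered either */
}
```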

Reviewed-on: http://gerrit.openafs.org/6728
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit 3c803580bb503c7650f7b138c1b3f2eafd92b985)

Change-Id: I6554688447e7e62874e45a00a4c1faf957e29cb6
Reviewed-on: http://gerrit.openafs.org/6768
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-02-23 09:34:56 -08:00
Andrew Deason
f44fb4f5f8 viced: Correctly update addrs on alt addr probe
The functions MultiBreakCallBackAlternateAddress_r and
MultiProbeAlternateAddress_r try to find a valid address in a host's
interface list of addrs. If they find one, they update host->host and
host->port. However, they do so just by changing those fields directly
and by calling h_DeleteHostFromAddrHashTable_r and
h_AddHostToAddrHashTable_r. This leaves the old host->host, host->port
on the interface list, and leaves it marked as 'valid'. Similarly, the
new host and port may still be marked as not 'valid'.

This can result in the host being on the addr hash table via an
address that is not on the host's interface list. After the above
situation occurs, we may call

  removeInterfaceAddr_r(host, host->host, host->port);

and then update host->host and host->port, which happens in a variety
of places. Since host->host, host->port is not marked as valid in the
interface list, it is not removed from the addr hash table, but it is
removed from the interface list. Eventually, this can cause the host
to be referenced from the addr hash table even after it has been
freed.

Since this can result in hash table entries pointing to the 'wrong'
host, this can result in FileLog messages such as:

Sun Feb  5 03:16:35 2012 Removing address that does not belong to host 0xdeadbeefdead (1.2.3.4:7001).

And bogus instances of the message:

Sun Feb  5 03:16:36 2012 CB: new identity for host 0xdeadbeefdead (1.2.3.4:7001), deleting(1 baadcafe 12345678-9abc-def0-12-34-456789abcdef fedcba98-76543210f-ed-cb-a9876543210f)

To fix this, make MultiBreakCallBackAlternateAddress_r and
MultiProbeAlternateAddress_r update the address list the same way as
all of the code in host.c does; by adding the new address with
addInterfaceAddr_r, removing the old address with removeInterfaceAddr_r, and
updating host->host and host->port.

Reviewed-on: http://gerrit.openafs.org/6727
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit 7a6efc9bfcd955901d19274cc96f9a1b67f54f95)

Change-Id: I3bf82f116bc2dd979e1e93cea58a4c74b0a2023d
Reviewed-on: http://gerrit.openafs.org/6767
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-02-23 09:34:43 -08:00
Andrew Deason
b6ec630bda viced: Delete dup host before probing old host
Currently, when the fileserver gets a new connection from an address
not on the addr hash table, we allocate a new host structure and add
that host to the addr hash table. If we then find that that host's
uuid matches the uuid of an extant host, we do the following:

 - probe the old host with the uuid, and MultiProbeAlternateAddress_r
   if the probe fails

 - mark the duplicate host as HOSTDELETED

 - manipulate the interface lists

Consider, for example, that we have an extant host ('oldHost') with
address 1.2.3.4:7001, but with 5.6.7.8:7001 on its alternate interface
list. At some point, the 1.2.3.4:7001 interface goes away or becomes
unreachable. A new connection comes in from that same host on
5.6.7.8:7001.

What will happen is we create a new host for address 5.6.7.8:7001, and
then detect the uuid collision. When we try to probe the old address
of 1.2.3.4:7001, it will fail, and we will try to
MultiProbeAlternateAddress_r. MultiProbeAlternateAddress_r will
determine that the alternate address 5.6.7.8:7001 responds
successfully to the probe, and it tries to set 5.6.7.8:7001 to be the
primary address of 'oldHost', and add 'oldHost' to the addr hash table
under 5.6.7.8:7001.

But the "new" host from the incoming connection is already hashed on
the address hash table under 5.6.7.8:7001, so the
h_AddHostToAddrHashTable_r call in MultiProbeAlternateAddress_r fails.
Since we later delete the new duplicate host, this results in
5.6.7.8:7001 being the primary address for the host, but that address
is not anywhere in the address hash table.

This behavior can be seen by the following pair of FileLog messages:

Wed Feb  1 11:02:38 2012 CB: ProbeUuid for 0xdeadbeefdead (1.2.3.4:7001) failed -01
Wed Feb  1 11:02:38 2012 h_AddHostToAddrHashTable_r: refusing to hash host beefdead, baadcafe (5.6.7.8:7001) already hashed

While those messages do not necessarily indicate this problem, this
problem will result in those messages.

To fix this, mark the duplicate host as HOSTDELETED before we do any
probing on 'oldHost'. This way, if MultiProbeAlternateAddress_r tries
to add 'oldHost' to the addr hash table under 5.6.7.8:7001, it will be
able to do so successfully, since the old duplicate host is deleted.
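
The ordering issue can be sketched with a toy address table. Everything
here is a simplified assumption: struct host, struct addr_table, table_add
and merge_duplicate are hypothetical, the flag value is illustrative, and
distinct addresses that share a slot are not handled.

```c
#include <stddef.h>

#define HOSTDELETED 0x1    /* flag name from the message; value is illustrative */
#define TABLE_SLOTS 4

struct host {
    unsigned flags;
    unsigned addr;
};

struct addr_table {
    struct host *slot[TABLE_SLOTS];
};

/* Hash `h` under `addr`.  Refuses if a different, live host already owns
 * the slot.  Returns 1 on success, 0 on refusal. */
int table_add(struct addr_table *t, struct host *h, unsigned addr)
{
    size_t i = addr % TABLE_SLOTS;
    struct host *cur = t->slot[i];

    if (cur != NULL && cur != h && !(cur->flags & HOSTDELETED))
        return 0;
    t->slot[i] = h;
    h->addr = addr;
    return 1;
}

/* Returns 1 if old_host ends up hashed under new_addr.  The table_add call
 * stands in for what the alternate-address probe does on success. */
int merge_duplicate(struct addr_table *t, struct host *old_host,
                    struct host *dup, unsigned new_addr, int delete_first)
{
    int hashed;

    if (delete_first)
        dup->flags |= HOSTDELETED;
    hashed = table_add(t, old_host, new_addr);
    dup->flags |= HOSTDELETED;    /* the duplicate is deleted either way */
    return hashed;
}
```

When the duplicate is marked deleted only afterwards, the hashing step is
refused and the old host is left with a primary address that is not in the
table. Marking it deleted first lets the step succeed.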

Reviewed-on: http://gerrit.openafs.org/6726
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit 9754c4e15fb9073ed9f95d5d4242d311eb65d717)

Change-Id: I35d41c91e496086377065f862021a5bb3fd221ef
Reviewed-on: http://gerrit.openafs.org/6766
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-02-23 09:34:33 -08:00
Derrick Brashear
8b25f9cc99 vos: allow releases without offline time
allow releases using dumps to clones to avoid offline time

Reviewed-on: http://gerrit.openafs.org/6254
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 13a4f2b18bb84d05773529a794371d29f64570ab)

Change-Id: Iec0f2d882dc2ac9a11ed4ca282cb2424db052803
Reviewed-on: http://gerrit.openafs.org/6765
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-02-23 09:34:18 -08:00
Derrick Brashear
f4e73067cd vos: refactor code
change vos to remove lots of duplicated code for volume deletes and clones

Reviewed-on: http://gerrit.openafs.org/6253
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 8d618dceeefacbeb37c4ef3b1f9a8e80552311aa)

Change-Id: I2c26dce796f93c8c993148a94d21dce8608e8c43
Reviewed-on: http://gerrit.openafs.org/6764
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-02-23 09:34:09 -08:00
Andrew Deason
390d0108b3 Rx: Avoid lastBusy/PEER_BUSY discrepancy
If an rx call has the RX_CALL_PEER_BUSY flag set, but the call's
conn->lastBusy is not set, we can easily cause an rx caller to loop
infinitely. rx_NewCall will see that lastBusy for a call channel is
not set, and will use that call channel, but rxi_CheckBusy will note
that the call appears busy and that there are non-busy call channels
on the same conn, and so will return RX_CALL_BUSY.

This can currently happen in rxi_ResetCall, since we set
RX_CALL_PEER_BUSY on the call again if the call had that flag set when
rxi_ResetCall was called. If we are calling rxi_ResetCall with
'newcall' set, the passed in call is unrelated to the new call, since
it was obtained from the free list. Thus, the busy-ness of the call
should be ignored. Fix this by only paying attention to the incoming
RX_CALL_PEER_BUSY flag if 'newcall' is not set.

Also prevent this from happening by clearing RX_CALL_PEER_BUSY in
rx_NewCall when we select a call and clear lastBusy for that call.
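
A reduced sketch of the reset rule described above. struct call and
reset_call are hypothetical simplifications of the Rx structures, and the
flag value is illustrative.

```c
#define RX_CALL_PEER_BUSY 0x1    /* flag name from the message; value is illustrative */

struct call {
    unsigned flags;
};

/* Reset a call's flags.  `newcall` is nonzero when the structure was just
 * taken from the free list and therefore describes an unrelated call. */
void reset_call(struct call *c, int newcall)
{
    unsigned keep = 0;

    if (!newcall)
        keep = c->flags & RX_CALL_PEER_BUSY;   /* busy state of an existing call */
    c->flags = keep;
}
```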

Reviewed-on: http://gerrit.openafs.org/6707
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 2a4c6c3b9e1dc30d5599e67e02237a1aeef8a0f0)

Change-Id: I60d76469bc3dcf764e67524f39b3c55894e7ce99
Reviewed-on: http://gerrit.openafs.org/6763
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-02-23 09:33:19 -08:00
Derrick Brashear
7a42c8f7ec vol: allow clones of readonly volumes
allow writing of data where it's not user data we're changing
(e.g. allow a vnode to be marked cloned in the vnode index)

Reviewed-on: http://gerrit.openafs.org/6251
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 4b93c42513785d1094c5336b5c9cc4add1b89c5e)

Change-Id: I9849897ae69a426026f6d030ca4e50e8cd7066b2
Reviewed-on: http://gerrit.openafs.org/6762
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-02-23 09:33:09 -08:00
Derrick Brashear
2766185772 volser: allow clonevol purge id to be new id
effectively the same functionality that reclone already uses, but
for some reason we artificially limit it out of clone despite
the interface being there for it. it used to be there. put it back.

Reviewed-on: http://gerrit.openafs.org/6250
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 641c67473615e80cfb8cf1e67636a82e42e5c899)

Change-Id: I31df948a21639bd93c573c77207f0f6c9e43deed
Reviewed-on: http://gerrit.openafs.org/6761
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-02-23 09:33:01 -08:00
Derrick Brashear
6b66b3b705 volser: allow cloning non-rw volumes
remove EROFS error which is the only thing preventing a working clone
on a non-RW.

Reviewed-on: http://gerrit.openafs.org/6249
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Tested-by: Derrick Brashear <shadow@dementix.org>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit f1de04f3b35e91923efddca57e744b2138619223)

Change-Id: Ieb02a2d2c4d59681f5d6f372c7cd77a181d214dd
Reviewed-on: http://gerrit.openafs.org/6760
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-02-23 09:32:51 -08:00
Derrick Brashear
94e836ede7 libafs: kill rxevent daemon even in upcall mode
the switch from rxk listener env to upcall env could leave the event
daemon running. fix that.

Reviewed-on: http://gerrit.openafs.org/6713
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit a4d9fbaa8036cc78ae0119330314f6deab159c90)

Change-Id: I2e87c692ee2003a24590f700accc30704899db8b
Reviewed-on: http://gerrit.openafs.org/6759
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-02-23 09:32:42 -08:00
Ken Dreyer
d12024581e doc: refer to aklog instead of klog
klog (and kaserver) is deprecated. In generic examples, refer to the Kerberos
5 equivalents.

Reviewed-on: http://gerrit.openafs.org/6721
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 07d9b18e36fff6fc96c629ac2bebe8bb43f6b9dd)

Change-Id: I3e00b5d6acbdae35ac9ea645f094ebe46d391776
Reviewed-on: http://gerrit.openafs.org/6758
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-02-22 18:21:25 -08:00
Andrew Deason
887cffa07e RedHat: Fail openafs-client 'stop' on rmmod error
Currently, the openafs-client RPM init script ignores any error
reported by rmmod. If 'umount /afs' succeeds but rmmod does not, the
client may panic the machine if the client is started again (from e.g.
running the 'restart' init script method), since afsd will try to
initialize AFS with a libafs that has been shut down.

So, do not ignore errors from 'rmmod', and instead fail the 'stop'
method from the init script if we get an error.

Reviewed-on: http://gerrit.openafs.org/6709
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 12e2a3abe7ca640a7cef2630039c06964f779f17)

Change-Id: I31256abac839c9011754445efa09960f061fdbb0
Reviewed-on: http://gerrit.openafs.org/6757
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-02-22 17:56:24 -08:00
Jeffrey Altman
145ca8e5a7 doc: fix AdminGuide
The AdminGuide was broken by e99224f2fe049bc339e87c8b6c195de67dca2f08.

Reviewed-on: http://gerrit.openafs.org/6703
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
(cherry picked from commit aaab21e7a123ce701a8d5b2144032739fe177d6f)

Change-Id: I350186c617b3b39829c9af1ff6a4aa2835abbdc2
Reviewed-on: http://gerrit.openafs.org/6756
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-02-22 17:46:37 -08:00
Ken Dreyer
ba04fc5858 doc: add section on direct volume access
Provide examples of the direct volume access syntax, using the
fictitious example.com cell.

Reviewed-on: http://gerrit.openafs.org/6691
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
(cherry picked from commit e99224f2fe049bc339e87c8b6c195de67dca2f08)

Change-Id: I5b2ac3b6f255d5918eeea4a63d4c7bb6164961d5
Reviewed-on: http://gerrit.openafs.org/6755
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-02-22 17:46:04 -08:00
Andrew Deason
26a22e65ea viced: Keep H_LOCK while locking host in h_Alloc_r
Currently in h_Alloc_r, we h_Lock_r the host, so we have it locked on
return. However, h_Lock_r drops the host glock, which is bad in this
situation since we have already added the host to the global hash
table, so other threads may see it. This can mean that by the time
h_Alloc_r returns, the returned host may have HOSTDELETED set, and/or
the addresses associated with the host may be completely different.

h_Alloc_r's caller, h_GetHost_r, seems to assume that the host is
still associated with the address of the passed-in connection. When
this is not true, this can result in the host structure getting into a
strange state, such as the primary addr/port not being hashed. The
host may also have HOSTDELETED set, in which case we're not supposed
to be dealing with it at all.

To avoid these problems, lock host->lock directly in h_Alloc_r,
without going through h_Lock_r and dropping H_LOCK. Also do it as one
of the first things we do to initialize the host, just to make sure
that if anybody else happens to see the host, it is locked by us when
they do.

Reviewed-on: http://gerrit.openafs.org/6389
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit d6f977830c164ee079c68101595c28ff1847f88f)

Change-Id: Ib0916f3a92c4a34555ee3fa2880dec10041bf047
Reviewed-on: http://gerrit.openafs.org/6754
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-02-22 17:45:54 -08:00
Marc Dionne
3edb993a8d viced: Allow null host for BreakCallBack
For replication writes at the remote site, we will want to call
this without a host structure.

Reviewed-on: http://gerrit.openafs.org/6674
Reviewed-by: Simon Wilkinson <simonxwilkinson@gmail.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 01301d0a5323a836efaae30cac325c25f6a7577a)

Change-Id: I1fb0dff655515fedd7dfb41139f1fb6c85599377
Reviewed-on: http://gerrit.openafs.org/6753
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-02-22 17:45:40 -08:00
Tom Keiser
c0184656b6 com_err: correctly deal with lack of libintl
On machines lacking a libintl, _intlize() currently fails to initialize
the output error string--leading to tools (e.g., translate_et) returning
a null string; make afs_com_err fall back to returning the en/US canonical
error text when we don't have any i18n support...

Reviewed-on: http://gerrit.openafs.org/6638
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit ef63547e955edc60e2d074ef825b091e1c43882e)

Change-Id: Id138e48826aa855bd87e47f201ed6840399aa640
Reviewed-on: http://gerrit.openafs.org/6752
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-02-22 17:45:15 -08:00
Christof Hanke
e0e23418bb linux: fix probing for noop_fsync
Commit 267934d0e6910c8d8166a6e78f93c1bab40857b8 introduced
probing code to deal with the renaming of simple_fsync
inside the Linux kernel.
This test does not take into account the different parameter
lists of noop_fsync and simple_fsync.
Fix this.

Reviewed-on: http://gerrit.openafs.org/6628
Reviewed-by: Marc Dionne <marc.c.dionne@gmail.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 20e82cecd9008f9b3467c9a323c5c3abf27f3021)

Change-Id: I478a1ea15150ca120c8f85e9696d8bdedfc974d1
Reviewed-on: http://gerrit.openafs.org/6751
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-02-22 17:44:52 -08:00
Jeffrey Altman
4e1813248e viced: lockcount only valid if not expired
locks are issued on a lease.  If the lock is expired, the lock
count is zero.
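
A minimal sketch of the rule described above, assuming a simplified lock
record. The struct and function names here are hypothetical and are not
the actual viced data structures; they only illustrate that a count is
meaningful solely while the lease is still live.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical, simplified stand-in for a file lock record. */
struct sketch_lock {
    time_t expires;   /* absolute time at which the lease ends */
    int    count;     /* holders recorded while the lease was live */
};

/* Report the lock count, treating an expired lease as "no locks held". */
static int
sketch_effective_lock_count(const struct sketch_lock *lock, time_t now)
{
    if (lock == NULL || lock->expires <= now)
        return 0;
    return lock->count;
}
```

Callers that report or act on a lock count would go through a helper
like this rather than reading the stored count directly.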

Reviewed-on: http://gerrit.openafs.org/6740
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
(cherry picked from commit 4603057d99a1501275f14f6d5aba089364785e09)

Change-Id: I784bdccae6d5fb01c76590ccd34fb9efa417747e
Reviewed-on: http://gerrit.openafs.org/6750
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-02-22 17:44:38 -08:00
Andrew Deason
0329984e29 Disable kernel opt by default on Solaris 10 and 11
With newer Solaris Studio (sometime in the 12.* series), cc started
adding SSE instructions to optimized x86 code, which is invalid for
kernel code and can generate panics. There appears to be no way to
turn this off currently (-xvector=%none is non-functional), so default
to not optimizing kernel code.

Reviewed-on: http://gerrit.openafs.org/6671
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 80592c53cbb0bce782eb39a5e64860786654be9f)

Change-Id: If1539dd88d4d28771a7eafcdaff30a75cb230917
Reviewed-on: http://gerrit.openafs.org/6683
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-02-08 13:51:21 -08:00
Andrew Deason
72d2ef7fa2 SOLARIS: Use kcred instead of afs_osi_cred
For many vfs ops to the cache, we currently pass &afs_osi_cred for our
credentials, which is a mostly zeroed-out credential structure. In
some modern versions of Solaris (Solaris 11), at least some parts of
this structure need to not be NULL (cr_zone), or we will panic.

The Solaris kernel provides a 'kcred' credentials structure for the
purpose of using "kernel" credentials for i/o. So just use that
instead for Solaris 8 and beyond, since kcred has existed at least
since Solaris 8.

Reviewed-on: http://gerrit.openafs.org/6669
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit dc6beb3ea29a64bcf59807fd451a573aa54e1122)

Change-Id: I6fd0ce4a890c2e6d9377cad39f47303aa1687a6b
Reviewed-on: http://gerrit.openafs.org/6682
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-02-08 13:51:12 -08:00
Andrew Deason
f0e648bfa3 afs: Panic on afs_conn refcount imbalance
An undercounted afs_conn can easily cause a panic and/or memory
corruption later on, since we put an rx_connection reference with each
afs_conn reference. Panic as soon as we detect this, as this indicates
a serious bug.

Reviewed-on: http://gerrit.openafs.org/6413
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 8a574ba16a80fc2b8b703ddcfc99486b977e6071)

Change-Id: Ibd60dafdf1a800349b73754dae18666fa0edd300
Reviewed-on: http://gerrit.openafs.org/6642
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-02-07 06:39:06 -08:00
Michael Meffie
bd5acb7f79 Unix CM: reset blacklist on hard-mount retry
Reset black-listed servers on a request when retrying due to a
hard-mount retry. When hard-mounts are in effect, a request may
retry indefinitely. If all the servers have been black-listed
due to a transient error, the request may never complete.

Reviewed-on: http://gerrit.openafs.org/6330
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit faa58c9f60a158481bdfee27e175a37c5fcd64aa)

Change-Id: I1ecc3fa78c064c46849dec47c77f2fc405f2ee7f
Reviewed-on: http://gerrit.openafs.org/6641
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Tested-by: Derrick Brashear <shadow@dementix.org>
2012-02-07 06:38:28 -08:00
Jonathan Billings
b49edd57c1 Linux: rpm: Update openafs.spec.in to include changes to installed files
Pull up some more of 3f7d8ec219e1aa04b6c0417ecf5e730d40b4f149 to
handle changes that have made it into 1.6 since the last pullup:

* Exclude the aklog_dynamic_auth man page, since it is AIX-only
* Add new files that have appeared in the distribution, such as the
'afsio' binary.

Reviewed-on: http://gerrit.openafs.org/4814
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementia.org>

(cherry-picked from 3f7d8ec219e1aa04b6c0417ecf5e730d40b4f149)

Change-Id: Ib702f39d930057d92eca4d157fddb633cccf9fae
Reviewed-on: http://gerrit.openafs.org/6640
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
2012-02-05 10:07:35 -08:00
Andrew Deason
1bbb4cdb55 SOLARIS: Do not build x86 kernel module on 5.11
Oracle Solaris 11 no longer supports 32-bit x86 (amd64 is required). If we
try to build the x86 module, /usr/include/sys/kobj.h complains that
the ISA is unsupported, and refuses to go on. So, just remove
MODLOAD32 from the libafs directories to build on sunx86_511.

Reviewed-on: http://gerrit.openafs.org/5835
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit c6a22d67ff9787ace2249d528eb9db99c5b19427)

Change-Id: I00f9f19653a2f98276c236d7e2331bc81f7c4f13
Reviewed-on: http://gerrit.openafs.org/6643
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
2012-02-05 10:06:55 -08:00
Derrick Brashear
b9faf2ecb1 make openafs 1.6.1pre2
prerelease for 1.6.1

Change-Id: I3dbef9e4d360314cd4c789268d7b0d5c5011f6fc
Reviewed-on: http://gerrit.openafs.org/6614
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
openafs-stable-1_6_1pre2
2012-01-23 18:50:17 -08:00
Derrick Brashear
88afb6b0b1 viced: disable rx keepalives during disk io
when we are going to hit the backend storage, disable keepalives.
the net effect of this is that no idle dead time is needed; instead,
the normal dead time will result in a connection with no activity
simply dying naturally if i/o blocks forever.

it's important that keepalives be enabled during callback breaks,
so that is done.

(cherry picked from commit 05f3a0d1e0359f604cc6162708f3f381eabcd1d7)
Reviewed-on: http://gerrit.openafs.org/6515
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Change-Id: If2ee7f3ad7f2dc835dd350bb9558fde0aa179240
Reviewed-on: http://gerrit.openafs.org/6613
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-01-23 18:49:54 -08:00
Jeffrey Altman
0b872544d0 rx: RX_CALL_IDLE and RX_CALL_BUSY
Allocate new Rx error codes for Idle and Busy calls but do not
send these errors on the wire.  They are only intended for local
use.

RX_CALL_IDLE is an indication to an application that requests it
that the rx peer is maintaining an open call channel but has not
sent any actual data for the length of the registered idle dead
timeout.

RX_CALL_BUSY is an indication to an application that requests it
that the rx peer believes the selected call channel is in use by
a pre-existing call.

When either RX_CALL_IDLE or RX_CALL_BUSY is assigned as the call
error and an abort must be sent to the rx peer, the errors are
translated to RX_CALL_TIMEOUT.  This is necessary because it is
not possible to add new Rx error values in a method that is safe
for peers that are not expecting them.

This patchset also documents which Rx errors defined in rx.h are
used on the wire and which are not.

The Unix and Windows cache managers are updated to build with
these new error codes.
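
The translation described above can be sketched as follows. The
constant names and numeric values are placeholders chosen for
illustration (the real definitions live in rx.h), and
sketch_wire_error is a hypothetical helper, not a function from the Rx
library.

```c
/* Illustrative values only; these are not the constants from rx.h. */
enum {
    SKETCH_CALL_TIMEOUT = -3,    /* may be sent on the wire */
    SKETCH_CALL_IDLE    = -100,  /* local use only */
    SKETCH_CALL_BUSY    = -101   /* local use only */
};

/* Map a locally assigned call error to the value that may be sent
 * to the peer in an abort. Local-only errors become a timeout. */
static int
sketch_wire_error(int local_error)
{
    switch (local_error) {
    case SKETCH_CALL_IDLE:
    case SKETCH_CALL_BUSY:
        return SKETCH_CALL_TIMEOUT;
    default:
        return local_error;
    }
}
```

Keeping the mapping in one place means the application still sees the
more specific local error while the peer only ever sees a code it
already understands.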

Reviewed-on: http://gerrit.openafs.org/6128
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
(cherry picked from commit c7673f4fad8e8b9390564e3cbfa11d5f1b52ba2f)
Change-Id: I4c7d6733ddae03bda5a31fe4486ada090dcfd0b3
Reviewed-on: http://gerrit.openafs.org/6612
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-01-23 18:49:44 -08:00
Andrew Deason
afce83a967 RX: Avoid timing out non-kernel busy channels
When we encounter a "busy" call channel (indicated by receiving
RX_PACKET_TYPE_BUSY packets), we can error out a call with
RX_CALL_TIMEOUT to try and get the application code to retry the call.
However, many RX applications are not aware of this, and will just
fail with an error upon receiving a single busy packet.

So instead, make this behavior optional, and only do it if the
application tells us what specific error it expects to receive when a
busy call channel is detected. Enable this behavior for the Unix cache
manager, as it can cope with receiving an RX_CALL_TIMEOUT error in
this scenario.

(cherry picked from commit eddcee3ad518dff9fbfda790640c5bfd2e97ef5a)
Reviewed-on: http://gerrit.openafs.org/4159
Reviewed-by: Jeffrey Altman <jaltman@openafs.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementia.org>

Change-Id: I3938e79ab009f14f5421a4a45e2a099276c49f24
Reviewed-on: http://gerrit.openafs.org/6611
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-01-23 18:49:34 -08:00
Derrick Brashear
05fb252977 libafs: add replicated connection pool
keep pool of connections to use for replicated volumes,
so we can have a separate idle time setting

(cherry picked from commit cd1f72649650404581cfcdcf3beeeaf2bb960bd6)
Reviewed-on: http://gerrit.openafs.org/6546
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

Change-Id: I056ba28d11313c9925df63869e0c55a1a4f132da
Reviewed-on: http://gerrit.openafs.org/6610
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-01-23 18:49:19 -08:00
Andrew Deason
428400fb83 vol: remove SYNC fatal_error processing
Currently SYNC clients will "disable" themselves on certain error
patterns. For example, if the server end closes its file descriptor
too many times, or takes too long and then closes the fd, the SYNC
client will return an error and set fatal_error. On any subsequent
SYNC requests, the request will immediately fail without contacting
the server, often making SYNC client programs effectively useless
until they are restarted.

There isn't really any reason to cause future requests to fail.
Transient problems in the fileserver can easily make this situation
possible (e.g. a fileserver can crash but still take several minutes
to close the SYNC fd while the core is written to disk), and so while
we may return an error for a specific problematic request, future
requests may be fine.

So, just remove everything related to fatal_error, so future SYNC
requests can continue to be attempted. Adjust some log messages to
reflect the new behavior.

Reviewed-on: http://gerrit.openafs.org/6548
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 40bf6dee2409197f7494c3d09bf2dea7c248d185)

Change-Id: I0f7a1792afd1ace3beabe238107d0a5069ccbb44
Reviewed-on: http://gerrit.openafs.org/6609
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-01-23 18:30:24 -08:00
Marc Dionne
339438c847 rx: Correctly test for end of call queue
The intention of this condition is to check if the current call
being considered is the last one on the queue, but the test is
incorrect.  A null next pointer indicates a removed item, not
the end of the queue.

Use the queue_IsLast macro instead to correctly determine that
this is the last item in the queue and that a call has to be
selected, either the current one or a previously seen good choice.

The incorrect test can cause calls to get permanently stuck in the
call queue and never get assigned to a thread, even when all threads
are idle.
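
To illustrate the distinction, here is a small sketch of a circular
queue with a sentinel head, assuming that removal clears the item's
next pointer. The sketch_* names are hypothetical and only model the
idea behind queue_IsLast; they are not the rx queue macros themselves.

```c
#include <stddef.h>

/* Circular doubly linked queue; the head is a sentinel element. */
struct sketch_queue {
    struct sketch_queue *prev;
    struct sketch_queue *next;
};

static void
sketch_queue_init(struct sketch_queue *head)
{
    head->prev = head;
    head->next = head;
}

static void
sketch_queue_append(struct sketch_queue *head, struct sketch_queue *item)
{
    item->prev = head->prev;
    item->next = head;
    head->prev->next = item;
    head->prev = item;
}

/* Unlink an item; a NULL next pointer afterwards marks it as removed. */
static void
sketch_queue_remove(struct sketch_queue *item)
{
    item->prev->next = item->next;
    item->next->prev = item->prev;
    item->prev = NULL;
    item->next = NULL;
}

/* The last item is the one whose next pointer is the head, never NULL. */
static int
sketch_queue_is_last(const struct sketch_queue *head,
                     const struct sketch_queue *item)
{
    return item->next == head;
}
```

In such a queue, a test of the form "next is NULL" never fires for an
item that is still queued, so a loop relying on it would never
recognize the end of the queue.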

Reviewed-on: http://gerrit.openafs.org/6564
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 6ad3d646e62801cb81a3c9efeac320daa44936e1)

Change-Id: Ic9d0ff51c79115960ebb4634fc35a5e9da21c380
Reviewed-on: http://gerrit.openafs.org/6570
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-01-23 07:26:04 -08:00
Marc Dionne
39f0511911 Linux: use standard macro for set_nlink configure test
A generic macro exists to test for functions in the kernel, use
it for set_nlink.

Reviewed-on: http://gerrit.openafs.org/6566
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit 64bd0b728ca95ba7bb4f1fdd909386fde3ce81e1)

Change-Id: I93d169bec8f476d5e692f7f5a7fe31002af7ce1e
Reviewed-on: http://gerrit.openafs.org/6569
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-01-23 07:25:27 -08:00
Andrew Deason
e6ddb2a188 vol: Fix VCreateVolume special inode cleanup
In order to dec the relevant special inodes, we need to know the
parent vol id in addition to the vol id itself. Use the appropriate
volume IDs when IH_DEC'ing special inodes after we fail to create the
volume, so we don't leave behind special inodes.

(cherry picked from commit 627cfb1d4e7b32b4342c59b162a36ba9beb8a066)
Reviewed-on: http://gerrit.openafs.org/6529
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

Change-Id: I9f40f170cd6a0fffe2e17fc199af99e087066902
Reviewed-on: http://gerrit.openafs.org/6550
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-01-15 21:16:59 -08:00