Commit Graph

9905 Commits

Author SHA1 Message Date
Jeffrey Altman
865f2442e6 Windows: fix cm_DirOpDelBuffer assert
In cm_DirOpDelBuffer() the data version field for a buffer
in cm_dirOp_t.buffers[] can be CM_BUF_VERSION_BAD if the buffer
was added to the buffer list but was never fetched from the file
server.  If the buffer was recycled by buf_Get(), an attempt to
remove an entry from the directory fails instead of fetching the
buffer from the file server and performing the local removal.

Change-Id: Id9af5180f2176c2a90ef9907ae84139e66ffe5d6
Reviewed-on: http://gerrit.openafs.org/6650
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
2012-02-03 13:36:45 -08:00
Jeffrey Altman
25142a9c2d Windows: buffer DV ranges do not work for directories
In cm_MergeStatus, always set cm_scache_t.bufDataVersionLow
to the new data version because the cm_dir package does not
support version ranges.   All modified dir buffers have their
dataVersion field set to the current data version value.

Failure to update the bufDataVersionLow field can result in
B+ Trees being constructed from out of date directory information.

Change-Id: Ic6bb6f78275de9c6c7960f2fc7c06c507b1144c1
Reviewed-on: http://gerrit.openafs.org/6649
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
2012-02-03 13:36:33 -08:00
Jeffrey Altman
09ab91bf9d Windows: update btree debugging code
B+Tree key strings were changed to wchars for Unicode support,
but the debugging printf format patterns were not updated to match.
Do so now.

Change-Id: I70619d2e3fbc007f3f21eaf56cc5d61503203818
Reviewed-on: http://gerrit.openafs.org/6648
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
2012-02-03 13:36:20 -08:00
Jeffrey Altman
4224dc5c28 Windows: Do not open file if shutdown in progress
Perform the shutdown check earlier in AFSCommonCreate() to prevent
a request from being processed after the service indicates that
a shutdown has begun.

Change-Id: I8959141b5e2161ffe960e93a500b1153d9594a28
Reviewed-on: http://gerrit.openafs.org/6647
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
2012-02-03 13:36:08 -08:00
Jeffrey Altman
209df87d08 Windows: AFSRedir DebugFlags Turn on BugCheck
Turn on bug checking by default via the installation.
This permits sites to disable the functionality but will allow
us to capture more meaningful minidump output.

Change-Id: I62b6d0ce5deed2c8798c9afb09565a8846c32a8c
Reviewed-on: http://gerrit.openafs.org/6646
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
2012-02-03 13:35:45 -08:00
Jeffrey Altman
fe952116f3 Windows: Improve AFSNotifyDelete
Do not call AFSNotifyDelete after the reference count on the
DirEntry->ObjectInformation is given up.

Log the Parent FID and file name since that is what is passed
to the service to perform a delete.  Log the actual FID of the
object being deleted and not the address of the FID fields.

Change-Id: Ic02e2cec625258356d1b08e03a02a7a9c4eb4ce7
Reviewed-on: http://gerrit.openafs.org/6645
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
2012-02-03 13:35:24 -08:00
Jeffrey Altman
9a1d7518b6 Windows: do not lower case direct volume references
Not all volumes are lower case.  Do not lowercase the string.

Change-Id: Icb5f5ee9865bd856775486dffb1849f17f9b23f7
Reviewed-on: http://gerrit.openafs.org/6644
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
2012-02-03 13:34:58 -08:00
Tom Keiser
ef63547e95 com_err: correctly deal with lack of libintl
On machines lacking a libintl, _intlize() currently fails to initialize
the output error string--leading to tools (e.g., translate_et) returning
a null string; make afs_com_err fall back to returning the en/US canonical
error text when we don't have any i18n support...

Change-Id: I333745fb0a16e5bc9adb0755591d80de010d4d31
Reviewed-on: http://gerrit.openafs.org/6638
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-02-01 13:50:50 -08:00
Christof Hanke
20e82cecd9 linux: fix probing for noop_fsync
Commit 267934d0e6 introduced
probing code to deal with the renaming of simple_fsync
inside the Linux kernel.
This test does not take the different parameter lists
of noop_fsync and simple_fsync into account.
Fix this.

Change-Id: Ib490f0bb7e8098acc83fce001a43c08f478ad582
Reviewed-on: http://gerrit.openafs.org/6628
Reviewed-by: Marc Dionne <marc.c.dionne@gmail.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: Derrick Brashear <shadow@dementix.org>
2012-02-01 04:18:19 -08:00
Jeffrey Altman
87049b873b man-pages: add fs_getverify and fs_setverify
Add man pages for two new Windows-only commands:

  fs getverify
  fs setverify -verify {on, off}

Change-Id: Id784608fba35147a4e33f22e43c7cd50a2307b9e
Reviewed-on: http://gerrit.openafs.org/6632
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
2012-01-29 13:54:36 -08:00
Jeffrey Altman
7b3f5df6dc Windows: do not panic if afsredir not ready during shutdown
Change-Id: I0de6ad0f799e2acf1c02c6d53cfd9b1b437328fc
Reviewed-on: http://gerrit.openafs.org/6630
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
2012-01-29 12:41:24 -08:00
Jeffrey Altman
5e08628da2 Windows: Increase size of worker thread pools
The size of the afs redirector worker thread pools should be
made configurable but for now just increase the pool size to
be in parity with the default worker pool created by the
afsd service.

Change-Id: Ib3c9356783162620112041582fa3d9dbaf8fce37
Reviewed-on: http://gerrit.openafs.org/6627
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
2012-01-29 10:40:49 -08:00
Jeffrey Altman
0f65600b67 Windows: Run Workers until empty task queue
Do not allow a worker thread to sleep until the task queue is
empty.  It is better for the running thread to pick up and process
a task than to sleep this thread and wait for another one to wake
up to perform the work.

Change-Id: I776bb9408ab054b045acb9bc003b88436cc4266b
Reviewed-on: http://gerrit.openafs.org/6626
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
2012-01-29 10:40:10 -08:00
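The "run until the task queue is empty" behaviour of 0f65600b67 can be sketched in portable C as below. This is only an illustration: the `sketch_` types and functions are hypothetical names, not the redirector's, and all locking and event handling are left out so that the loop itself stays visible.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical task record: a callback plus its argument. */
struct sketch_task {
    struct sketch_task *next;
    void (*run)(void *arg);
    void *arg;
};

/* Simple FIFO; a real queue would be protected by a lock. */
struct sketch_task_queue {
    struct sketch_task *head;
    struct sketch_task *tail;
};

void sketch_enqueue(struct sketch_task_queue *q, struct sketch_task *t)
{
    t->next = NULL;
    if (q->tail != NULL)
        q->tail->next = t;
    else
        q->head = t;
    q->tail = t;
}

struct sketch_task *sketch_dequeue(struct sketch_task_queue *q)
{
    struct sketch_task *t = q->head;
    if (t != NULL) {
        q->head = t->next;
        if (q->head == NULL)
            q->tail = NULL;
    }
    return t;
}

/* One wake-up of a worker: keep taking tasks until none remain, and only
 * then return to the caller, which would go back to waiting on its event. */
size_t sketch_worker_drain(struct sketch_task_queue *q)
{
    assert(q != NULL);
    size_t done = 0;
    struct sketch_task *t;
    while ((t = sketch_dequeue(q)) != NULL) {
        t->run(t->arg);
        done++;
    }
    return done;
}

/* Example task body: bump an int counter passed through arg. */
void sketch_count(void *arg)
{
    (*(int *)arg)++;
}
```

A caller would enqueue a few tasks that use `sketch_count` and then call `sketch_worker_drain` once; the return value is the number of tasks the worker handled before it would have gone back to sleep.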
Jeffrey Altman
55af3387ef Windows: Release Notes for 1.7.5
Release notes updates for 1.7.5.

Change-Id: Ie44441150fc077cc4ca7924c67322a1aed4cb9af
Reviewed-on: http://gerrit.openafs.org/6624
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
2012-01-28 21:33:07 -08:00
Jeffrey Altman
de4d12dd53 Windows: Stop the thundering herd
The afs redirector used notification events to wake up worker
threads when a task was added to a work queue.  Notification
events when signalled wake up all threads instead of just one.

Instead, use synchronization events to wake up a single thread at
a time and restructure the code to permit workers to wake up
additional workers if there is additional work to be performed
or during library shutdown.

Thanks to Peter Scott for his assistance.

Change-Id: I0fb9d8578035f606f03170622fc9c50a1dbfee3a
Reviewed-on: http://gerrit.openafs.org/6595
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
2012-01-26 16:11:41 -08:00
Jeffrey Altman
1161d5fc3c Windows: DriveSubstitution handle too small buffer
If the buffer passed to DriveSubstitution is too small the
resulting file path will end up being truncated.  At the very
least log the fact that truncation is occurring.  In addition
return the fact that truncation occurred to the caller.

In NPGetUniversalName allocate a 4K buffer on the heap instead
of calculating a buffer based on the local name buffer size.
The local name buffer size has no relationship with the required
buffer size for the expanded unc or device path.

FIXES 130548

Change-Id: I86fbb9db4aa6a438dbb5e793678ec52283d5546b
Reviewed-on: http://gerrit.openafs.org/6618
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
2012-01-26 16:11:28 -08:00
Jeffrey Altman
3d10edc2d4 Windows: Invalidate all volumes at library init
The afsredirlib.sys library driver is unloaded when the afsd_service
stops and is reloaded when the afsd_service restarts.  During the
shutdown window any objects known to the kernel are preserved by
afsredir.sys.  When the afsd_service restarts, there are no valid
callbacks on any objects so the afsredirlib.sys must invalidate all
status info to permit the service to request a callback from the
file server on next use.

Change-Id: I3e8fa9513f435ff5cd1a8cfb8daa766aa30dd8c1
Reviewed-on: http://gerrit.openafs.org/6617
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
2012-01-26 16:11:18 -08:00
Jeffrey Altman
e44163a547 Windows: Refactor and consolidate afsredir invalidation
Invalidation requests were being processed in an inconsistent
manner because different rules were being applied to volume root
directories and to other objects, and depending on whether or not
the invalidation was a whole volume invalidation.

This patchset consolidates all invalidation logic for an object
in the new AFSInvalidateObject function.  AFSInvalidateObject
is then called from AFSInvalidateCache and AFSInvalidateVolume
as necessary.

AFSInvalidateVolume executes AFSInvalidateObject on all objects
in the volume object tree.  As a result, whole volume invalidations
whether triggered by the file server or "fs flushvolume" now work.

Change-Id: I83f110b0987eb153794b6803a1fe48247090277f
Reviewed-on: http://gerrit.openafs.org/6616
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
2012-01-26 16:11:08 -08:00
Marc Dionne
e0eb5405a8 vlserver: Consolidate VLDB entry server flag definitions
Group the definitions of server flags for VLDB entries in one place,
and rename VLSERVER_FLAG_UUID to make its name consistent with the
other flags.
This makes it easier to see the complete set of flags and avoid
conflicts.

Change-Id: I3b326e3d97bc297c0314cfc48f0a066c3ff0415e
Reviewed-on: http://gerrit.openafs.org/6615
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-01-23 18:49:05 -08:00
Simon Wilkinson
ca0fdd84a4 viced: Remove the LWP fileserver
*) Remove all LWP specific code from the fileserver, and make pthread
   the default
*) Build the pthreaded fileserver in the 'viced' directory, rather than
   in tviced
*) Move the DAFS specific files from tviced to viced (arguably, these
   should move into dviced, but there are currently no source files in
   that directory)
*) Remove tviced from the build

Change-Id: I6e186c9fad6d9dccd04cf1317a80c087587ef25f
Reviewed-on: http://gerrit.openafs.org/5816
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
2012-01-23 14:18:59 -08:00
Andrew Deason
40bf6dee24 vol: remove SYNC fatal_error processing
Currently SYNC clients will "disable" themselves on certain error
patterns. For example, if the server end closes its file descriptor
too many times, or takes too long and then closes the fd, the SYNC
client will return an error and set fatal_error. On any subsequent
SYNC requests, the request will immediately fail without contacting
the server, often making SYNC client programs effectively useless
until they are restarted.

There isn't really any reason to cause future requests to fail.
Transient problems in the fileserver can easily make this situation
possible (e.g. a fileserver can crash but still take several minutes
to close the SYNC fd while the core is written to disk), and so while
we may return an error for a specific problematic request, future
requests may be fine.

So, just remove everything related to fatal_error, so future SYNC
requests can continue to be attempted. Adjust some log messages to
reflect the new behavior.

Change-Id: I4b8bfe53f591a9e8541cd5a98c909208df5bcbac
Reviewed-on: http://gerrit.openafs.org/6548
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-01-23 07:28:03 -08:00
Derrick Brashear
cd1f726496 libafs: add replicated connection pool
keep pool of connections to use for replicated volumes,
so we can have a separate idle time setting

Change-Id: I61ed62c652c924b33fde920fac766c4ca0043826
Reviewed-on: http://gerrit.openafs.org/6546
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-01-23 07:23:16 -08:00
Jeffrey Altman
a9803ae643 Windows: make lock reader history debug only
The lock reader history on osi_rwlock is proving to be too
expensive.  Only use it for DEBUG builds.  Leave the data
structures the same so that DEBUG builds can be mixed with
a RELEASE build of afsd_service.exe.

Change-Id: If0eeddb63c8f9919cdb5e119f31cde77974447b6
Reviewed-on: http://gerrit.openafs.org/6559
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
2012-01-22 21:25:23 -08:00
Jeffrey Altman
dfd0c2acc1 Windows: store data verification mode
Over the lifetime of OpenAFS a number of bugs have been discovered
that can result in data corruption.  This new mode (Windows only)
will double check that the data received by the file server does
in fact match the data that was written by the cache manager.

After a successful StoreData and status merge but before the BIOD
is released, a fetchdata is issued to read the data written by the
cache manager.  If the data fails to match, the StoreData operation
is repeated.

Data verification mode can be queried with "fs getverify" and set
with "fs setverify {on, off}".  The default value can be set with
the TransarcAFSDaemon\Parameters DWORD "VerifyData" registry value.

By default verification is disabled.

Change-Id: Ic99c1692e6e78790e65ae600c3e428a79df59370
Reviewed-on: http://gerrit.openafs.org/6601
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
2012-01-22 21:25:09 -08:00
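The store, read back, compare, repeat cycle described in dfd0c2acc1 can be illustrated with a toy in-memory "server". Everything here is an assumption made for the sketch: the `sketch_` names, the fixed-size buffer, the way corruption is simulated, and the `max_attempts` bound, which the commit message does not mention.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX 64

/* Stand-in for a file server that can be told to damage the next few
 * stores, so the verification path has something to catch. */
struct sketch_server {
    unsigned char data[SKETCH_MAX];
    int corrupt_next;    /* number of upcoming stores to damage */
    int stores_seen;     /* how many store calls were made */
};

void sketch_store(struct sketch_server *s, const unsigned char *buf, size_t len)
{
    memcpy(s->data, buf, len);
    if (s->corrupt_next > 0 && len > 0) {
        s->data[0] ^= 0xFF;          /* simulate damage on the way in */
        s->corrupt_next--;
    }
    s->stores_seen++;
}

/* Store, read back, compare; repeat the store while the data differs.
 * The "fetch" is a direct read of the toy server's buffer. */
bool sketch_store_verified(struct sketch_server *s, const unsigned char *buf,
                           size_t len, int max_attempts)
{
    assert(s != NULL && buf != NULL && len <= SKETCH_MAX);
    for (int attempt = 0; attempt < max_attempts; attempt++) {
        sketch_store(s, buf, len);
        if (memcmp(s->data, buf, len) == 0)
            return true;
    }
    return false;
}
```

With `corrupt_next` set to 1, the first store is damaged, the comparison fails, and the second store succeeds, which mirrors the "StoreData operation is repeated" step.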
Jeffrey Altman
1474b4a739 Windows: VIOC_GETUNIXMODE = smb_IoctlGetUnixMode
VIOC_GETUNIXMODE pioctl should execute smb_IoctlGetUnixMode not
smb_IoctlSetUnixMode.

Change-Id: Ia7dc3e1a82d7d14810f743f50ff7666f13ba8afc
Reviewed-on: http://gerrit.openafs.org/6600
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
2012-01-22 21:24:55 -08:00
Jeffrey Altman
898930fc3c Windows: fix fs setcrypt help message
Options are on, auth, and off.

Change-Id: I671df4233801f39482b8cac096e89fa38955a852
Reviewed-on: http://gerrit.openafs.org/6599
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
2012-01-22 21:24:41 -08:00
Jeffrey Altman
111de76ea8 Windows: release BIOD after status merge
Releasing the BIOD permits the accumulated buffers to be accessed.
Releasing the BIOD before the cm_MergeStatus() call creates a
window where the buffer data version is larger than the cm_scache
data version.  Release the BIOD after the status merge.

Change-Id: I023413cd41fbbd2d844d79a3b29c087792fffa24
Reviewed-on: http://gerrit.openafs.org/6598
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
2012-01-22 21:24:28 -08:00
Derrick Brashear
05f3a0d1e0 viced: disable rx keepalives during disk io
when we are going to hit the backend storage, disable keepalives.
the net effect of this is that no idle dead time is needed; instead,
the normal dead time will result in a connection with no activity
simply dying naturally if i/o blocks forever.

it's important that keepalives be enabled during callback breaks,
so that is done.

Change-Id: I1a7bfe0bc62a092ca7dd6dbc4710f1b8254ca9a1
Reviewed-on: http://gerrit.openafs.org/6515
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
2012-01-22 21:23:19 -08:00
Jeffrey Altman
6e85044efe Revert "Windows: disable memory extent interface"
This reverts commit 503bc56403

Change-Id: I9e40787ecd0833370a86486fab6644667e03aa3b
Reviewed-on: http://gerrit.openafs.org/6603
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
2012-01-22 21:22:53 -08:00
Marc Dionne
44261b2564 viced: remove FS_STATS_DETAILED
FS_STATS_DETAILED has been unconditionally defined since the IBM days.
Adjust the code to assume it is set.

Change-Id: If7fb913bbb42dba5d749e7c30b8d9b7d81e4b4f8
Reviewed-on: http://gerrit.openafs.org/5550
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
2012-01-22 07:37:30 -08:00
Jeffrey Altman
9056d09887 Windows: failover and retry for VBUSY
When a file server returns the VBUSY error for an RPC the
cache manager records the 'srv_busy' state in the cm_serverRef_t
structure binding that file server to the active cm_volume_t
object.  The 'srv_busy' was never cleared which prevents the
volume from being accessed.

Clear the 'srv_busy' flag whenever cm_Analyze() receives a
CM_ERROR_ALLBUSY error which means that all replicas have
been tried or whenever the error is not VBUSY or VRESTARTING.

FIXES 130537

Change-Id: I5020198e4f0ded1df0f64e228e699852f9de7c4d
Reviewed-on: http://gerrit.openafs.org/6563
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
2012-01-20 09:33:51 -08:00
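One possible reading of the flag handling in 9056d09887 is sketched below. The error names and the `sketch_server_ref` structure are placeholders, not the cache manager's definitions, and clearing only the current server's flag in the default case is an interpretation of the commit text rather than something it states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder error classes for this sketch only. */
enum sketch_error {
    SKETCH_OK = 0,
    SKETCH_VBUSY,
    SKETCH_VRESTARTING,
    SKETCH_ALLBUSY,
    SKETCH_OTHER
};

/* One entry per file server that hosts the volume. */
struct sketch_server_ref {
    bool busy;   /* set when this server answered VBUSY for the volume */
};

/* Update the busy marks after an RPC to refs[current] finished with err. */
void sketch_update_busy(struct sketch_server_ref *refs, size_t n,
                        size_t current, enum sketch_error err)
{
    assert(refs != NULL && current < n);
    switch (err) {
    case SKETCH_VBUSY:
        refs[current].busy = true;       /* remember it, try another replica */
        break;
    case SKETCH_VRESTARTING:
        break;                           /* leave existing marks alone */
    case SKETCH_ALLBUSY:
        for (size_t i = 0; i < n; i++)   /* every replica was tried: reset */
            refs[i].busy = false;
        break;
    default:
        refs[current].busy = false;      /* any other result clears the mark */
        break;
    }
}
```

The point of the sketch is that a mark set by VBUSY always has a path back to cleared, so the volume does not stay inaccessible.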
Jeffrey Altman
f768fb95f3 Windows: improved idle dead time handling
RX_CALL_IDLE has been treated the same as RX_CALL_DEAD which is
a fatal error that results in the server being marked down.  This
is not the appropriate behavior for an idle dead timeout error
which should not result in servers being marked down.

Idle dead timeouts are locally generated and are an indication
that the server:

 a. is severely overloaded and cannot process all
    incoming requests in a timely fashion.

 b. has a partition whose underlying disk (or iSCSI, etc) is
    failing and all I/O requests on that device are blocking.

 c. has a large number of threads blocking on a single vnode
    and cannot process requests for other vnodes as a result.

 d. is malicious.

RX_CALL_IDLE is distinct from RX_CALL_DEAD in that idle dead timeout
handling should permit failover to replicas when they exist in a
timely fashion but in the non-replica case should not be triggered
until the hard dead timeout.  If the request cannot be retried, it
should fail with an I/O error.  The client should not retry a request
to the same server as a result of an idle dead timeout.

In addition, RX_CALL_IDLE indicates that the client has abandoned
the call but the server has not.  Therefore, the client cannot determine
whether or not the RPC will eventually succeed and it must discard
any status information it has about the object of the RPC if the
RPC could have altered the object state upon success.

This patchset splits the RX_CALL_DEAD processing in cm_Analyze() to
clarify that only RX_CALL_DEAD errors result in the server being marked
down.  Since Rx idle dead timeout processing is per connection and
idle dead timeouts must differ depending upon whether or not replica
sites exist, cm_ConnBy*() are extended to select a connection based
upon whether or not replica sites exist.  A separate connection object
is used for RPCs to replicated objects as compared to RPCs to non-replicated
objects (volumes or vldb).

For non-replica connections the idle dead timeout is set to the hard
dead timeout.  For replica connections the idle dead timeout is set
to the configured idle dead timeout.

Idle dead timeout events and whether or not a retry was triggered
are logged to the Windows Event Log.

cm_Analyze() is given a new 'storeOp' parameter which is non-zero
when the executed RPC could modify the data on the file server.

Change-Id: Idef696b15a8161335aa48907c15a4dc37f918bdb
Reviewed-on: http://gerrit.openafs.org/6118
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
2012-01-20 08:55:14 -08:00
Jeffrey Altman
c7673f4fad rx: RX_CALL_IDLE and RX_CALL_BUSY
Allocate new Rx error codes for Idle and Busy calls but do not
send these errors on the wire.  They are only intended for local
use.

RX_CALL_IDLE is an indication to an application that requests it
that the rx peer is maintaining an open call channel but has not
sent any actual data for the length of the registered idle dead
timeout.

RX_CALL_BUSY is an indication to an application that requests it
that the rx peer believes the selected call channel is in use by
a pre-existing call.

When either RX_CALL_IDLE or RX_CALL_BUSY are assigned as the call
error and an abort must be sent to the rx peer, the errors are
translated to RX_CALL_TIMEOUT.  This is necessary because it is
not possible to add new Rx error values in a method that is safe
for peers that are not expecting them.

This patchset also documents which Rx errors defined in rx.h are
used on the wire and which are not.

The Unix and Windows cache managers are updated to build with
these new error codes.

Change-Id: Ib236f27b88d503c68134534bb069e12dd83537d8
Reviewed-on: http://gerrit.openafs.org/6128
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
2012-01-20 08:39:10 -08:00
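The translation step in c7673f4fad, where local-only errors become RX_CALL_TIMEOUT before an abort is sent, might look like the sketch below. The numeric values are placeholders chosen for the example and should not be taken as the values in rx.h; `sketch_wire_error` is a hypothetical name.

```c
#include <assert.h>

/* Placeholder values for illustration; consult rx.h for the real ones. */
enum {
    SKETCH_RX_CALL_TIMEOUT = -3,
    SKETCH_RX_CALL_IDLE    = -100,
    SKETCH_RX_CALL_BUSY    = -101
};

/* Map a local call error to the code that may appear in an abort packet. */
int sketch_wire_error(int local_error)
{
    switch (local_error) {
    case SKETCH_RX_CALL_IDLE:
    case SKETCH_RX_CALL_BUSY:
        /* Local-only codes are never sent to the peer. */
        return SKETCH_RX_CALL_TIMEOUT;
    default:
        return local_error;
    }
}
```

Doing the mapping at the point where the abort is built keeps the richer local codes available to the application while peers only ever see a code they already understand.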
Peter Scott
f6828bd9f1 Windows: Asynchronous purging of file content after a DV change
Purge all regions of the file surrounding the extents which are to be
purged. If a failure occurs on the purge due to an existing mapping, flag
for purge during handle close

Change-Id: Id8ef81afaa614ea08e03bbd55ec2cdded0d7139f
Reviewed-on: http://gerrit.openafs.org/6573
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
2012-01-19 23:07:03 -08:00
Jeffrey Altman
22cba8e970 Windows: cm_buf refcnt must hold buf_globalLock
An assertion in buf_Recycle() was being triggered when a cm_buf_t
object was supposed to be in the free buffer list but wasn't.
buf_Recycle() was racing with another thread.  The test for
refCount == 0 was performed while holding the buf_globalLock
exclusively but the InterlockedDecrement(refCount) in buf_Release()
was performed without holding buf_globalLock at all.  buf_globalLock
must be held at least as a read lock.  Otherwise, the refCount can
reach 0 prior to the thread blocking for exclusive access to the
buf_globalLock.  This provides buf_Recycle() which is holding
buf_globalLock the opportunity to race.

The solution is to make sure that buf_Release() always holds
buf_globalLock as a read lock and then use buf_ReleaseLocked()
to perform the actual decrement and test.

Change-Id: Ieb67548a7e44fa5f06f9346f428b1edadfc80696
Reviewed-on: http://gerrit.openafs.org/6576
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
2012-01-19 15:49:09 -08:00
Jeffrey Altman
201c954a36 Windows: Redesign daemon thread queue management
The daemon thread worker pool has some very poor properties.
The threads spend a significant amount of time polling for
ready to process tasks because so frequently a store/fetch data
request is accompanied by many other requests for the same FID
that would block.

Let's try a new approach. Create one queue for each worker thread
and assign the tasks to a thread by a hash of the FID.  This ensures
that all tasks for a single FID are serialized and prevents multiple
threads from attempting to perform the same task only to decide that
the thread would be forced to block.

Change-Id: I1d00ba0df1aa646e05b2cb3cb0796629f2e6d233
Reviewed-on: http://gerrit.openafs.org/6575
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
2012-01-19 15:48:43 -08:00
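The queue selection described in 201c954a36 comes down to hashing the FID and taking the result modulo the number of workers. The sketch below assumes a four-field FID and an arbitrary mixing function; neither is taken from the cache manager source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified file identifier; field names are assumptions for this sketch. */
struct sketch_fid {
    uint32_t cell;
    uint32_t volume;
    uint32_t vnode;
    uint32_t unique;
};

/* Mix the FID fields into one value.  The multiplier is an arbitrary
 * constant, not the hash used by the cache manager. */
uint32_t sketch_fid_hash(const struct sketch_fid *fid)
{
    uint32_t h = fid->cell;
    h = h * 31u + fid->volume;
    h = h * 31u + fid->vnode;
    h = h * 31u + fid->unique;
    return h;
}

/* Every task for the same FID maps to the same worker queue, so those
 * tasks are handled in order by a single thread. */
size_t sketch_queue_for(const struct sketch_fid *fid, size_t n_workers)
{
    assert(fid != NULL && n_workers > 0);
    return sketch_fid_hash(fid) % n_workers;
}
```

Because the mapping is a pure function of the FID, two requests for the same file can never land on different workers and contend with each other.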
Jeffrey Altman
afeb3c3a83 Windows: prevent race assigning Fcb in AFSInitFcb()
AFSInitFcb() is executed when the ObjectInformation->Fcb pointer
is NULL.  More than one thread can make that determination at the
same time.  Use InterlockedCompareExchangePointer() to detect
a race and permit cleanup to be performed.

Remove the output parameter of AFSInitFcb() to avoid a double
assignment.

Change-Id: I3870cccd5cd5e95134446523cce3547a2135d5e3
Reviewed-on: http://gerrit.openafs.org/6562
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
2012-01-19 15:38:50 -08:00
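The race detection in afeb3c3a83 uses InterlockedCompareExchangePointer(). The same pattern can be shown in portable C11 with `atomic_compare_exchange_strong`, which is a substitute chosen for this sketch and not what the driver calls. The structure and function names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct sketch_fcb { int placeholder; };

struct sketch_object {
    _Atomic(struct sketch_fcb *) fcb;   /* NULL until first initialised */
};

/* Allocate an FCB and publish it only if no other thread got there first.
 * Returns the FCB that ended up attached to the object, or NULL if the
 * allocation failed. */
struct sketch_fcb *sketch_init_fcb(struct sketch_object *obj)
{
    assert(obj != NULL);

    struct sketch_fcb *fresh = calloc(1, sizeof *fresh);
    if (fresh == NULL)
        return NULL;

    struct sketch_fcb *expected = NULL;
    if (atomic_compare_exchange_strong(&obj->fcb, &expected, fresh))
        return fresh;        /* the pointer was NULL: we won the race */

    free(fresh);             /* lost the race: clean up our own copy */
    return expected;         /* the winner's pointer, filled in by the CAS */
}
```

The single compare-and-exchange both assigns the pointer and reports who won, which is why a separate output parameter that assigns it a second time is unnecessary.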
Jeffrey Altman
503d09413a Windows: cm_EndCallbackGrantingCall refactoring
Refactor cm_EndCallbackGrantingCall to prevent assigning a
callback to the cm_scache object in the case where it is going
to be discarded.  If the race was lost the callback data was
already discarded by cm_RevokeCallback.  By assigning and then
discarding we are forced to issue an additional change notification
to the smb client or afs redirector.  Not only is this extra work
but the afs redirector notification can result in a deadlock with
a kernel thread that is waiting for the current thread to complete.

modify the function signature to return whether or not a race
was lost with a callback revocation.

rename 'freeFlag' to 'freeRacingRevokes' since that is what
the flag is meant to indicate.

create a new 'freeServer' flag to indicate when the server
reference should be released.  There was a leak of server
references when a race occurred.

modify all calls to cm_EndCallbackGrantingCall() that provide
an AFSCallBack structure on input to check for a lost race.
If a race occurs, cm_MergeStatus() should not be performed.

Change-Id: Ib17091ed51a24826bf84d33235125b3ccbbe47d4
Reviewed-on: http://gerrit.openafs.org/6556
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
2012-01-18 15:16:46 -08:00
Jeffrey Altman
d9884a480c Windows: deadlock bet. DirEntry lock + DirectoryNodeHdr.TreeLock
The DirectoryNodeHdr.TreeLock must be obtained before the
DirEntry->NonPaged->Lock.  In AFSLocateNameEntry(), the
DirEntry lock is obtained before the TreeLock when processing
a symlink object.  For that case obtain the TreeLock first.
Drop it if it is not required.

Change-Id: I5b73f98b4bc7fcd5c02b8f255fa2423b52eb4a4d
Reviewed-on: http://gerrit.openafs.org/6558
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
2012-01-18 15:12:27 -08:00
Peter Scott
2fa7f36f74 Windows: Correctly mark extents dirty when using the non-persistent AFS cache

Change-Id: I9e03264bb94fe6494f1ca3721e4d7c7faf469fb5
Reviewed-on: http://gerrit.openafs.org/6571
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
2012-01-18 14:54:02 -08:00
Peter Scott
3cf5064c91 Windows: Performing async work after cache invalidation
The code now queues a work item to perform additional work on extent
processing after a cache invalidation has occurred. This additional work
involves walking the current list of extents and purging/flushing regions of
the system cache based upon the current state of the extent.
Additional changes filter which invalidation events result in a queued
worker to perform asynchronous work.

Change-Id: I72e4e0bac2caf69e41a095ce8fc4c2e083702b5c
Reviewed-on: http://gerrit.openafs.org/6528
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
2012-01-18 14:53:28 -08:00
Marc Dionne
6fda61ba2e Parallel build fixes
Assorted fixes for issues seen with parallel builds:
- bucoord must depend on butm, since it uses libbutm
- for most object files in roken and hcrypto, headers must be installed
  before building
- remove rules with 2 targets in rxkad and ubik
- budb: add dependencies for db_dump.o

Change-Id: Ide05f223c2f1fe53bff33cb03011ca47bf741c80
Reviewed-on: http://gerrit.openafs.org/6568
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
2012-01-18 10:43:31 -08:00
Marc Dionne
beafc7f742 Linux 3.3: use umode_t for mkdir and create inode ops
The mkdir and create inode operations have switched to using
umode_t instead of int for the file mode.

Change-Id: Ib8bbf6eaa6e87d6a9692c45b1a3fe93fcc3eff7a
Reviewed-on: http://gerrit.openafs.org/6567
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
2012-01-18 10:12:40 -08:00
Marc Dionne
64bd0b728c Linux: use standard macro for set_nlink configure test
A generic macro exists to test for functions in the kernel, use
it for set_nlink.

Change-Id: Iaec2b29e48f500bcf7a1ef80a3f2a1305e5dbb8f
Reviewed-on: http://gerrit.openafs.org/6566
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
2012-01-18 09:28:23 -08:00
Derrick Brashear
60df98806b volinfo: fix formatting of placeholder printfs
needed to placate gcc-llvm on lion

Change-Id: Ie15e4768d2e3feb7ad80dfef05395f2c4a227c0f
Reviewed-on: http://gerrit.openafs.org/6565
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-01-18 09:27:59 -08:00
Marc Dionne
6ad3d646e6 rx: Correctly test for end of call queue
The intention of this condition is to check if the current call
being considered is the last one on the queue, but the test is
incorrect.  A null next pointer indicates a removed item, not
the end of the queue.

Use the queue_IsLast macro instead to correctly determine that
this is the last item in the queue and that a call has to be
selected, either the current one or a previously seen good choice.

This can cause calls to get permanently stuck in the call queue
and never get assigned to a thread, even when all threads are
idle.

Change-Id: Ie44a45734ab25bd3d2be3635c2e8f05857ca935e
Reviewed-on: http://gerrit.openafs.org/6564
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
2012-01-18 09:27:44 -08:00
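The distinction made in 6ad3d646e6 between "removed from a queue" and "last on the queue" can be shown with a small circular list that uses a sentinel head. This sketch is loosely modelled on the behaviour the commit describes; the `sketch_` names are stand-ins and not the rx queue macros.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Circular doubly linked queue with a sentinel head. */
struct sketch_queue {
    struct sketch_queue *prev;
    struct sketch_queue *next;
};

void sketch_queue_init(struct sketch_queue *head)
{
    head->prev = head->next = head;
}

void sketch_queue_append(struct sketch_queue *head, struct sketch_queue *item)
{
    item->prev = head->prev;
    item->next = head;
    head->prev->next = item;
    head->prev = item;
}

/* Removing an item clears its links; a NULL next therefore means
 * "not on any queue", never "end of queue". */
void sketch_queue_remove(struct sketch_queue *item)
{
    item->prev->next = item->next;
    item->next->prev = item->prev;
    item->prev = item->next = NULL;
}

/* The last item is the one whose successor is the sentinel head. */
bool sketch_queue_is_last(const struct sketch_queue *head,
                          const struct sketch_queue *item)
{
    assert(head != NULL && item != NULL);
    return item->next == head;
}
```

A loop that tests `item->next == NULL` to find the end would never see that condition for an item still on the queue, which matches the stuck-call symptom the commit reports.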
Jeffrey Altman
20151a8699 Windows: disable memory extent interface
There have been reports that the memory extent interface which
is used when NonPersistentCache is active can lead to data corruption.

Change-Id: I3a8acae0648a67534e46c73ef1dcbf7f939a558d
Reviewed-on: http://gerrit.openafs.org/6557
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
2012-01-18 07:36:36 -08:00
Jeffrey Altman
69196e41ec Windows: restrict service to 2 cpus by default
Performance drops off considerably when the number of processors
increases due to lock contention and the cm_SyncOp wait processing.
If the MaxCPUs registry value is not set, limit ourselves to two.
Setting MaxCPUs to zero permits use of all CPUs.

Change-Id: I4bae328ed589811b0ea2a514501a0c1aa74e8015
Reviewed-on: http://gerrit.openafs.org/6555
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
2012-01-18 07:35:35 -08:00
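The MaxCPUs rule in 69196e41ec (unset means two, zero means all) can be written as one small function. How the registry value is read is outside this sketch; a negative `configured` argument is a convention invented here to represent "value not set".

```c
#include <assert.h>

/* Decide how many CPUs the service may use.
 *   configured < 0  : registry value absent (sketch convention)
 *   configured == 0 : no limit, use every available CPU
 *   configured > 0  : use at most that many */
unsigned sketch_effective_cpus(int configured, unsigned available)
{
    assert(available > 0);
    if (configured < 0)
        return available < 2 ? available : 2;   /* default limit of two */
    if (configured == 0)
        return available;
    return (unsigned)configured < available ? (unsigned)configured : available;
}
```

Clamping to `available` in every branch keeps the result meaningful on single-CPU machines and when the configured value exceeds the hardware.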
Jeffrey Altman
7ae2c0df33 Windows: AFS_SERVER_FLUSH_DELAY AFS_SERVER_PURGE_DELAY
Alter the flush delay to 5 seconds from 30 seconds

Alter the purge delay to 300 seconds from 5 seconds

Change-Id: I3f8e79d84582c4015e35d58cf1bedc9a023c0d73
Reviewed-on: http://gerrit.openafs.org/6554
Reviewed-by: Peter Scott <pscott@kerneldrivers.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
2012-01-18 07:35:17 -08:00
Jeffrey Altman
f549911027 Windows: AFSParseName edge cases
If the input path is \afs\ behave as if the path is \afs.

If the input path is \afs\*\ detect the wildcard and return
STATUS_OBJECT_NAME_INVALID.

Change-Id: I0ef4f30fb3b6245a52160b5e7f9233bc5f599485
Reviewed-on: http://gerrit.openafs.org/6553
Reviewed-by: Peter Scott <pscott@kerneldrivers.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
2012-01-18 07:34:56 -08:00
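The two edge cases in f549911027 can be illustrated with narrow strings in portable C. This is a simplified sketch, not AFSParseName: the real code works on Windows UNICODE_STRING values, and `sketch_parse_name` and its status enum are hypothetical. The wildcard check is limited to the component directly after the root name, which is the case the commit message gives.

```c
#include <assert.h>
#include <string.h>

enum sketch_status { SKETCH_SUCCESS, SKETCH_NAME_INVALID };

/* Normalise a path in place:
 *  - strip trailing backslashes so "\afs\" is handled like "\afs"
 *  - reject a '*' in the component that follows the root name */
enum sketch_status sketch_parse_name(char *path)
{
    assert(path != NULL);

    size_t len = strlen(path);
    if (len == 0)
        return SKETCH_NAME_INVALID;

    while (len > 1 && path[len - 1] == '\\')
        path[--len] = '\0';

    /* Find the component after the root, e.g. "*" in "\afs\*". */
    const char *second = strchr(path + 1, '\\');
    if (second != NULL) {
        second++;
        size_t comp_len = strcspn(second, "\\");
        if (memchr(second, '*', comp_len) != NULL)
            return SKETCH_NAME_INVALID;
    }
    return SKETCH_SUCCESS;
}
```

Stripping the trailing separator first means the later checks only ever see one canonical form of the root path.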