From c0f52c3a3d76059c9d8b2df3374df844d8d6861b Mon Sep 17 00:00:00 2001 From: Andrew Deason Date: Tue, 7 Apr 2015 22:10:53 -0500 Subject: [PATCH] afs: Fix fetchInit for negative/large lengths Currently, the 'length64' variable in rxfs_fetchInit is almost completely unused (it just goes into an icl logging function). For the length that we actually use ('*alength'), we just take the lower 32 bits of the length that the fileserver told us. This method is incorrect in at least the following cases: - If the fileserver returns a length that is larger than 2^32-1, we'll just take the lower 32 bits of the 64-bit length the fileserver told us about. The client currently never requests a fetch larger than 2^32-1, so this would be an error, but if this occurred, we would not detect it until much later in the fetch. - If the fileserver returns a length that is larger than 2^31-1, but smaller than 2^32, we'll interpret the length as negative (which we assume is just 0, due to bugs in older fileservers). This is also incorrect. - If the fileserver returns a negative length smaller than -2^31+1, we may interpret the give length as a positive value instead of a negative one. Older fileservers can do this if we fetch data beyond the file's EOF (this was fixed in the fileserver in commit 529d487d65d8561f5d0a43a4dc71f72b86efd975). This positive length will cause an error (usually), instead of proceeding without error (which is what would happen if we correctly interpreted the length as negative). On Solaris, this can manifest as a failed write, when writing to a location far beyond the file's EOF from the fileserver's point of view, because Solaris writes can trigger a fetch for the same area. Seeking to a location far beyond the file's EOF and writing can trigger this, as can a normal copy into AFS, if the file is large enough and the cache is large enough. To explain in more detail: When copying a file into AFS, the cache manager will buffer the dirty data in the disk cache until the file is synced/closed, or we run out of cache space. While this data is buffering, the application will write into an offset, say, 3GiB into the file. On Solaris, this can trigger a read for the same region, which will trigger a fetch from the fileserver at the offset 3GiB into the file. If the fileserver does not contain the fix in commit 529d487d65d8561f5d0a43a4dc71f72b86efd975, it will respond with a large negative number, which we interpret as a large positive number; much larger than the requested length. This will cause the fetch to fail, which then causes the whole write() call to fail. Specifically this will fail with EINVAL on Solaris, since that is the error code we return from afs_GetOnePage when we fail to acquire a dcache. If the cache is small enough, this will not happen, since we will flush data to the fileserver before we have a large amount of dirty data, e.g., 3GiB. (The actual error occurs closer to 2GiB, but this is just for illustrative purposes.) To fix this, detect the various ranges of values mentioned above, and handle them specially. Lengths that are too large will yield an error, since we cannot handle values over 2^31-1 in the rxfs_* framework currently. For lengths that are negative, just act as if we received a length of 0. Do this for both the 64-bit codepath and the non-64-bit codepath, just so they remain identical. [mmeffie@sinenomine.net: directly use 64 bit comparisons, don't mask end call error code, commit nits.] Change-Id: I7e8f2132d52747b7f0ce4a6a5ba81f6641a298a8 Reviewed-on: http://gerrit.openafs.org/11829 Reviewed-by: Chas Williams <3chas3@gmail.com> Reviewed-by: Benjamin Kaduk Tested-by: BuildBot --- src/afs/afs_fetchstore.c | 34 ++++++++++++++++++++++++++++++++++ 1 file changed, 34 insertions(+) diff --git a/src/afs/afs_fetchstore.c b/src/afs/afs_fetchstore.c index 3dc865bea7..3267aeb595 100644 --- a/src/afs/afs_fetchstore.c +++ b/src/afs/afs_fetchstore.c @@ -980,6 +980,34 @@ rxfs_fetchInit(struct afs_conn *tc, struct rx_connection *rxconn, } } FillInt64(length64, length_hi, length); + + if (!code) { + /* Check if the fileserver said our length is bigger than can fit + * in a signed 32-bit integer. If it is, we can't handle that, so + * error out. */ + if (length64 > MAX_AFS_INT32) { + RX_AFS_GUNLOCK(); + code = rx_EndCall(v->call, RX_PROTOCOL_ERROR); + v->call = NULL; + length = 0; + RX_AFS_GLOCK(); + code = code != 0 ? code : EIO; + } + } + + if (!code) { + /* Check if the fileserver said our length was negative. If it + * is, just treat it as a 0 length, since some older fileservers + * returned negative numbers when they meant to return 0. Note + * that we must do this in this 64-bit-specific block, since + * length64 being negative will screw up our conversion to the + * 32-bit 'alength' below. */ + if (length64 < 0) { + length_hi = length = 0; + FillInt64(length64, 0, 0); + } + } + afs_Trace3(afs_iclSetp, CM_TRACE_FETCH64LENG, ICL_TYPE_POINTER, avc, ICL_TYPE_INT32, code, ICL_TYPE_OFFSET, @@ -998,6 +1026,12 @@ rxfs_fetchInit(struct afs_conn *tc, struct rx_connection *rxconn, RX_AFS_GLOCK(); if (bytes == sizeof(afs_int32)) { *alength = ntohl(length); + if (*alength < 0) { + /* Older fileservers can return a negative length when they + * meant to return 0; just assume negative lengths were + * meant to be 0 lengths. */ + *alength = 0; + } } else { code = rx_EndCall(v->call, RX_PROTOCOL_ERROR); v->call = NULL;