From bc8f62fcdfa479023d15125404d1b13b6dfd6dc3 Mon Sep 17 00:00:00 2001 From: Jeffrey Altman Date: Wed, 11 Jun 2014 19:03:45 -0400 Subject: [PATCH 1/2] Revert "viced: Avoid issuing redundant TMAY requests" This reverts commit 03a9b481c7f27c462c9d65a756d172e79758b86d. Andrew Deason wrote, "Briefly, 'host' structures are allocated without clearing all of the contents to '0'. Only part of the structure is cleared, according to the HOST_TO_ZERO macro. Unfortunately I put the new tmay_ fields right below the 'index' field for some reason, so this means they aren't zeroed and can contain garbage. This means we can easily segfault in the fileserver when we try to access the pointers in there. "We access uninitialized memory for every 'host' that is allocated. So the chance of us corrupting memory is the chance that a particular pointer-sized area of memory from 'malloc' is not already NULL. "That seems pretty likely, but it's not so frequent as to have the fileserver effectively "constantly" crashing at the site that noticed. So it has not been a fire drill, but it has been noticeable (we heard about it I think yesterday, and got details today when it happened again). The noticing incident was a segfault, but an abort or sigbus are probably also likely. "Of course, the chances of noticing go way up with more clients. I expect the chances dramatically increase if you have more than 512 client hosts hit the box, since the first block of 512 are allocated before we really do anything. For the next 512, it seems much more likely that 'malloc' will give us back non-zeroed data. But this is just theory. "With the incident I know about, the crash happened semi-quickly after the server started (a few minutes). But it seems likely to occur after the server has been up for a long time, if/when you cross the next line of 512 hosts. "I am also concerned that this can easily be corrupting memory without being noticed via a crash (or it takes a while to crash), since we are potentially free'ing invalid pointers, or stomping over someone else's memory, etc etc." Change-Id: I20bd40fc9df69247884099a0623e6db40908b3e8 --- src/viced/host.c | 243 +---------------------------------------------- src/viced/host.h | 6 -- 2 files changed, 4 insertions(+), 245 deletions(-) diff --git a/src/viced/host.c b/src/viced/host.c index eb717d5430..803a6471c1 100644 --- a/src/viced/host.c +++ b/src/viced/host.c @@ -954,9 +954,6 @@ h_TossStuff_r(struct host *host) free(host->hcps.prlist_val); host->hcps.prlist_val = NULL; host->hcps.prlist_len = 0; - free(host->tmay_caps.Capabilities_val); - host->tmay_caps.Capabilities_val = NULL; - host->tmay_caps.Capabilities_len = 0; DeleteAllCallBacks_r(host, 1); host->hostFlags &= ~RESETDONE; /* just to be safe */ @@ -1656,210 +1653,7 @@ removeInterfaceAddr_r(struct host *host, afs_uint32 addr, afs_uint16 port) return 0; } -/* - * The following few functions deal with caching TellMeAboutYourself calls - * that we issued to clients. Why do we do this? Well: - * - * Q: First, why do we need to issue a TMAY against a client on an incoming new - * Rx connection? - * - * A: We must verify that the incoming Rx connection is the same host that - * we have a 'host' structure for. On new calls for existing connections, we - * can remember which host corresponds to that connection, but for new - * connections, we have no way to find what host it is for, except by looking - * up the host by IP address. Since hosts can change IP addresses, we need to - * contact the IP address to see if it's the host we think it is. - * - * Q: Okay, then why cache the results? - * - * A: The TMAY calls to a single host are serialized, because they are issued - * with the host's host->lock held. If we get 4 Rx calls each on new - * connections to the same host at the same time, the 1st call will lock the - * host, and issue a TMAY. Once that's done, the second call will issue a - * TMAY, then the 3rd and then the 4th. However, the 3rd and 4th calls have - * been waiting to issue a TMAY since before the 2nd call even _started_ to - * issue a TMAY. So, we can just effectively give the results of the 2nd - * call's TMAY to the 3rd and 4th, too. Since it is nondeterministic which of - * those calls gets to issue a TMAY "first", we can just assume they all got - * the same result. - * - * Note that in the above example, we cannot reuse the results of the 1st TMAY - * for the 2nd, 3rd, and 4th calls (we only reuse the results of the 2nd). - * This is because by the time the 2nd call starts waiting for the host lock, - * it doesn't know how long the 1st TMAY has been running, so the 2nd call - * might indeed get a different TMAY result (though this probably would be - * extremely rare). - * - * Anyway, so, if we don't cache the results, we are issuing TMAYs that are - * pure overhead. In an environment with clients that create a lot of - * connections to a fileserver (keep in mind each new PAG on OpenAFS clients - * creates a new connection), this can mean a significant amount of overhead. - * So, we use this "TMAY cache" to avoid this overhead for the common case. - */ -/** - * Should we skip calling TellMeAboutYourself on this host, and instead rely - * on cached TMAY results? - * - * @param[in] host The host we are dealing with - * @param[in] prewait_tmays What the value of host->n_tmays was _before_ we - * locked 'host' - * @param[in] prewait_host What the primary IP address for 'host' was before - * we locked 'host' - * @param[in] prewait_port What the primary port was for 'host' before we - * locked 'host' - * - * @return Whether we should skip calling TMAY, and instead rely on cached - * results in host - */ -static int -ShouldSkipTMAY(struct host *host, int prewait_tmays, afs_uint32 prewait_host, - afs_uint16 prewait_port) -{ - int skiptmay = 0; - if (host->n_tmays > prewait_tmays + 1) { - /* while we were waiting for the host lock, the in-progress TMAY - * call finished, someone else started a new TMAY call, and that - * finished. so, calling TMAY again won't give us any information - * or additional guarantees. */ - skiptmay = 1; - } - if (host->host != prewait_host || host->port != prewait_port) { - /* ...but don't skip it if the host has changed */ - skiptmay = 0; - } - return skiptmay; -} - -/** - * If appropriate, simulate a TellMeAboutYourself call on 'host' by extracting - * cached interfaceAddr and Capabilities information from 'host' itself. - * - * @param[in] host The host we're dealing with - * @param[inout] askiptmay On entering this function, this should contain the - * result of ShouldSkipTMAY. On return, it is 1 if we - * actually did use cached TMAY results, or 0 if we - * did not. - * @param[out] interf The interfaceAddr result of the simulated TMAY call - * @param[out] caps The Capabilities result of the simulated TMAY call - * - * @return status - * @retval 0 We used the cached TMAY results; do NOT make a real TMAY request - * @retval otherwise We did not use cached TMAY results; issue a real TMAY request - */ -static int -SimulateTMAY(struct host *host, int *askiptmay, struct interfaceAddr *interf, - Capabilities *caps) -{ - size_t capsize; - - if (!*askiptmay) { - /* we're not supposed to skip the actual TMAY call */ - return -1; - } - - *interf = host->tmay_interf; - - free(caps->Capabilities_val); - caps->Capabilities_val = NULL; - caps->Capabilities_len = 0; - - if (!host->tmay_caps.Capabilities_val) { - return 0; - } - - capsize = sizeof(caps->Capabilities_val[0]) * host->tmay_caps.Capabilities_len; - - caps->Capabilities_val = malloc(capsize); - if (!caps->Capabilities_val) { - /* we should/did _not_ skip the real TMAY call, since we couldn't - * alloc memory to use the cached results */ - *askiptmay = 0; - return -1; - } - caps->Capabilities_len = host->tmay_caps.Capabilities_len; - memcpy(caps->Capabilities_val, host->tmay_caps.Capabilities_val, capsize); - - return 0; -} - -/** - * If appropriate, store the given results from a real TellmeAboutYourself - * call, and cache them in the given host structure. - * - * @param[in] host The host we're dealing with - * @param[in] skiptmay 1 if we skipped making a real TMAY call, 0 otherwise - * @param[in] didtmay 1 if we issued a successful real TMAY call, 0 otherwise - * @param[in] interf The interfaceAddr result from the real TMAY call - * @param[in] caps The Capabilities result from the real TMAY call - */ -static void -CacheTMAY(struct host *host, int skiptmay, int didtmay, - struct interfaceAddr *interf, Capabilities *caps) -{ - size_t capsize; - - if (skiptmay) { - /* we simulated the TMAY call, so the state of the world hasn't - * changed; don't touch anything */ - return; - } - if (!didtmay) { - /* we did not perform a successful TMAY, so we don't have valid - * results to cache. blow away the existing cache so we don't use - * stale results */ - goto resetcache; - } - if (host->n_tmays == INT_MAX) { - /* make sure int rollover doesn't screw up our ordering */ - goto resetcache; - } - if (host->lock.num_waiting == 0) { - /* nobody is waiting for this host, so no reason to cache anything */ - goto resetcache; - } - - /* okay, if we got here, everything looks good; let's cache the given - * 'interf' and 'caps' */ - - host->tmay_interf = *interf; - - if (!caps->Capabilities_val) { - free(host->tmay_caps.Capabilities_val); - host->tmay_caps.Capabilities_val = NULL; - host->tmay_caps.Capabilities_len = 0; - - } else { - if (caps->Capabilities_len != host->tmay_caps.Capabilities_len) { - free(host->tmay_caps.Capabilities_val); - host->tmay_caps.Capabilities_val = NULL; - host->tmay_caps.Capabilities_len = 0; - } - - capsize = sizeof(caps->Capabilities_val[0]) * caps->Capabilities_len; - - if (!host->tmay_caps.Capabilities_val) { - host->tmay_caps.Capabilities_val = malloc(capsize); - if (!host->tmay_caps.Capabilities_val) { - goto resetcache; - } - } - - host->tmay_caps.Capabilities_len = caps->Capabilities_len; - memcpy(host->tmay_caps.Capabilities_val, caps->Capabilities_val, capsize); - } - host->n_tmays++; - return; - - resetcache: - /* blow away the cached TMAY data; pretend we never saw anything */ - free(host->tmay_caps.Capabilities_val); - host->tmay_caps.Capabilities_val = NULL; - host->tmay_caps.Capabilities_len = 0; - memset(&host->tmay_interf, 0, sizeof(host->tmay_interf)); - - host->n_tmays = 0; -} static int h_threadquota(int waiting) @@ -1910,24 +1704,12 @@ h_GetHost_r(struct rx_connection *tcon) * structure for this address. Verify that the identity * of the caller matches the identity in the host structure. */ - - int didtmay = 0; /* did we make a successful TMAY call against host->host? */ - unsigned int prewait_tmays; - afs_uint32 prewait_host; - afs_uint16 prewait_port; - int skiptmay; - if ((host->hostFlags & HWHO_INPROGRESS) && h_threadquota(host->lock.num_waiting)) { h_Release_r(host); host = NULL; goto gethost_out; } - - prewait_tmays = host->n_tmays; - prewait_host = host->host; - prewait_port = host->port; - h_Lock_r(host); if (!(host->hostFlags & ALTADDR) || (host->hostFlags & HOSTDELETED)) { @@ -1958,29 +1740,15 @@ h_GetHost_r(struct rx_connection *tcon) cb_conn = host->callback_rxcon; rx_GetConnection(cb_conn); - - skiptmay = ShouldSkipTMAY(host, prewait_tmays, prewait_host, prewait_port); - H_UNLOCK; if (haddr == host->host && hport == host->port) { /* The existing callback connection matches the * incoming connection so just use it. */ - - if (SimulateTMAY(host, &skiptmay, &interf, &caps) == 0) { - /* noop; we don't need to call TellMeAboutYourself; we can - * trust the results from the last TMAY call */ - code = 0; - - } else { - code = - RXAFSCB_TellMeAboutYourself(cb_conn, &interf, &caps); - if (code == RXGEN_OPCODE) - code = RXAFSCB_WhoAreYou(cb_conn, &interf); - if (code == 0) { - didtmay = 1; - } - } + code = + RXAFSCB_TellMeAboutYourself(cb_conn, &interf, &caps); + if (code == RXGEN_OPCODE) + code = RXAFSCB_WhoAreYou(cb_conn, &interf); } else { /* We do not have a match. Create a new connection * for the new addr/port and use multi_Rx to probe @@ -2000,9 +1768,6 @@ h_GetHost_r(struct rx_connection *tcon) rx_PutConnection(cb_conn); cb_conn=NULL; H_LOCK; - - CacheTMAY(host, skiptmay, didtmay, &interf, &caps); - if ((code == RXGEN_OPCODE) || ((code == 0) && (afs_uuid_equal(&interf.uuid, &nulluuid)))) { identP = (struct Identity *)malloc(sizeof(struct Identity)); diff --git a/src/viced/host.h b/src/viced/host.h index e0cb2f5e08..011bb5bf8a 100644 --- a/src/viced/host.h +++ b/src/viced/host.h @@ -88,12 +88,6 @@ struct host { * the index fields isn't zeroed. XXX */ afs_uint32 index; /* Host table index, for vicecb.c */ - unsigned int n_tmays; /* how many successful TellMeAboutYourself calls - * have we made against this host? */ - /* cache of the result of the last successful TMAY call to this host */ - struct interfaceAddr tmay_interf; - Capabilities tmay_caps; - struct Lock lock; /* Write lock for synchronization of * VenusDown flag */ #ifdef AFS_PTHREAD_ENV From 3381492d26c2c3677b5f6065e1022c9cfe225c79 Mon Sep 17 00:00:00 2001 From: Stephan Wiesand Date: Thu, 12 Jun 2014 10:30:48 +0200 Subject: [PATCH 2/2] Make OpenAFS 1.6.9 Update version strings and NEWS for 1.6.9 Change-Id: I80fe292dd091a26cbec5d5b4a2fab51e4cf4dee6 --- NEWS | 6 ++++++ configure-libafs.ac | 2 +- configure.ac | 4 ++-- src/config/NTMakefile.amd64_w2k | 2 +- src/config/NTMakefile.i386_nt40 | 2 +- src/config/NTMakefile.i386_w2k | 2 +- 6 files changed, 12 insertions(+), 6 deletions(-) diff --git a/NEWS b/NEWS index 0bd5e2cbf7..3c6ec2f70c 100644 --- a/NEWS +++ b/NEWS @@ -1,5 +1,11 @@ User-Visible OpenAFS Changes +OpenAFS 1.6.9 + + All server platforms + + * Fix for OPENAFS-SA-2014-002 + OpenAFS 1.6.8 All platforms diff --git a/configure-libafs.ac b/configure-libafs.ac index 162dbbdf7c..5b91de1c2d 100644 --- a/configure-libafs.ac +++ b/configure-libafs.ac @@ -5,7 +5,7 @@ AC_CONFIG_SRCDIR(src/libafs/Makefile.common.in) AM_INIT_AUTOMAKE AC_CONFIG_HEADER(src/config/afsconfig.h) -MACOS_VERSION=1.6.8 +MACOS_VERSION=1.6.9 AC_SUBST(MACOS_VERSION) diff --git a/configure.ac b/configure.ac index dd16efd061..6a32534135 100644 --- a/configure.ac +++ b/configure.ac @@ -5,8 +5,8 @@ AC_CONFIG_SRCDIR([src/config/stds.h]) AM_INIT_AUTOMAKE AC_CONFIG_HEADER(src/config/afsconfig.h) -MACOS_VERSION=1.6.8 -LINUX_PKGVER=1.6.8 +MACOS_VERSION=1.6.9 +LINUX_PKGVER=1.6.9 dnl Debian wants the release candidate version in the main upstream version, dnl and wants ~ before it. diff --git a/src/config/NTMakefile.amd64_w2k b/src/config/NTMakefile.amd64_w2k index 447b7c0ae5..90202dbbc4 100644 --- a/src/config/NTMakefile.amd64_w2k +++ b/src/config/NTMakefile.amd64_w2k @@ -90,7 +90,7 @@ AFSPRODUCT_VER_MAJOR=1 AFSPRODUCT_VER_MINOR=6 !ENDIF !IF !DEFINED(AFSPRODUCT_VER_PATCH) -AFSPRODUCT_VER_PATCH=0802 +AFSPRODUCT_VER_PATCH=0900 !ENDIF !IF !DEFINED(AFSPRODUCT_VER_BUILD) AFSPRODUCT_VER_BUILD=0 diff --git a/src/config/NTMakefile.i386_nt40 b/src/config/NTMakefile.i386_nt40 index 7831ad1b9e..56425cd0a0 100644 --- a/src/config/NTMakefile.i386_nt40 +++ b/src/config/NTMakefile.i386_nt40 @@ -90,7 +90,7 @@ AFSPRODUCT_VER_MAJOR=1 AFSPRODUCT_VER_MINOR=6 !ENDIF !IF !DEFINED(AFSPRODUCT_VER_PATCH) -AFSPRODUCT_VER_PATCH=0802 +AFSPRODUCT_VER_PATCH=0900 !ENDIF !IF !DEFINED(AFSPRODUCT_VER_BUILD) AFSPRODUCT_VER_BUILD=0 diff --git a/src/config/NTMakefile.i386_w2k b/src/config/NTMakefile.i386_w2k index 4d7a6bb957..5b7d9d62e2 100644 --- a/src/config/NTMakefile.i386_w2k +++ b/src/config/NTMakefile.i386_w2k @@ -94,7 +94,7 @@ AFSPRODUCT_VER_MAJOR=1 AFSPRODUCT_VER_MINOR=6 !ENDIF !IF !DEFINED(AFSPRODUCT_VER_PATCH) -AFSPRODUCT_VER_PATCH=0802 +AFSPRODUCT_VER_PATCH=0900 !ENDIF !IF !DEFINED(AFSPRODUCT_VER_BUILD) AFSPRODUCT_VER_BUILD=0