openafs

mirror of https://git.openafs.org/openafs.git synced 2025-01-18 15:00:12 +00:00

Author	SHA1	Message	Date
Benjamin Kaduk	32e97d07a3	Make OpenAFS 1.9.2 Update version strings for the third 1.9.x development release. Change-Id: I203a9d3487d1344882afb02ded094e01e7672753 Reviewed-on: https://gerrit.openafs.org/15926 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Benjamin Kaduk <kaduk@mit.edu>	2024-11-12 12:35:52 -05:00
Andrew Deason	c4e28c2afe	OPENAFS-SA-2024-003: xdr: Initialize memory for INOUT args CVE-2024-10397 Currently, there are a few callers of RPCs that specify some data for an INOUT parameter, but do not initialize the memory for that data. This can result in the uninitialized memory being sent to the peer when the argument is processed as an IN argument. Simply clear the relevant data before running the RPC to avoid this. The relevant RPCs and arguments are: - For RMTSYS_Pioctl, the 'OutData' argument. - For BUDB_GetVolumes, the 'volumes' argument. -- via DBLookupByVolume -> bcdb_LookupVolume -> ubik_BUDB_GetVolumes -- and via bc_Restorer -> bcdb_FindVolumes -> ubik_BUDB_GetVolumes - For KAA_Authenticate_old / KAA_Authenticate, this can happen with the 'answer' argument in ka_Authenticate if KAA_AuthenticateV2 or KAA_Authenticate return RXGEN_OPCODE, but the server manages to populate oanswer.SeqLen with non-zero. For all of these, make sure the memory is blanked before running the relevant RPC. For ka_Authenticate, reset oanswer.SeqLen to 0 to avoid sending any data, but still blank 'answer' and 'answer_old' just to be safe. FIXES 135043 Change-Id: I95ee91a1710d66e2e88c3086d57c279a376f7dc6 Reviewed-on: https://gerrit.openafs.org/15925 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Benjamin Kaduk <kaduk@mit.edu>	2024-11-12 12:35:48 -05:00
Andrew Deason	f31a79d749	OPENAFS-SA-2024-003: sys: Don't over-copy RMTSYS_Pioctl output data CVE-2024-10397 Here, 'OutData' only has OutData.rmtbulk_len bytes in it. We know that OutData.rmtbulk_len is at most data->out_size, but it could be smaller. So, only copy OutData.rmtbulk_len bytes, not data->out_size, since data->out_size could be more than the number of bytes we have allocated in OutData. FIXES 135043 Change-Id: I6f87fc8cb5df0298061f419112200f6c7e1974ba Reviewed-on: https://gerrit.openafs.org/15924 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Benjamin Kaduk <kaduk@mit.edu>	2024-11-12 12:35:43 -05:00
Andrew Deason	1f5e1ef9e3	OPENAFS-SA-2024-003: Run xdr_free for retried RPCs CVE-2024-10397 A few areas of code retry the same RPC, like so: do { code = VL_SomeRPC(rxconn, &array_out); } while (some_condition); xdr_free((xdrproc_t) xdr_foo, &array_out); Or try a different version/variant of an RPC (e.g. VLDB_ListAttributesN2 -> VLDB_ListAttributes). If the first RPC call causes the output array to be allocated with length N, then the subsequent RPC calls may fail if the server responds with an array larger than N. Furthermore, if the subsequent call responds with an array smaller than N, then when we xdr_free the array, our length will be smaller than the actual number of allocated elements. That results in two potential issues: - We'll fail to free the elements at the end of the array. This is only a problem if each element in the array also uses dynamically-allocated memory (e.g. each element contains a string or another array). Fortunately, there are only a few such structures in any of our RPC-L definitions: SysNameList and CredInfos. And neither of those are used in such a retry loop, so this isn't a problem. - We'll give the wrong length to osi_free when freeing the array itself. This only matters for KERNEL, and only on some platforms (such as Solaris), since the length given to osi_free is ignored everywhere else. To avoid these possible issues, change the relevant retry loops to free our xdr-allocated arrays on every iteration of the loop, like this: do { xdr_free((xdrproc_t) xdr_foo, &array_out); code = VL_SomeRPC(rxconn, &array_out); } while (some_condition); xdr_free((xdrproc_t) xdr_foo, &array_out); Or like this: do { code = VL_SomeRPC(rxconn, &array_out); xdr_free((xdrproc_t) xdr_foo, &array_out); } while (some_condition); FIXES 135043 Change-Id: I4e058eb52d277e6d3817df8b703c5e5b894c0b5a Reviewed-on: https://gerrit.openafs.org/15923 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Benjamin Kaduk <kaduk@mit.edu>	2024-11-12 12:35:38 -05:00
Andrew Deason	7d0675e6c6	OPENAFS-SA-2024-003: xdr: Ensure correct string length in xdr_string CVE-2024-10397 Currently, if a caller calls an RPC with a string output argument, like so: { char str = NULL; code = RXAFS_SomeCall(&str); / do something with 'str' / xdr_free((xdrproc_t) xdr_string, &str); } Normally, xdr_free causes xdr_string to call osi_free, specifying the same size that we allocated for the string. However, since we only have a char, the amount of space allocated for the string is not recorded separately, and so xdr_string calculates the size of the buffer to free by using strlen(). This works for well-formed strings, but if we fail to decode the payload of the string, or if our peer gave us a string with a NUL byte in the middle of it, then strlen() may be significantly less than the actual allocated size. And so in this case, the size given to osi_free will be wrong. The size given to osi_free is ignored in userspace, and for KERNEL on many platforms like Linux and DARWIN. However, it is notably not ignored for KERNEL on Solaris and some other less supported platforms (HPUX, Irix, NetBSD). At least on Solaris, an incorrect size given to osi_free can cause a system panic or possibly memory corruption. To avoid this, change xdr_string during XDR_DECODE to make sure that strlen() of the string always reflects the allocated size. If we fail to decode the string's payload, replace the payload with non-NUL bytes (fill it with 'z', an arbitrary choice). And if we do successfully decode the payload, check if the strlen() is wrong (that is, if the payload contains NUL '\0' bytes), and fail if so, also filling the payload with 'z'. This is only strictly needed in KERNEL on certain platforms, but do it everywhere so our behavior is consistent. FIXES 135043 Change-Id: I90c419a7ef0ede247187172a182863dcb4250578 Reviewed-on: https://gerrit.openafs.org/15922 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Benjamin Kaduk <kaduk@mit.edu>	2024-11-12 12:35:34 -05:00
Andrew Deason	c732715e4e	OPENAFS-SA-2024-003: Check sanity on lengths of RPC returned arrays CVE-2024-10397 Various RPCs return a variable-length array in an OUT argument, but are only supposed to return specific sizes. A few instances of this include the following (but this is not an exhaustive list): - AFSVolListOneVolume should only return a single volintInfo. - PR_NameToID should return the same number of IDs as names given. - VL_GetAddrsU should return the same number of addresses as the 'nentries' OUT argument. Some callers of these RPCs just assume that the server has not violated these rules. If the server responds with a nonsensical array size, this could cause us to read beyond the end of the array, or cause a NULL dereference or other errors. For example, some callers of VL_GetAddrsU will iterate over 'nentries' addresses, even if the 'blkaddrs' OUT argument contains fewer entries. Or with AFSVolListOneVolume, some callers assume that at least 1 volintInfo has been returned; if 0 have been returned, we can try to access a NULL array. To avoid all of this, add various sanity checks on the relevant returned lengths of these RPCs. For most cases, if the lengths are not sane, return an internal error from the appropriate subsystem (or RXGEN_CC_UNMARSHAL if there isn't one). For VL_GetAddrsU, if 'nentries' is too long, just set it to the length of the returned array. FIXES 135043 Change-Id: Ibdc7837ab09b4765436fc4c0d780e695bba07128 Reviewed-on: https://gerrit.openafs.org/15921 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Benjamin Kaduk <kaduk@mit.edu>	2024-11-12 12:35:28 -05:00
Andrew Deason	13413eceed	OPENAFS-SA-2024-003: xdr: Prevent XDR_DECODE buffer overruns CVE-2024-10397 When making an RPC call from a client, output arguments that use arrays (or array-like objects like strings and opaques) can be allocated by XDR, like so: { struct idlist ids; ids.idlist_val = NULL; ids.idlist_len = 0; code = PR_NameToID(rxconn, names, &ids); /* data inside ids.idlist_val[...] / xdr_free((xdrproc_t) xdr_idlist, &ids); } With this approach, during XDR_DECODE, xdr_array() reads in the number of array elements from the peer, then allocates enough memory to hold that many elements, and then reads in the array elements. Alternatively, the caller can provide preallocated memory, like so: { struct idlist ids; afs_int32 ids_buf[30]; ids.idlist_val = ids_buf; ids.idlist_len = 30; code = PR_NameToID(rxconn, names, &ids); / data inside ids.idlist_val[...] / } With this approach, during XDR_DECODE, xdr_array() reads in the number of array elements from the peer, and then reads in the array elements into the supplied buffer. However, in this case, xdr_array() never checks that the number of array elements will actually fit into the supplied buffer; the _len field provided by the caller is just ignored. In this example, if the ptserver responds with 50 elements for the 'ids' output argument, xdr_array() will write 50 afs_int32's into 'ids.idlist_val', going beyond the end of the 30 elements that are actually allocated. It's also possible, and in fact very easy, to use xdr-allocated buffers and then reuse them as a preallocated buffer, possibly accidentally. For example: { struct idlist ids; ids.idlist_val = NULL; ids.idlist_len = 0; while (some_condition) { code = PR_NameToID(rxconn, names, &ids); } } In this case, the first call to PR_NameToID can cause the buffer for 'ids' to be allocated by XDR, which will then be reused by the subsequent calls to PR_NameToId. Note that this can happen even if the first PR_NameToID call fails; the call can be aborted after the output array is allocated. Retrying an RPC in this way is effectively what all ubik_Call codepaths do (including all ubik_* wrappers, e.g. ubik_PR_NameToID). Or some callers retry effectively the same RPC when falling back to earlier versions (e.g. VL_ListAttributesN2 -> VL_ListAttributesN). To prevent this for arrays and opaques, change xdr_array (and xdr_bytes) to check if the _len field for preallocated buffers is large enough, and return failure if it's not. Also perform the same check for the ka_CBS and ka_BBS structures. These are mostly the same as opaques, but they have custom serialization functions in src/kauth/kaaux.c. ka_BBS also has two lengths: the actual length of bytes, and a 'max' length. ka_CBS isn't used for any RPC output arguments, but fix it for consistency. For strings, the situation is complicated by the fact that callers cannot pass in how much space was allocated for the string, since callers only provide a char**. So for strings, just refuse to use a preallocated buffer at all, and return failure if one is provided. Note that for some callers using preallocated arrays or strings, the described buffer overruns are not possible, since the preallocated buffers are larger than the max length specified in the relevant RPC-L. For example, afs_DoBulkStat() allocates AFSCBMAX entries for the output args for RXAFS_InlineBulkStatus, which is the max length specified in the RPC-L, so a buffer overrun is impossible. But since it is so easy to allow a buffer overrun, enforce the length checks for everyone. FIXES 135043 Change-Id: Iaf913e2156af2081c72125ec066d9636086c7d14 Reviewed-on: https://gerrit.openafs.org/15920 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Benjamin Kaduk <kaduk@mit.edu>	2024-11-12 12:35:23 -05:00
Andrew Deason	b2b1110ddd	OPENAFS-SA-2024-003: xdr: Set _len for prealloc'd opaque/array OUT args CVE-2024-10397 Currently, a few RPCs with arrays or opaque OUT arguments are called with preallocated memory for the arg, but also provide a _len of 0 (or an uninitialized _len). This makes it impossible for the xdr routine to tell whether we have allocated enough space to actually hold the response from the server. To help this situation, either specify an appropriate _len for the preallocated value (cm_IoctlGetACL, fsprobe_LWP), or don't provide a preallocated buffer at all and let xdr allocate a buffer for us (PGetAcl). Note that this commit doesn't change xdr to actually check the value of the given _len; but now a future commit can do so without breaking callers. FIXES 135043 Change-Id: Ieb50aaa5ae9a1bde027999ce1c668e0c99b4d82b Reviewed-on: https://gerrit.openafs.org/15919 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Benjamin Kaduk <kaduk@mit.edu>	2024-11-12 12:35:18 -05:00
Andrew Deason	00a1b266af	OPENAFS-SA-2024-003: xdr: Avoid prealloc'd string OUT args CVE-2024-10397 Currently, several callers call RPCs with string OUT arguments, and provide preallocated memory for those arguments. This can easily allow a response from the server to overrun the allocated buffer, stomping over stack or heap memory. We could simply make our preallocated buffers larger than the maximum size that the RPC allows, but relying on that is error prone, and there's no way for XDR to check if a string buffer is large enough. Instead, to make sure we don't overrun a given preallocated buffer, avoid giving a preallocated buffer to such RPCs, and let XDR allocate the memory for us. Specifically, this commit changes several callers to RXAFS_GetVolumeStatus(), and one caller of BOZO_GetInstanceParm(), to avoid passing in a preallocated string buffer. All other callers of RPCs with string OUT args already let XDR allocate the buffers for them. FIXES 135043 Change-Id: If42e2cc983903cff9766e1bab487142d4d493a17 Reviewed-on: https://gerrit.openafs.org/15918 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Benjamin Kaduk <kaduk@mit.edu>	2024-11-12 12:35:13 -05:00
Andrew Deason	ac602a0a56	OPENAFS-SA-2024-002: Avoid uninitialized memory when parsing ACLs CVE-2024-10396 Several places in the tree parse ACLs using sscanf() calls that look similar to this: sscanf(str, "%d dfs:%d %s", &nplus, &dfs, cell); sscanf(str, "%100s %d", tname, &trights); Some callers check whether the scanf() returns negative or 0, but some callers do not check the return code at all. If only some of the fields are present in the sscanf()'d string (because, for instance, the ACL is malformed), some of the arguments are left alone, and may be set to garbage if the relevant variable was never initialized. If the parsed ACL is copied to another ACL, this can result in the copied ACL containing uninitialized memory. To avoid this, make sure all of the variables passed to sscanf() and similar calls are initialized before parsing. This commit does not guarantee that the results make sense, but at least the results do not contain uninitialized memory. Change-Id: Ib0becffbce5a6e15f91ac4390bb9c422d478bea6 Reviewed-on: https://gerrit.openafs.org/15917 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Benjamin Kaduk <kaduk@mit.edu>	2024-11-12 12:35:08 -05:00
Benjamin Kaduk	a4ecb05054	OPENAFS-SA-2024-002: make VIOCGETAL consumers stay within string bounds CVE-2024-10396 After the preceding commits, the data returned by the VIOCGETAL pioctl (a RXAFS_FetchAcl wrapper) will safely be NUL-terminated. However, the callers that attempt to parse the ACL string make assumptions that the returned data will be properly formatted, and implement a "skip to next line" functionality (under various names) that blindly increments a char* until it finds a newline character, which can read past the end of even a properly NUL-terminated string if there is not a newline where one is expected. Adjust the various "skip to next line" functionality to keep the current string pointer at the trailing NUL if the end of the string is reached while searching for a newline. Change-Id: I7fb7f23d7d6f68608f3e656a1530a7fc40b4a567 Reviewed-on: https://gerrit.openafs.org/15916 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Benjamin Kaduk <kaduk@mit.edu>	2024-11-12 12:35:03 -05:00
Benjamin Kaduk	7e13414e8e	OPENAFS-SA-2024-002: verify FetchACL returned only a string CVE-2024-10396 Supplement the previous commit by additionally verifying that the returned ACL string occupies the entire XDR opaque, rejecting any values returned that have an internal NUL prior to the end of the opaque. Change-Id: Iefa3d00a9a0e25ef66b7166fe952aae0603ee3d7 Reviewed-on: https://gerrit.openafs.org/15915 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Benjamin Kaduk <kaduk@mit.edu>	2024-11-12 12:34:59 -05:00
Benjamin Kaduk	0b1ccb0dbc	OPENAFS-SA-2024-002: verify FetchACL returned a valid string CVE-2024-10396 Analogously to how a call to RXAFS_StoreACL() with a malformed ACL string can cause a fileserver to perform invalid memory operations, a malformed ACL string returned in response to a call to RXAFS_FetchACL() can cause a client to perform invalid memory operations. Modify all the in-tree callers of the RPC to verify that the ACL data, which is conveyed as an XDR 'opaque' but whose contents are actually expected to be a string, is a valid C string. If a zero-length opaque or one without a trailing NUL is received, treat that as an error response from the fileserver rather than returning success. The Unix cache manager's pioctl handler already has logic to cope with a zero-length reply by emitting a single NUL byte to userspace. This special-casing seems to have been in place from the original IBM import, though it does so by confusingly "skipping over" a NUL byte already put in place. For historical compatibility, preserve that behavior rather than treating the zero-length reply as an error as we do for the other callers. It seems likely that this location should treat a zero-length reply as an error just as the other call sites do, but that can be done as a later change. Change-Id: Ibf685e54e7e3fca6a4caac63c961cfcfb2f4732a Reviewed-on: https://gerrit.openafs.org/15914 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Benjamin Kaduk <kaduk@mit.edu>	2024-11-12 12:34:55 -05:00
Andrew Deason	c9eae1e8b2	OPENAFS-SA-2024-002: viced: Avoid unchecked ACL in StoreACL audit log CVE-2024-10396 Currently in SRXAFS_StoreACL, if CallPreamble() or check_acl() fail, we will jump to Bad_StoreACL, which will pass the ACL string from the client to osi_auditU. Since check_acl() hasn't yet checked if the given ACL contains a NUL byte, the ACL may be an unterminated string. If auditing is enabled, this can cause garbage to be logged to the audit log, or cause the fileserver to crash. To avoid this, set 'rawACL' to NULL at first, only setting it to the actual ACL string after check_acl() has succeeded. This ensures that all code accessing 'rawACL' is guaranteed to be using a terminated string. This may mean that we pass a NULL AUD_ACL to osi_auditU. Our auditing code explicitly checks for and handles handles NULL strings, so this is fine. FIXES 135445 Change-Id: Iecde5677805a28d55c833b135732a14fd86cc985 Reviewed-on: https://gerrit.openafs.org/15913 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Benjamin Kaduk <kaduk@mit.edu>	2024-11-12 12:34:49 -05:00
Andrew Deason	eb8b93a971	OPENAFS-SA-2024-002: viced: Introduce 'rawACL' in StoreACL CVE-2024-10396 Change our StoreACL implementation to refer to the 'AccessList' argument via a new local variable called 'rawACL'. This makes it clearer to users that the data is a string, and makes it easier for future commits to make sure we don't access the 'AccessList' argument in certain situations. Update almost all users in StoreACL to refer to 'rawACL' instead of 'AccessList'. Change the name of 'AccessList' to 'uncheckedACL' to make sure we don't miss any users. Update our check_acl() call to use 'uncheckedACL' (and not 'rawACL'), because it must use an AFSOpaque to check the ACL. Change RXStore_AccessList() and printableACL() to accept a plain char* instead of a struct AFSOpaque. This commit should not incur any noticeable behavior change. Technically printableACL() is changed to run strlen() on the given string, but this should not cause any noticeable change in behavior: This change could cause printableACL() to process less of the string than before, if the string contains a NUL byte before the end of the AFSOpaque buffer. But this doesn't matter, since the all of our code after this treats the ACL as a plain string, and so doesn't look at any data beyond the first NUL. It's not possible for printableACL() to process more data than before, because check_acl() has already checked that the ACL string contains a NUL byte, so we must process AFSOpaque_len bytes or fewer. FIXES 135445 Change-Id: I26229a39528fec788d96c77012c042c278a214c9 Reviewed-on: https://gerrit.openafs.org/15912 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Benjamin Kaduk <kaduk@mit.edu>	2024-11-12 12:34:44 -05:00
Andrew Deason	96ab2c6f8a	OPENAFS-SA-2024-002: acl: Error on missing newlines when parsing ACL CVE-2024-10396 In acl_Internalize_pr(), each line in an ACL granting rights (positive or negative) is sscanf()'d with "%63s\t%d\n", and then we try to advance 'nextc' beyond the next newline character. However, sscanf()'ing "%63s\t%d\n" does not guarantee that there is a newline in the given string. Whitespace characters in sscanf() are not matched exactly, and may match any amount of whitespace (including none at all). For example, a string like "foo 4" may be parsed by sscanf(), but does not contain any newlines. If this happens, strchr(nextc, '\n') will return NULL, and we'll advance 'nextc' to 0x1, causing a segfault when we next try to dereference 'nextc'. To avoid this, check if 'nextc' is NULL after the strchr() call, and return an error if so. FIXES 135445 Change-Id: I6bcbbaf88a16202fb84c0932578dd8d5712726dd Reviewed-on: https://gerrit.openafs.org/15911 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Benjamin Kaduk <kaduk@mit.edu>	2024-11-12 12:34:39 -05:00
Andrew Deason	35d218c1d1	OPENAFS-SA-2024-002: acl: Do not parse beyond end of ACL CVE-2024-10396 The early parsing code in acl_Internalize_pr() tries to advance 'nextc' to go beyond the first two newlines in the given ACL string. But if the given ACL string has no newlines, or only 1 newline, then 'nextc' will point beyond the end of the ACL string, potentially pointing to garbage. Intuitively, it may look like the ACL string must contain at least 2 newlines because we have sscanf()'d the string with "%d\n%\d". However, whitespace characters in sscanf() are not matched exactly like non-whitespace characters are; a sequence of whitespace characters matches any amount of whitespace (including none). So, a string like "1 2" will be parsed by "%d\n%d\n", but will not contain any newline characters. Usually this should result in a parse error from acl_Internalize_pr(), but if the garbage happens to parse successfully, this could result in unrelated memory getting stored to the ACL. To fix this, don't advance 'nextc' if we're already at the end of the ACL string. FIXES 135445 Change-Id: Ie009b59bec9a75afc81fee201c2fca6955f484e4 Reviewed-on: https://gerrit.openafs.org/15910 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Benjamin Kaduk <kaduk@mit.edu>	2024-11-12 12:34:34 -05:00
Andrew Deason	0358648dbe	OPENAFS-SA-2024-001: afs: Throttle PAG creation in afs_genpag() CVE-2024-10394 Currently, we only throttle PAG creation in afs_setpag(). But there are several callers that call setpag() directly, not via afs_setpag; notably _settok_setParentPag in afs_pioctl.c. When setpag() is called with a PAG value of -1, it generates a new PAG internally without any throttling. So, those callers effectively bypass the PAG throttling mechanism, which allows a calling user to create PAGs without any delay. To avoid this, move our afs_pag_wait call from afs_setpag() to afs_genpag(), which all code uses to generate a new PAG value. This ensures that PAG creation is always throttled for unprivileged users. FIXES 135062 Change-Id: Ic4cb352edaa693984995fbdb6dc35b89686e8470 Reviewed-on: https://gerrit.openafs.org/15907 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Benjamin Kaduk <kaduk@mit.edu>	2024-11-12 12:34:09 -05:00
Andrew Deason	f701f704c7	OPENAFS-SA-2024-001: afs: Introduce afs_genpag() CVE-2024-10394 Currently, several areas in the code call genpag() to generate a new PAG id, but the signature of genpag() is very limited. To allow for the code in genpag() to return errors and to examine the calling user's credentials, introduce a new function, afs_genpag(), that does the same thing as genpag(), but accepts creds and allows errors to be returned. Convert all existing callers to use afs_genpag() and to handle any errors, though no errors are ever returned in this commit on its own. To ensure there are no old callers of genpag() left around, change the existing genpag() to be called genpagval(), and declare it static. FIXES 135062 Change-Id: I5c96c0134db901f21ca30c8b3f57aeec1eb67aa5 Reviewed-on: https://gerrit.openafs.org/14090 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Benjamin Kaduk <kaduk@mit.edu>	2024-11-12 12:33:38 -05:00
Andrew Deason	f4dfc2d718	OPENAFS-SA-2024-002: viced: Free ACL on acl_Internalize_pr error CVE-2024-10396 Currently, we don't free 'newACL' if acl_Internalize_pr() fails. If acl_Internalize_pr() has already allocated 'newACL', then the memory associated with newACL will be leaked. This can happen if parsing the given ACL fails at any point after successfully parsing the first couple of lines in the ACL. Change acl_FreeACL() to make freeing a NULL acl a no-op, to make it easier to make sure the acl has been freed. FIXES 135445 Change-Id: I87745fa9b6285574acdd5ecb613e80fa1ea37ae8 Reviewed-on: https://gerrit.openafs.org/15909 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Benjamin Kaduk <kaduk@mit.edu>	2024-11-12 12:32:27 -05:00
Andrew Deason	e15decb318	OPENAFS-SA-2024-002: viced: Refuse ACLs without '\0' in SRXAFS_StoreACL CVE-2024-10396 Currently, the fileserver treats the ACL given in RXAFS_StoreACL as a string, even though it is technically an AFSOpaque and could be not NUL-terminated. We give the ACL opaque/string to acl_Internalize_pr() to parse, which will run off the end of the allocated buffer if the given ACL does not contain a '\0' character. Usually this will result in a parse error since we'll encounter garbage, but if the partially-garbage ACL happens to parse successfully, some uninitialized data could make it into the stored ACL. In addition, if the given ACL is an opaque of length 0, we'll still give the opaque pointer to acl_Internalize_pr(). In this case, the pointer will point to &memZero, which happens to contain a NUL byte, and so is treated like an empty string (which is not a valid ACL). But the fact that this causes no problems is somewhat a coincidence, and so should also be avoided. To avoid both of these situations, just check if the given ACL string contains a NUL byte. If it doesn't, or if it has length 0, refuse to look at it and abort the call with EINVAL. FIXES 135445 Change-Id: If55f72d6556bc7b1704a3848865bfb902ee9f92a Reviewed-on: https://gerrit.openafs.org/15908 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Benjamin Kaduk <kaduk@mit.edu>	2024-11-12 12:32:24 -05:00
Mark Vitale	21491d9a82	Restore max() macro for Heimdal Commit `de46edef69` (Remove lowercase min/max macro definitions) inadvertently removed a definition of max() that is required by _krb5_n_fold() (src/external/heimdal/krb5/n-fold.c) if no system definition of max() is available. For example, on Solaris this results in a kernel module with an undefined external symbol 'max', which prevents the kernel module from loading: genunix: [ID 819705 kern.notice] /kernel/drv/amd64/afs: undefined symbol genunix: [ID 826211 kern.notice] 'max' genunix: [ID 472681 kern.notice] WARNING: mod_load: cannot load module 'afs' Restore the required max() macro definition. Change-Id: I6bca2bb2b90d7cdbe90a3b769997cdc153f59f6c Reviewed-on: https://gerrit.openafs.org/15874 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Andrew Deason <adeason@sinenomine.net>	2024-11-07 16:35:00 -05:00
Andrew Deason	d93871b054	libtool: Make libfoo.krb depend on libfoo During parallel builds, we can run the final 'libtool --mode=link' step for, say, pam_afs.la and pam_afs.krb.la in parallel. When libtool runs --mode=link for these libraries, it effectively does the following (amongst many other steps): - rm -rf .libs/"$libname".* - mkdir .libs/"$libname".lax - mkdir .libs/"$libname".lax/libopr_pic.a The 'rm -rf' cleans up any lingering files from previous runs in the .libs dir (e.g. .libs/pam_afs.a, .libs/pam_afs.lai, etc etc). The '.lax' directory is used as a temporary space to extract object files in dependant libs. And of course it creates various other files in there named like .libs/pam_afs.lai. But if we're building a library called pam_afs.krb.la, then some of our temporary files will be named like .libs/pam_afs.krb.lax, which get deleted when building pam_afs.la when it deletes ".libs/pam_afs.". So, if we build pam_afs.la while the libtool command for pam_afs.krb.la is using .libs/pam_afs.krb.lax, the temporary files can get deleted out from under us, causing rare confusing libtool errors. For example: /bin/sh ../../libtool --quiet --mode=link --tag=CC gcc -rpath /usr/local/lib \ -module -no-undefined -o pam_afs.la [...] /bin/sh ../../libtool --quiet --mode=link --tag=CC gcc -rpath /usr/local/lib \ -module -no-undefined -o pam_afs.krb.la [...] ../../libtool: line 1719: cd: .libs/pam_afs.krb.lax/libopr_pic.a: No such file or directory make[3]: [Makefile:76: pam_afs.krb.la] Error 1 make[3]: * Waiting for unfinished jobs.... Fortunately, most of the libraries we build don't have dots in their name, and so avoid this issue. But we do have a few libraries that build a ".krb" variant, and so can be impacted by this. This includes pam_afs.krb, libauth.krb, and libkauth.krb. To make sure these libraries don't encounter this problem, make the .krb variant depend on the non-.krb variant, so the libraries aren't built in parallel. Change-Id: Icb96a721b481bc3d99c9e24cf81fcfbbf7d498f6 Reviewed-on: https://gerrit.openafs.org/15328 Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>	2024-11-07 15:49:40 -05:00
Mark Vitale	af1a384072	bozo: Remove unused conditional OPBOS from 'bos' Since the original IBM code import, the 'bos' utility source code has contained an unused pre-processor conditional 'OPBOS'. If OPBOS is defined, it builds a bos binary that implements only the following bos commands: status, start, stop, restart, startup, shutdown. There is no configure option or any other logic that defines 'OPBOS', nor is it documented. Remove the unused conditional. Change-Id: I4d32084e6a6b4ca8cadb0aa0ef06c74fb6edb770 Reviewed-on: https://gerrit.openafs.org/15875 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Andrew Deason <adeason@sinenomine.net>	2024-11-05 11:12:48 -05:00
Mark Vitale	3db9ecd852	SOLARIS: Support 64-bit SPARC userspace builds Modify OpenAFS to respect user-specified ARCHFLAGS (-m32 or -m64) when building on Solaris 10 and 11 SPARC platforms. If ARCHFLAGS is unset/unspecified, we will continue to build 32-bit userspace by default. Implement -m64 SPARC LWP via USE_UCONTEXT; leave the -m32 SPARC LWP (default) implementation unchanged. Kernelspace builds remain 64-bit, regardless of ARCHFLAGS value. Change-Id: Ibb8e145d0cb8dfbe05de896b84c1d1135664b845 Reviewed-on: https://gerrit.openafs.org/14897 Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com>	2024-11-04 18:19:16 -05:00
Marcio Barbosa	a06cafd6d6	afs: Make FlushReclaimedVcaches() Darwin specific Currently, the ReclaimedVCList is only fed on Darwin. As a result, the logic present in afs_FlushReclaimedVcaches() is never used on all other platforms and should be wrapped in #ifdef AFS_DARWIN_ENV. While here, also reorganize this function in an effort to make it more readable. No functional change is incurred by this commit. Change-Id: I1097a2b07fcb48dec2b31385477c34e974144b51 Reviewed-on: https://gerrit.openafs.org/15027 Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Andrew Deason <adeason@sinenomine.net>	2024-11-04 13:39:55 -05:00
Marcio Barbosa	f98e84458c	afs: Convert afs_vhashT to use struct afs_q Because the afs_vhashT hash table uses singly-linked lists in its hash buckets, to remove an entry we have to: 1. Find the bucket where the entry to be removed should be present. 2. Go through each element of this bucket until we find this entry. 3. Remove it. Due to step number 2, the time required to remove an entry from a bucket increases as the linked list associated with this bucket gets longer. This can get particularly bad when evicting vcaches (ShakeLooseVCaches), as each entry from the least recently used queue (VLRU) has to be found in our afs_vhashT hash table. This problem is exacerbated when files are repeatedly deleted and recreated in the same volume, resulting in many vcaches with the same volume id and vnode number (but different unique ids) in the same bucket. To avoid this problem, build afs_vhashT using doubly-linked lists with 'struct afs_q', like we did for afs_vhashTV in commit `4fc48af8` (vc-hashing-be-less-expensive-20050728). Now, removing a vcache from afs_vhashT is an O(1) operation (QRemove). Change-Id: I5a1a9a090f9aa3d8884e2bb12aca1f1acc3b902d Reviewed-on: https://gerrit.openafs.org/14949 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Marcio Brito Barbosa <mbarbosa@sinenomine.net> Reviewed-by: Andrew Deason <adeason@sinenomine.net>	2024-11-01 12:44:45 -04:00
Cheyenne Wills	e1bb50849e	WINNT: Use safer string functions in DumpAfsLog Replace strcpy, strcat, and sprintf with safer string functions in the tests/torture/DumpAfsLog program. Add checks for string truncation where needed. Change-Id: I4eac12e73bbeaeadc0a8a6b5ebded664ca91b7f8 Reviewed-on: https://gerrit.openafs.org/15489 Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Gaurav Saxena <gsaxena@sinenomine.net> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Tested-by: Michael Meffie <mmeffie@sinenomine.net>	2024-11-01 11:20:22 -04:00
Michael Meffie	1fdc1812ce	WINNT: Remove commented out code in DumpAfsLog Remove the commented out code in the DumpAfsLog test program. Change-Id: Ie7b380ec37d18cfc7b1be8ce9904b55ad101b263 Reviewed-on: https://gerrit.openafs.org/15871 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>	2024-11-01 11:19:12 -04:00
Cheyenne Wills	cacc1bb9d4	afsd: Fix problems found by static analysis Several static analysis tools have identified various problems: - files left open from early returns/exits (infer) - missing checks to ensure alloc was successful (infer) - memory leaks due to not freeing storage before a return (infer) - unused variables (cppcheck) To resolve the above problems: - close files before returning/exiting - add checks to ensure alloc was successful before using the memory - free storage before returning - remove unused variables This commit is a reorganization of commits submitted by Pat Riehecky, who ran the static analysis tools and developed the fixes. Change-Id: If45d64c370ad68d1e844b6aa5881f9dbd75300c9 Reviewed-on: https://gerrit.openafs.org/14671 Reviewed-by: Andrew Deason <adeason@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Marcio Brito Barbosa <mbarbosa@sinenomine.net>	2024-11-01 10:54:53 -04:00
Cheyenne Wills	278fc9d1c3	Linux: style cleanup in osi_groups.c There are several large #if/#endif blocks that do not have a comment on the #endif, and some of the nested #if/#else/#endif blocks are not indented. In addition, there are violations of the CODING style such as indentation errors, trailing whitespace, braces for functions that are not on their own line, etc. Add comments to the #endif statements for the larger blocks. Use indent with parameters from CODING to fix the remaining style issues, and fix up the preprocessor indentation. There are no functional changes Change-Id: I89a13a1f43aa08728777a71a41984613ac163b95 Reviewed-on: https://gerrit.openafs.org/15669 Reviewed-by: Andrew Deason <adeason@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Marcio Brito Barbosa <mbarbosa@sinenomine.net>	2024-11-01 10:52:51 -04:00
Michael Meffie	7ad7a2bebf	util: Restore min macro in ktime.h Commit `de46edef69` (Remove lowercase min/max macro definitions) removed the min() macro definition from the util/ktime.h header. Unfortunately, that min() definition is still required for some of the Windows source files. Later, the windows sources can be converted to use the new in-tree opr_min and opr_max macros, but for now reinstate the min() macro definition in ktime.h. Fixes the build errors on Windows: msrpc.obj : error LNK2001: unresolved external symbol _min RDRFunction.obj : error LNK2001: unresolved external symbol _min cm_server.obj : error LNK2019: unresolved external symbol _min referenced in function _cm_SetServerIPRank cm_dcache.obj : error LNK2001: unresolved external symbol _min cm_vnodeops.obj : error LNK2001: unresolved external symbol _min smb3.obj : error LNK2001: unresolved external symbol _min Change-Id: I043e23e72977ef06330b53f05044cd3be93cc00e Reviewed-on: https://gerrit.openafs.org/15870 Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Tested-by: Michael Meffie <mmeffie@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com>	2024-10-19 10:44:19 -04:00
Michael Meffie	67a5d474fa	auth: Add missing opr.h include in ktc_nt.c Commit `8e1e242ce4` (Replace lowercase min/max macros with opr_min/max) converted the min() invocation in auth/ktc_nt.c with our in-tree opr_min() macro, but missed adding the opr.h header to that source file. Add the opr.h include to fix the build error on windows: afsauth.lib(ktc_nt.obj) : error LNK2019: unresolved external symbol _opr_min referenced in function _GetLocalToken Change-Id: I96dbe377a8a85921fa1833cd27638cd853e26b9c Reviewed-on: https://gerrit.openafs.org/15869 Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Tested-by: Michael Meffie <mmeffie@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com>	2024-10-19 10:44:11 -04:00
Cheyenne Wills	df010f3e7c	afsd: honor the -nbiods option The commit: "afsd: Move to newer cmd syntax" (`fbe6f0f4ad`) removed setting the flag variable "sawBiods" when the -nbiods option is used, this results in the -nbiods value being ignored and the default value being used. This problem was identified by the cppcheck static analysis tool. Note: The -nbiods option is only for AIX systems. Change-Id: I9a8d199a212a164660c1e9f8b4151792d9c9d0e0 Reviewed-on: https://gerrit.openafs.org/15212 Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Marcio Brito Barbosa <mbarbosa@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>	2024-10-19 06:53:22 -04:00
Mark Vitale	094066b485	pt_util: Dynamically allocate page buffer Since introduced in commit `53986479ee` (ptutil-initial-20001219), pt_util has used a statically allocated buffer to read pages from the database. In several places the buffer is cast to a struct ubik_header. Memory allocated dynamically via malloc()/calloc() is guaranteed to be properly aligned for objects of any type at run time, however buffers that are not allocated dynamically have no such guarantee. So, instead of using a statically allocated buffer, dynamically allocate it with calloc() to ensure it is aligned for struct ubik_header objects. This fixes a SIGBUS error whenever the buffer happens to be built with a non-word alignment. This failure was observed while running the tests/ptserver/pt_util-t unit test on Solaris SPARC 64-bit. Note: By inspection, the same potential problem exists in src/kauth/ka_util.c. However, kaserver is deprecated and ka_util has never been built by default. Change-Id: I00372ed6adede32448b5e0df1c1d74689286a218 Reviewed-on: https://gerrit.openafs.org/15352 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>	2024-10-19 06:51:32 -04:00
Cheyenne Wills	03f9b08e90	rxkad: rename ENCRYPT macro to FCRYPT_ENCRYPT rxkad/fcrypt defines 2 macros: ENCRYPT and DECRYPT. The names are pretty generic and do not clearly associate with fcrypt. In addition, the name ENCRYPT collide with the Microsoft provided header 'ntddndis.h' that uses ENCRYPT as an ENUM (on newer versions of the Windows development kits): typedef enum _OFFLOAD_OPERATION_E { AUTHENTICATE = 1, ENCRYPT } This results in a build error on Windows when fcrypt.h is included with ntddndis.h: ... ntddndis.h(2212): error C2059: syntax error: 'constant' Rename the ENCRYPT/DECRYPT macros to FCRYPT_ENCRYPT/FCRYPT_DECRYPT in fcrypt.h to a name that relates the macros back to fcrypt. Note: The ENCRYPT/DECRYPT symbols are part of a public interface installed in fcrypt.h, but keeping the old names are impractical, so we are changing them anyway. Change-Id: I9c51c81fab7fcbf0dae01569852ca94c0e6a0439 Reviewed-on: https://gerrit.openafs.org/15868 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2024-10-18 15:18:23 -04:00
Cheyenne Wills	5b2b27e321	opr: Check for failed memory allocation Several static analysis tools have identified various problems: - missing checks to ensure alloc was successful (infer) To resolve the above problems: - add checks to ensure alloc was successful before using the memory This commit is a reorganization of commits developed by Pat Riehecky, who ran the static analysis tools and developed the fixes. Change-Id: I9396b8327f8283c60821050c75de1c36e0ebd12e Reviewed-on: https://gerrit.openafs.org/14684 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>	2024-10-18 13:51:45 -04:00
Mark Vitale	c20dceb392	volser: Always fetch locked entry in CheckVolume CheckVolume accepts an optional nvldbentry as an argument. When this argument is NULL, CheckVolume correctly locks the nvldbentry before obtaining the entry and making changes. However, if an nvldbentry is passed, CheckVolume merely copies it to a work entry; it also neglects to obtain the current state of the entry after locking it for modification. Change CheckVolume to obtain a locked copy of the nvldbentry if modifications are required. Change-Id: Ie98bc57af1cc2955d10982f44abd4c55755c6244 Reviewed-on: https://gerrit.openafs.org/14356 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Andrew Deason <adeason@sinenomine.net>	2024-10-18 11:29:18 -04:00
Mark Vitale	663be53c18	volser: Split CheckVolume into RW/RO/BK functions In preparation for a future commit, refactor some large chunks of logic out of CheckVolume and into new static routines to handle the specific cases for RW, RO, and BK volumes. Also provide Doxygen documentation for CheckVolume. No functional change should be incurred by this commit. Change-Id: Iac6a773d29ed530e9676f02726db031b1f2a2526 Reviewed-on: https://gerrit.openafs.org/14355 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Andrew Deason <adeason@sinenomine.net>	2024-10-18 11:28:37 -04:00
Mark Vitale	c14536545d	volser: Lock entry properly in UV_RestoreVolume2 Since the original IBM code import, UV_RestoreVolume (later renamed to to the current name UV_RestoreVolume2) has called VLDB_GetEntryByID to obtain the VLDB entry of the target/destination volume. This entry is then modified to match the restore changes, locked via VL_SetLock, and replaced via VL_ReplaceEntryN. Unfortunately, this allows for a race with other volume operations against the same volume, if another operation alters the entry after we call VLDB_GetEntryByID but before ubik_VL_SetLock. To avoid this, call GetLockedEntry to ensure that the entry is locked while we are reading and modifying it. Change-Id: I772ef80158205f45c04f8ef0aa95726f1f2c4ac4 Reviewed-on: https://gerrit.openafs.org/14354 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Andrew Deason <adeason@sinenomine.net>	2024-10-18 11:27:51 -04:00
Mark Vitale	1073e94779	volser: remove check for VL_ENTDELETED Since the original IBM code import, UV_RestoreVolume2 (originally UV_RestoreVolume) has checked for VL_ENTDELETED from VLDB_GetEntryByID. However, if VL_ENTDELETED is received, we will act like we retrieved the vlentry successfully. If the -verbose flag is given, we'll try to print the entry (even though one was not returned), and we'll call VL_SetLock for the volume, which will almost certainly also fail with VL_ENTDELETED and bail out. Furthermore (also since the original IBM import), there is no longer any code path to set the volume entry flag VLDELETED. So it should be rare for any VL_* RPC to return VL_ENTDELETED; that would require an entry to have the VLDELETED flag carried over from a very old database. Therefore, it is safe and reasonable to simply remove the check for VL_ENTDELETED from UV_RestoreVolume2. If we do somehow receive the VL_ENTDELETED error code, vos will terminate with an error instead of possibly printing an uninitialized entry. In addition, remove the same check for VL_ENTDELETED in the equivalent code path in libadmin's UV_RestoreVolume. Change-Id: I11d1c3306f67d68de54780f6aac75e4c27779db4 Reviewed-on: https://gerrit.openafs.org/14357 Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Tested-by: Andrew Deason <adeason@sinenomine.net>	2024-10-18 11:26:58 -04:00
Mark Vitale	f3b5bb9a05	volser: clean up and clarify storeEntry usage In several routines in vsprocs.c, we declare and clear nvldbentry storeEntry long before it is actually needed. Instead, where possible, move the declaration and memset of storeEntry closer to the actual point of use. No functional change is incurred by this commit. Change-Id: Iffbf5ff56fab4d221586dfb93bbbbfde063c856e Reviewed-on: https://gerrit.openafs.org/14353 Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Marcio Brito Barbosa <mbarbosa@sinenomine.net> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Mark Vitale <mvitale@sinenomine.net>	2024-10-18 11:16:43 -04:00
Mark Vitale	ab660e2537	volser: Use GetLockedEntry in UV_ReleaseVolume Change UV_ReleaseVolume() to use the new GetLockedEntry() function. Add logic to GetLockedEntry() to support the special semantics of VL_SetLock for VLOP_RELEASE, to preserve the current behavior of UV_ReleaseVolume() with the VL_RERELEASE error code. The special semantics of the VL_RERELEASE error code with VLOP_RELEASE are a bit odd: this situation can currently only happen with our vlserver if the vlentry has the VLOP_RELEASE flag set but no lock timestamp. The only way that situation can occur with our current 'vos' is during a small window after a partially failed 'vos release'. Around where UV_ReleaseVolume() prints the message "The volume %lu could not be released to the following %d sites", we call VLDB_ReplaceEntry() with LOCKREL_TIMESTAMP, clearing just the lock timestamp but not the lock flags. We then jump to the 'rfail' label where we ubik_VL_ReleaseLock() with all LOCKREL_* flags are set, clearing the lock flags. So, the only time in this codepath where the lock timestamp is cleared but the VLOP_RELEASE flag is set is after the mentioned VLDB_ReplaceEntry() call but before the ubik_VL_ReleaseLock() call. If that last ubik_VL_ReleaseLock() fails, or if vos dies before reaching it, then the vlentry is left in a state that causes the VL_RERELEASE error code in the future. This behavior has existed since at least OpenAFS 1.0. Currently, the handling of this special VL_RERELEASE case is completely silent, so it's not clear that this is happening, if it does manage to happen. These special semantics are not really safe, since two vos processes could encounter a VL_RERELEASE volume at the same time, and go through the 'vos release' process in parallel thinking they have the vlentry locked, causing all kinds of problems. This behavior may be changed in the future, but for now just preserve the existing special handling of VL_RERELEASE. Add a warning message, though, to make this special situation not silent. No functional change (other than error messages) should be incurred by this commit. [adeason@sinenomine.net: Add VL_RERELEASE warning message and additional related comments.] Change-Id: I9da50677233f63235de730b5d234a9b575e196fd Reviewed-on: https://gerrit.openafs.org/14352 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Andrew Deason <adeason@sinenomine.net>	2024-10-18 11:13:40 -04:00
Mark Vitale	e18e94887a	volser: Introduce GetLockedEntry In preparation for a future commit, consolidate common VL_SetLock and VL_GetEntryById logic into a new routine GetLockedEntry. This commit should incur no practical change in behavior, but there is a minor internal behavior change for one case: For the case in UV_ConvertRO(), previously we called VL_SetLock with the RW volume id (with volume type RWVOL), and called VL_GetEntryByID with the RO volume id (with the volume type -1); now we call both with the RW volume id. We get these volume IDs from the same VL entry, and they must refer to the same VL entry in order for this function to work properly, so this difference should not matter. Change-Id: I76faf3d184c397bceff27ba3a857b5c80c88e814 Reviewed-on: https://gerrit.openafs.org/14350 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Andrew Deason <adeason@sinenomine.net>	2024-10-18 11:13:03 -04:00
Mark Vitale	a5fc7b3fc3	rx: Reap client conns in rxi_ReapConnections rxi_DestroyConnectionNoLock contains the following early return test: if ((conn->refCount > 0) \|\| (conn->flags & RX_CONN_BUSY)) { /* Busy; wait till the last guy before proceeding */ MUTEX_EXIT(&rx_refcnt_mutex); MUTEX_EXIT(&conn->conn_data_lock); USERPRI; return; } This prevents the rx_connection ("conn") from being destroyed if it is still in use elsewhere, e.g.: - the rx_Listener is handling packets for one of the conn's rx_calls - another caller has a reference to the conn However, since the original IBM code import, rxi_ReceivePacket obtains a conn reference (via rxi_FindConnection) but returns it by merely decrementing the refCount (currently via putConnection), and does not call rx_DestroyConnection. If the listener's rxi_ReceivePacket is processing a reply or ack for a client conn (holding a reference) while another thread is calling rx_DestroyConnection on the otherwise-last reference, the rx_connection will not be destroyed and can never be destroyed - it has been leaked. Modify rxi_ReapConnections to destroy client conns with refCount == 0. This way, these connections can still "leak", but they're eventually cleaned up the next time rxi_ReapConnections runs. Change-Id: I8c588d43b108a8147a07d0ff0cc69055cd00d355 Reviewed-on: https://gerrit.openafs.org/15135 Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Andrew Deason <adeason@sinenomine.net>	2024-10-10 16:23:30 -04:00
Mark Vitale	4b5c803d8d	rx: Avoid unnecessary locking in rxi_ReapConnections rxi_ReapConnections obtains conn_data_lock and rx_refcnt_mutex for each server connection in the rx_connHashTable. Instead, only obtain these locks for server connections that have no calls. Change-Id: I27d748e604070792bee3c1f31d4d558462fd399a Reviewed-on: https://gerrit.openafs.org/15349 Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Andrew Deason <adeason@sinenomine.net>	2024-10-10 16:21:37 -04:00
Mark Vitale	c5993b0a4f	rx: prevent leak of cache manager NAT ping rx_connections Both the Unix and Windows cache managers maintain a set of persistent client rx_connections ("conns") to the known fileservers for each active AFS user. These conns are periodically refreshed (destroyed and possibly rebuilt) approximately NOTOKETIMEOUT (2h) after token expiration for authenticated (rx_kad) conns, or every NOTOKETIMEOUT (2h) for anonymous (rx_null) conns. Both cache managers enable NAT ping for one rx_connection to each known fileserver. Thus we see this common idiom when a cache manager destroys a fileserver connection: rx_SetConnSecondsUntilNatPing(conn, 0); rx_DestroyConnection(conn); Doing this for all conns is harmless, even if a given conn doesn't have NAT ping enabled. It is important to note that rx_SetConnSecondsUntilNatPing(conn, 0) does not actually cancel the conn's natKeepAliveEvent (if it has one); it merely sets conn->secondsUntilNatPing to 0. If there is a natKeepAliveEvent, this prevents it from being rescheduled after the next event. This was fine in the past because rx_DestroyConnection eventually did cancel natKeepAliveEvent and destroy the rx_connection itself. However, this idiom broke after commit `304d758983` "Standardize rx_event usage" introduced a number of changes: - an extra rx_connection refCount for each outstanding conn event - a requirement (enforced by osi_Assert) that all conn events be canceled before rx_DestroyConnection can succeed Therefore, if natKeepAliveEvent is still active when we enter rx_DestroyConnection, it will return early, due to the extra conn refCount associated with the NAT ping event. The rx_connection is not destroyed in this case. Eventually, the final scheduled natKeepAliveEvent will fire; the event will not be rescheduled, and the final conn refCount will be removed via putConnection and not rx_DestroyConnection (and so the conn is not destroyed). This rx_connection now has refCount 0, but can never be destroyed - it has been leaked: - The cache manager has "forgotten" this rx_connection and has no means to invoke rx_DestroyConnection again. - rxi_ReapConnections will not destroy it because it is a client conn and is still on the rx_connHashTable, not the rx_connCleanup_list. - If there is still a dallying rx_call, any eventual call to rxi_CheckCall -> rxi_FreeCall will not destroy the conn because it has not been flagged RX_CONN_DESTROY_ME. Modify rx_SetConnSecondsUntilNatPing to explicitly cancel any natKeepAliveEvent and remove its refcount if cancelled. With this change in place, the cache managers will no longer periodically leak client rx_connections. Change-Id: I4e89ebc4bd2c95b6e61b95bd8f91867d451dd34c Reviewed-on: https://gerrit.openafs.org/14951 Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Andrew Deason <adeason@sinenomine.net>	2024-10-10 16:19:20 -04:00
Michael Meffie	983a670b9d	cmd: Do not leak param in cmd_Parse() The cmd_Parse() function calls strdup() to create a temporary string when there is a parameter in the form "-<name>=<value>". This is done in order to parse the <name> and <value> components of the parameter. This memory is not freed when the last parameter of the command is the form "-<name>=<value>". Always free the param pointer at the end of cmd_Parse() to avoid this memory leak. This memory leak was found by running the tests/cmd/command-t unit test program with valgrind. Change-Id: I85f8f5f10660268596072c8bb8be2c014ac46ad8 Reviewed-on: https://gerrit.openafs.org/15086 Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>	2024-10-10 14:57:47 -04:00
Cheyenne Wills	de46edef69	Remove lowercase min/max macro definitions A prior commit: "Replace lowercase min/max macros with opr_min/max" (Change-Id: Ib7c91b4e59aaf76ab77f293016496d6179c23c88) replaced the uses of the min and max macros with opr_min and opr_max. As a cleanup, remove the definitions for the min() and max() macros. As noted in the above commit, there were several locations that left the min/max macros alone: src/WINNT src/external/heimdal Change-Id: I263d91b04a2aed14bf9e1ebca1616629f36e6b23 Reviewed-on: https://gerrit.openafs.org/15843 Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>	2024-10-10 12:57:17 -04:00
Cheyenne Wills	8e1e242ce4	Replace lowercase min/max macros with opr_min/max The commit: "opr: replace MIN/MAX macros with opr_min/opr_max" (`915c5cff16`) replaced all uses of the MIN and MAX macros with opr_min and opr_max. Since we now have a consolidated version of opr_min and opr_max, replace the min/max macros with opr_min and opr_max, except for the files in src/WINNT (which can be handled in a future commit). The exception are the min/max macros that are defined and used in the external sources src/external/heimdal, which cannot be changed. Change-Id: Ib7c91b4e59aaf76ab77f293016496d6179c23c88 Reviewed-on: https://gerrit.openafs.org/15825 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>	2024-10-10 12:56:19 -04:00

1 2 3 4 5 ...

14085 Commits