openafs

mirror of https://git.openafs.org/openafs.git synced 2025-01-18 06:50:12 +00:00

Author	SHA1	Message	Date
Cheyenne Wills	88da6b4dfa	cf: Make local copy of ax_gcc_func_attribute.m4 Make a local copy of ax_gcc_func_attribute from autoconf-archive. This is needed in order to fix a bug in the detection of the fallthrough attribute. Remove ax_gcc_func_attribute.m4 from src/external/autoconf-archive/m4. Update LICENSE file to point to the local copy in src/cf. Change-Id: I6c4244d2cd4edab4262c1820435c00419d85303b Reviewed-on: https://gerrit.openafs.org/14272 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-07-24 08:35:59 -04:00
Mark Vitale	bb5397e4c4	rx: prevent leakage of non-cached rx_connections (pthread) The rxi_connectionCache (AFS_PTHREAD_ENV only) allows applications to reuse rx_connection structs. Cached rx_connections are obtained via rx_GetCachedConnection and released via rx_ReleaseCachedConnection. This feature is used most heavily by libadmin and kauth, but there are other users in the tree as well. For instance, ubikclient routines ubik_ClientInit and ubik_ClientDestroy call rx_ReleaseCachedConnections (if AFS_PTHREAD_ENV) when disposing of their rx_connections. Unfortunately, in many cases these rx_connections were obtained via rx_NewConnection, _not_ from the cache via rx_GetCachedConnection. In those cases, rx_ReleaseCachedConnection will not find the rx_connection in the rxi_connectionCache, and thus it returns without doing anything. Therefore, when ubik_ClientInit is passed an existing ubik_client (for re-initialization) that contains rx_connections NOT allocated via rx_GetCachedConnection, those connections are not destroyed, but will be silently leaked. Similarly, ubik_ClientDestroy will leak its rx_connections when it frees the ubik_client struct. For example, the fileserver host package calls ubik_ClientInit (via hpr_Initialize) and ubik_ClientDestroy (via hpr_End) to manage connections to the ptserver. However, these connections were obtained via rx_NewConnection, not rx_GetCachedConnection. If the fileserver has a failed call to the ptserver that sets prfail=1, the next RPC scheduled for that client (in CallPreamble) will refresh the thread's ubik_client (viced_uclient_key) by calling hprEnd -> ubik_ClientDestroy -> rx_ReleaseCachedConnection. The "released" connections will be leaked. This problem exists in all versions of OpenAFS going back to IBM 1.0. Starting with 1.8.x, many components that were formerly LWP-only are now pthreaded and thus susceptible to this leak. 
It seems difficult and error-prone to identify all possible code paths that may pass a non-cached rx_connection to rx_ReleaseCachedConnection, and convert them to obtain connections via rx_GetCachedConnection. Instead, prevent all existing and future leaks by modifying the connection cache to: - flag all rx_connections it allocates - correctly release any rx_connection it is passed, whether they came from the cache or not. Change-Id: Ibe164ccd30a8ddd799438c28fd6e1d8a0a9040dd Reviewed-on: https://gerrit.openafs.org/13042 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com>	2020-07-23 23:42:20 -04:00
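The approach described in commit bb5397e4c4 (flag every connection the cache allocates, and have the release path dispose of any connection it is handed) can be sketched in user-space C. This is a minimal illustration under stated assumptions, not the rx implementation: `demo_conn`, `demo_new`, `demo_get_cached`, and `demo_release` are hypothetical names, and the "cache" here is a single slot.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an rx_connection; not the real struct. */
struct demo_conn {
    int from_cache;   /* set only by the cache allocator */
    int refs;         /* cache reference count */
};

static struct demo_conn *demo_cache_slot;  /* one-entry "cache" */
int demo_destroyed;                        /* counts destroyed connections */

/* Allocate outside the cache (analogous to a plain "new connection" call). */
struct demo_conn *demo_new(void)
{
    return calloc(1, sizeof(struct demo_conn));
}

/* Allocate through the cache and flag the result as cache-owned. */
struct demo_conn *demo_get_cached(void)
{
    if (demo_cache_slot == NULL) {
        demo_cache_slot = calloc(1, sizeof(struct demo_conn));
        if (demo_cache_slot == NULL)
            return NULL;
        demo_cache_slot->from_cache = 1;
    }
    demo_cache_slot->refs++;
    return demo_cache_slot;
}

/* Release any connection, cached or not, so nothing is silently leaked. */
void demo_release(struct demo_conn *conn)
{
    if (conn == NULL)
        return;
    if (!conn->from_cache) {
        /* Not cache-owned: destroy it instead of returning without action. */
        free(conn);
        demo_destroyed++;
        return;
    }
    assert(conn->refs > 0);
    if (--conn->refs == 0) {
        demo_cache_slot = NULL;
        free(conn);
        demo_destroyed++;
    }
}
```

In this sketch a connection from `demo_new()` is destroyed on its first `demo_release()`, while a cached connection is destroyed only when its last reference is released.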
Mark Vitale	55fca11421	rx: fix out-of-range value for RX_CONN_NAT_PING Commit `496fb87372` ("rx: avoid nat ping until connection is attached") introduced functionality to defer turning on NAT ping for server connections until after reachability had been established for the client. Unfortunately, this feature could never work correctly because it assigned an out-of-range flag value of 256 (0x100) for the u_char flags field. Instead of calling this out as an error, both gcc and Solaris cc elide this flag so that it is never set in rx_SetConnSecondsUntilNatPing(). Furthermore, the test in rxi_ConnClearAttachWait() will always fail; therefore rxi_ScheduleNatKeepAliveEvent is never called after attach wait has ended. Fortunately, this bug is currently moot - not actually exposed in OpenAFS. (It was discovered by inspection). This is because there are currently no rx_connection objects in the tree that have both NAT ping and checkReach (rx_SetCheckReach) enabled. I also searched git history and found no time when this bug could ever have been exposed. This does raise the question of why the original commit was needed; but instead of reverting the original commit, this commit attempts to fix it. To prevent problems if NAT ping and checkReach are ever both enabled for an rx_connection, enlarge the rx_connection flags member so that the RX_CONN_NAT_PING value is no longer out of range. Change-Id: Ib667ece632f66fa5c63a76398acb3153fed6f9c3 Reviewed-on: https://gerrit.openafs.org/13041 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-07-23 23:06:14 -04:00
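The flag-width problem in commit 55fca11421 can be reproduced in isolation. The sketch below assumes an 8-bit `unsigned char`; `DEMO_CONN_NAT_PING` and both function names are hypothetical stand-ins, not the rx definitions.

```c
#include <assert.h>
#include <limits.h>

#define DEMO_CONN_NAT_PING 0x100   /* 256: needs a ninth bit */

/* Store the flag in an 8-bit field, then test for it. */
int demo_flag_in_narrow_field(void)
{
    unsigned char flags = 0;

    assert(UCHAR_MAX == 255);      /* assumption: 8-bit unsigned char */
    /* The conversion back to unsigned char discards bit 8. */
    flags = (unsigned char)(flags | DEMO_CONN_NAT_PING);
    return (flags & DEMO_CONN_NAT_PING) != 0;
}

/* Same operations on a wider field, as in the fix. */
int demo_flag_in_wide_field(void)
{
    unsigned int flags = 0;

    flags |= DEMO_CONN_NAT_PING;
    return (flags & DEMO_CONN_NAT_PING) != 0;
}
```

With these definitions, `demo_flag_in_narrow_field()` returns 0 and `demo_flag_in_wide_field()` returns 1, which mirrors why the set and the later test could never succeed until the field was enlarged.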
Andrew Deason	d231134aad	auth: Avoid cellconfig.c stdio renaming Since commit `35777145` (solaris-fopen-sucks-20060916), cellconfig.c has redirected fopen, fclose, and fgets to local functions on non-64bit-sparc Solaris, in order to work around that platform's stdio limitations. Commit `7c431f7571` (auth: retire writeconfig.c) moved the contents of writeconfig.c into cellconfig.c. The previous writeconfig.c contained some calls to stdio, including calling fprintf() on a pointer returned by fopen() in that file. Because fopen() was redirected to our local version, this means that afsconf_SetExtendedCellInfo() calls fopen() to get an afsconf_iobuffer, and passes that pointer to the real system fprintf() later on (instead of a native FILE). The compiler does warn about this, but this only happens on Solaris, where --enable-checking is not implemented, so the build never fails. To avoid this, remove the #defines for fopen, fgets, and fclose. Instead, change all of the old cellconfig.c callers to explicitly call afsconf_fopen, afsconf_fgets, and afsconf_fclose. On the affected Solaris platforms, we keep our local definitions, and for other platforms, we just make those functions call their system stdio equivalents. For the code that was pulled in from writeconfig.c, callers will just call the system fopen, fprintf, and fclose. We still keep our local afsconf_FILE* definition on all platforms, so the compiler will still do typechecking for our local afsconf_f* functions on all platforms. So now if we make a mistake, it should be a mistake on all platforms, so platforms with --enable-checking should flag the error. 
Change-Id: I4064d7f5ee82d5acab04a33b01c0603564a391e8 Reviewed-on: https://gerrit.openafs.org/14214 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Tested-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-07-13 16:49:50 -04:00
Andrew Deason	cd65475e95	afs: Let afs_ShakeLooseVCaches run longer Currently, when afs_ShakeLooseVCaches runs osi_TryEvictVCache, we check if osi_TryEvictVCache slept (i.e. dropped afs_xvcache/GLOCK). If we sleep over 100 times, then we stop trying to evict vcaches and return. If we have recently accessed a lot of AFS files, this limitation can severely reduce our ability to keep our number of vcaches limited to a reasonable size. For example: Say a Linux client runs a process that quickly accesses 1 million files (a simple 'find' command) and then does nothing else. A few minutes later, afs_ShakeLooseVCaches is run, but since all of the newly accessed vcaches have dentries attached to them, we will sleep on each one in order to try to prune the attached dentries. This means that afs_ShakeLooseVCaches will evict 100 vcaches, and then return, leaving us with still almost 1 million vcaches. This will happen repeatedly until afs_ShakeLooseVCaches finally works its way through all of the vcaches (which takes quite a while, if we only clear 100 at once), or the dentries get pruned by other means (such as, if Linux evicts them due to memory pressure). The limit of 100 sleeps was originally added in commit `29277d96` (newvcache-dont-spin-20060128), but the current effect of it was largely introduced in commit `9be76c0d` (Refactor afs_NewVCache). It exists to ensure that afs_ShakeLooseVCaches doesn't take forever to run, but the limit of 100 sleeps may seem quite low, especially if those 100 sleeps run very quickly. To avoid the situation described above, instead of limiting afs_ShakeLooseVCaches based on a fixed number of sleeps, limit it based on how long we've been running, and set an arbitrary limit of roughly 3 seconds. Only check how long we've been running after 100 sleeps like before, so we're not constantly checking the time while running. 
Log a new warning if we exit afs_ShakeLooseVCaches prematurely if we've been running for too long, to help indicate what is going on. Change-Id: I65729ace748e8507cc0d5c26dec39e74d7bff5d2 Reviewed-on: https://gerrit.openafs.org/14254 Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-07-10 01:27:45 -04:00
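The time-based limit from commit cd65475e95 can be sketched as a loop that consults the clock only once per 100 sleeps. This is an illustration under assumptions, not the kernel code: every name is hypothetical, each eviction is assumed to sleep, the comparison against the limit is a guess at "roughly 3 seconds", and a fake clock that advances one second per reading keeps the example deterministic.

```c
#include <assert.h>
#include <stddef.h>

typedef long (*demo_clock_fn)(void);   /* returns "now" in seconds */

#define DEMO_SLEEPS_PER_CHECK 100
#define DEMO_TIME_LIMIT_SECS  3

/* Fake clock for the example: advances one second per reading. */
static long demo_fake_now;
long demo_fake_clock(void) { return demo_fake_now++; }
void demo_fake_clock_reset(void) { demo_fake_now = 0; }

/* Try to evict n_vcaches entries; stop early once the time limit is hit. */
int demo_shake_loose(int n_vcaches, demo_clock_fn now)
{
    long start;
    int sleeps = 0, evicted = 0, i;

    assert(now != NULL);
    start = now();
    for (i = 0; i < n_vcaches; i++) {
        evicted++;   /* assumption: every eviction succeeds ... */
        sleeps++;    /* ... and every eviction sleeps */
        if (sleeps >= DEMO_SLEEPS_PER_CHECK) {
            sleeps = 0;   /* only read the clock every 100 sleeps */
            if (now() - start >= DEMO_TIME_LIMIT_SECS)
                break;    /* ran too long; a warning would be logged here */
        }
    }
    return evicted;
}
```

With the fake clock, a run over 250 entries finishes all 250, while a run over 1000 entries stops after 300, when the third clock reading reaches the limit.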
Andrew Deason	9ff45e73cf	afs: Skip bulkstat if stat cache looks full Currently, afs_lookup() will try to prefetch dir entries for normal dirs via bulkstat whenever multiple pids are reading that dir. However, if we already have a lot of vcaches, ShakeLooseVCaches may be struggling to limit the vcaches we already have. Entering afs_DoBulkStat can make this worse, since we grab afs_xvcache repeatedly, we may kick out other vcaches, and we'll possibly create 30 new vcaches that may not even be used before they're evicted. To try to avoid this, skip running afs_DoBulkStat if it looks like the stat cache is really full. Change-Id: I1634530170a189f32cb962dd7df28f88bc758b71 Reviewed-on: https://gerrit.openafs.org/13256 Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-07-10 01:16:27 -04:00
Andrew Deason	0532f917f2	afs: Log warning when we detect too many vcaches Currently, afs_ShakeLooseVCaches has a kind of warning that is logged when we fail to free up any vcaches. This information can be useful to know, since it may be a sign that users are trying to access way more files than our configured vcache limit, hindering performance as we constantly try to evict and re-create vcaches for files. However, the current warning is not clear at all to non-expert users, and it can only occur for non-dynamic vcaches (which is uncommon these days). To improve this, try to make a general determination if it looks like the stat cache is "stressed", and log a message if so after afs_ShakeLooseVCaches runs (for all platforms, regardless of dynamic vcaches). Also try to make the message a little more user-friendly, and only log it (at most) once per 4 hours. Determining whether the stat cache looks stressed or not is difficult and arguably subjective (especially for dynamic vcaches). This commit draws a few arbitrary lines in the sand to make the decision, so at least something will be logged in the cases where users are constantly accessing way more files than our configured vcache limit. Change-Id: I022478dc8abb7fdef24ccc06d477b349cca759ac Reviewed-on: https://gerrit.openafs.org/13255 Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-07-10 01:15:17 -04:00
Mark Vitale	42fb8786a8	viced: propagate return from CleanupTimedOutCallBacks_r The fileserver's FiveMinuteCheckLWP periodically calls CleanupTimedOutCallBacks, and logs an informational message if the return code indicates that any callbacks were discarded. However, since the original IBM code import, CleanupTimedOutCallBacks has 1) ignored the return value from CleanupTimedOutCallBacks_r and 2) unconditionally returned 0. This makes the informational message essentially dead code. Instead, check the code from CleanupTimedOutCallBacks_r and pass it back to the caller. Change-Id: I631831c398e43431b79f4a3a0c6f01307ac0c05e Reviewed-on: https://gerrit.openafs.org/14256 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-07-10 00:53:12 -04:00
Andrew Deason	f9d20c631d	LINUX: Close cacheFp if no ->readpage in fastpath In afs_linux_readpage_fastpath, if we discover that our disk cache fs has no ->readpage function, we'll 'goto out', but we never close our cacheFp. To make sure we close it, add a filp_close() call to the 'goto out' cleanup code. Change-Id: I371c1d7ec51b03447fbcbe58fb89be7be0235022 Reviewed-on: https://gerrit.openafs.org/14252 Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com>	2020-07-03 18:16:46 -04:00
Cheyenne Wills	af73b9a3b1	LINUX: Don't panic on some file open errors Commit 'LINUX: Return NULL for afs_linux_raw_open error' (`f6af4a155`) updated afs_linux_raw_open to return NULL on some errors, but still panics if obtaining the dentry fails. Commit 'afs: Verify osi_UFSOpen worked' (`c6b61a451`) updated callers of osi_UFSOpen to verify whether or not the open was successful. This meant osi_UFSOpen (and routines it calls) could pass back an error indication rather than panic when an error is encountered. Update afs_linux_raw_open to return a failure instead of panic if unable to obtain a dentry. Update osi_UFSOpen to return a NULL instead of panic if unable to obtain memory or fails to open the file. All callers of osi_UFSOpen handle a fail return, though some will still issue a panic. Update afs_linux_readpage_fastpath and afs_linux_readpages to not panic if afs_linux_raw_open fails. Instead of panic, return an error. For testing, an error can be forced by removing a file from the cache directory. Note this work is based on a commit by pruiter@sinenomine.net Change-Id: Ic47e4868b4f81d99fbe3b2e4958778508ae4851f Reviewed-on: https://gerrit.openafs.org/14242 Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com>	2020-07-03 18:16:36 -04:00
Cheyenne Wills	d2d27f975d	afs: Avoid panics on failed return from afs_CFileOpen afs_CFileOpen is a macro that invokes the open "method" of the afs_cacheOps structure, and for disk caches the osi_UFSOpen function is used. Currently osi_UFSOpen will panic if there is an error encountered while opening a file. Prepare to handle osi_UFSOpen function returning a NULL instead of issuing a panic (future commit). Update callers of afs_CFileOpen to test for an error and to return an error instead of issuing a panic. While this commit eliminates some panics, it does not address some of the more complex cases associated with errors from afs_CFileOpen. Change-Id: I2bdd525633dd44ebf8e26fcfd7059dfdfffb6142 Reviewed-on: https://gerrit.openafs.org/14241 Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-07-03 11:50:42 -04:00
Cheyenne Wills	7d85ce221d	LINUX 5.8: use lru_cache_add With Linux-5.8-rc1 commit 'mm: fold and remove lru_cache_add_anon() and lru_cache_add_file()' (6058eaec), the lru_cache_add_file function is removed since it was functionally equivalent to lru_cache_add. Replace lru_cache_add_file with lru_cache_add. Introduce a new autoconf test to determine if lru_cache_add is present. For reference, the Linux changes associated with the lru caches: __pagevec_lru_add introduced before v2.6.12-rc2; lru_cache_add_file introduced in v2.6.28-rc1; __pagevec_lru_add_file replaces __pagevec_lru_add in v2.6.28-rc1, 'vmscan: split LRU lists into anon & file sets' (4f98a2fee); __pagevec_lru_add removed in v5.7 with a note to use lru_cache_add_file, 'mm/swap.c: not necessary to export __pagevec_lru_add()' (bde07cfc6); lru_cache_add_file removed in v5.8, 'mm: fold and remove lru_cache_add_anon() and lru_cache_add_file()' (6058eaec); lru_cache_add exported, 'mm: fold and remove lru_cache_add_anon() and lru_cache_add_file()' (6058eaec). OpenAFS will use: lru_cache_add on 5.8 kernels; lru_cache_add_file from 2.6.28 through 5.7 kernels; __pagevec_lru_add/__pagevec_lru_add_file on pre 2.6.28 kernels. Change-Id: I79ebe4a81425bf8a8a327ddf2d3474aff9df039d Reviewed-on: https://gerrit.openafs.org/14249 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Yadavendra Yadav <yadayada@in.ibm.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-07-03 00:21:49 -04:00
Benjamin Kaduk	ae9ea8da69	Recode a couple files from ISO 8859-1 to UTF-8 Reported by Debian's lintian(1). The CellServDB, as an externally maintained file, is left unchanged. Change-Id: I3bf241b924cb8cd7799a4c3e799f6acd375b2e8a Reviewed-on: https://gerrit.openafs.org/14265 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-07-02 23:34:24 -04:00
Andrew Deason	ba8b92401b	afs: Bound afs_DoBulkStat dir scan Currently, afs_DoBulkStat will scan the entire directory blob, looking for entries to stat. If all or almost all entries are already stat'd, we'll scan through the entire directory, doing nontrivial work on each entry (we grab afs_xvcache, at least). All of this work is pretty pointless, since the entries are already cached and so we won't do anything. If many processes are trying to acquire afs_xvcache, this can contribute to performance issues. To avoid this, provide a constant bound on the number of entries we'll search through: nentries * 4. The current arbitrary limits cap nentries at 30, so this means we're capping the afs_DoBulkStat search to 120 entries. Change-Id: I66e9af5b27844ddf6cf37c8286fcc65f8e0d3f96 Reviewed-on: https://gerrit.openafs.org/13253 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-07-02 21:56:30 -04:00
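The bound from commit ba8b92401b (examine at most nentries * 4 directory entries, with nentries capped at 30) can be sketched as follows. This is a simplified model, not afs_DoBulkStat itself: the directory is a flat array of "already cached" markers, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_NENTRIES 30   /* cap on entries to stat, per the commit */
#define DEMO_SCAN_FACTOR  4    /* examine at most nentries * 4 entries */

/*
 * cached[i] != 0 means directory entry i is already stat'd.
 * Returns how many entries were picked for bulkstat; *examined receives
 * how many directory entries were looked at before stopping.
 */
int demo_pick_for_bulkstat(const int *cached, size_t dir_len,
                           int nentries, size_t *examined)
{
    size_t i, limit;
    int picked = 0;

    assert(cached != NULL && examined != NULL);
    if (nentries > DEMO_MAX_NENTRIES)
        nentries = DEMO_MAX_NENTRIES;
    if (nentries < 0)
        nentries = 0;
    limit = (size_t)nentries * DEMO_SCAN_FACTOR;

    for (i = 0; i < dir_len && i < limit && picked < nentries; i++) {
        if (!cached[i])
            picked++;   /* uncached entry: worth including in the bulkstat */
    }
    *examined = i;
    return picked;
}
```

For a 500-entry directory in which everything is already cached, this model stops after examining 120 entries instead of walking all 500.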
Andrew Deason	6c808e05ad	afs: Avoid needless W-locks for afs_FindVCache The callers of afs_FindVCache must hold at least a read lock on afs_xvcache; some hold a shared or write lock (and set IS_SLOCK or IS_WLOCK in the given flags). Two callers (afs_EvalFakeStat_int and afs_DoBulkStat) currently hold a write lock, but neither of them need to. In the optimal case, where afs_FindVCache finds the given vcache, this means that we unnecessarily hold a write lock on afs_xvcache. This can impact performance, since afs_xvcache can be a very frequently accessed lock (a simple operation like afs_PutVCache briefly holds a read lock, for example). To avoid this, have afs_DoBulkStat hold a shared lock on afs_xvcache, upgrading to a write lock when needed. afs_EvalFakeStat_int doesn't ever need a write lock at all, so just convert it to a read lock. Change-Id: I5bd58b9e3a577c9e1ebf1bc3719e65a6c0af5cb8 Reviewed-on: https://gerrit.openafs.org/12656 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-07-02 21:46:45 -04:00
Kailas Zadbuke	e44d6441c8	util: Handle serverLogMutex lock across forks If a process forks when another thread has serverLogMutex locked, the child process inherits the locked serverLogMutex. This causes a deadlock when code in the child process tries to lock serverLogMutex, since we can never unlock serverLogMutex because the locking thread no longer exists. This can happen in the salvageserver, since the salvageserver locks serverLogMutex in different threads, and forks to handle salvage jobs. To avoid this deadlock, we register handlers using pthread_atfork() so that the serverLogMutex will be held during the fork. The fork will be blocked until the worker thread releases the serverLogMutex. Hence the serverLogMutex will be held until the fork is complete and it will be released in the parent and child threads. Thanks to Yadavendra Yadav(yadayada@in.ibm.com) for working with me on this issue. Change-Id: I191c8272825c1667bb2150146e04b1dfe36a54e4 Reviewed-on: https://gerrit.openafs.org/14239 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com>	2020-06-30 00:49:21 -04:00
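The pthread_atfork() pattern described in commit e44d6441c8 can be sketched in a small POSIX program fragment. This is a minimal illustration, assuming a single statically initialized mutex; `demo_log_mutex` and the function names are hypothetical and do not reflect the actual serverLogMutex code.

```c
#include <assert.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static pthread_mutex_t demo_log_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Runs in the parent before fork(): wait for, then hold, the mutex. */
static void demo_atfork_prepare(void)
{
    int rc = pthread_mutex_lock(&demo_log_mutex);
    assert(rc == 0);
    (void)rc;
}

/* Runs after fork() in both parent and child: release the mutex. */
static void demo_atfork_release(void)
{
    int rc = pthread_mutex_unlock(&demo_log_mutex);
    assert(rc == 0);
    (void)rc;
}

/* Register the handlers once, early in process startup. Returns 0 on success. */
int demo_log_mutex_setup(void)
{
    return pthread_atfork(demo_atfork_prepare,
                          demo_atfork_release,
                          demo_atfork_release);
}

/* Fork a child that takes the mutex; returns the child's exit status, or -1. */
int demo_fork_and_log(void)
{
    int status = 0;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: the mutex was released by the child handler, so this
         * lock cannot block on a thread that no longer exists. */
        if (pthread_mutex_lock(&demo_log_mutex) != 0)
            _exit(1);
        pthread_mutex_unlock(&demo_log_mutex);
        _exit(0);
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the prepare handler holds the mutex across the fork, no other thread can own it at the moment the child's copy of memory is created.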
Andrew Deason	19cd454f11	afs: Split out bulkstat conditions into a function Our current if() statement for determining whether we should run afs_DoBulkStat to prefetch dir entries is a bit large, and grows over time. Split this logic out into a separate function to make it easier to maintain, and add some comments to help explain each condition. This commit should have no visible effects; it's just code reorganization. Change-Id: I0086189308d2f5e4b321c63f24110d74cda6433c Reviewed-on: https://gerrit.openafs.org/13254 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-06-25 23:37:15 -04:00
Andrew Deason	a05d5b7503	afs: Change VerifyVCache2 calls to VerifyVCache afs_VerifyVCache is a macro that (on most platforms) effectively expands to: if ((avc->f.states & CStatd)) { return 0; } else { return afs_VerifyVCache2(...); } Some callers call afs_VerifyVCache2 directly, since they already check for CStatd for other reasons. A few callers currently call afs_VerifyVCache2, but without guaranteeing that CStatd is not set. Specifically, in afs_getattr and afs_linux_VerifyVCache, CStatd could be set while afs_CreateReq drops GLOCK. And in afs_linux_readdir, CStatd could be cleared at multiple different points before the VerifyVCache call. This can result in afs_VerifyVCache2 acquiring a write-lock on the vcache, even when CStatd is already set, which is an unnecessary performance hit. To avoid this, change these call sites to use afs_VerifyVCache instead of calling afs_VerifyVCache2 directly, which skips the write lock when CStatd is already set. Change-Id: I7b75c9755af147b42a48160fa90c9849f2f03ddb Reviewed-on: https://gerrit.openafs.org/12655 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-06-22 22:37:44 -04:00
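The fast-path check that commit a05d5b7503 relies on can be modeled outside the kernel. In this sketch the expensive path simply counts how often it runs; `demo_vcache`, `DEMO_CSTATD`, and both functions are hypothetical stand-ins for the real vcache, CStatd, afs_VerifyVCache2, and afs_VerifyVCache.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_CSTATD 0x1   /* stand-in for the "status is valid" state bit */

struct demo_vcache {
    unsigned int states;
    int slow_path_calls;  /* how many times the expensive path ran */
};

/* Stand-in for the slow path: would take a write lock and refetch status. */
int demo_verify_slow(struct demo_vcache *avc)
{
    avc->slow_path_calls++;
    avc->states |= DEMO_CSTATD;
    return 0;
}

/* Stand-in for the macro: skip the slow path when status is already valid. */
int demo_verify(struct demo_vcache *avc)
{
    assert(avc != NULL);
    if (avc->states & DEMO_CSTATD)
        return 0;
    return demo_verify_slow(avc);
}
```

Calling `demo_verify()` twice on a fresh entry runs the slow path once; calling `demo_verify_slow()` directly always runs it, which is the unnecessary cost the commit removes from those call sites.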
Mark Vitale	7c9fb44557	LINUX: replace BUG() call with osi_Panic() in osi_linux_free If osi_linux_free fails, it printf's an error message, then calls BUG(). This is the sole open-coded call to BUG() in OpenAFS; all other calls to BUG() are indirect via osi_Panic(). For consistency, eliminate this direct BUG() call by replacing the printf and BUG() with an equivalent osi_Panic(). This also ensures that the error message is logged as critical, and prefixed with "openafs:". Change-Id: Id319dffa859308528a66991bbbc522ca49552d51 Reviewed-on: https://gerrit.openafs.org/14250 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Benjamin Kaduk <kaduk@mit.edu>	2020-06-19 12:55:49 -04:00
Cheyenne Wills	d8ec294534	LINUX 5.8: do not set name field in backing_dev_info Linux-5.8-rc1 commit 'bdi: remove the name field in struct backing_dev_info' (1cd925d5838) removed the name field from struct backing_dev_info. Do not set the name field in the backing_dev_info structure if it is not available. This uses an existing config test, 'STRUCT_BACKING_DEV_INFO_HAS_NAME'. Note that the name field in the backing_dev_info structure was added in Linux-2.6.32. Change-Id: I20b80e49e8a15a2949003101f24d9ce39f63b59b Reviewed-on: https://gerrit.openafs.org/14248 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-06-19 12:00:57 -04:00
Cheyenne Wills	c48072b980	LINUX 5.8: Replace kernel_setsockopt with new funcs Linux 5.8-rc1 commit 'net: remove kernel_setsockopt' (5a892ff2facb) retires the kernel_setsockopt function. In prior kernel commits new functions (ip_sock_set_) were added to replace the specific functions performed by kernel_setsockopt. Define new config test 'HAVE_IP_SOCK_SET' if the 'ip_sock_set' functions are available. The config define 'HAVE_KERNEL_SETSOCKOPT' is no longer set in Linux 5.8. Create wrapper functions that replace the kernel_setsockopt calls with calls to the appropriate Linux kernel function(s) (depending on what functions the kernel supports). Remove the unused 'kernel_getsockopt' function (used for building with pre 2.6.19 kernels). For reference: Linux 2.6.19 introduced kernel_setsockopt; Linux 5.8 removed kernel_setsockopt and replaced the functionality with a set of new functions (ip_sock_set_). Change-Id: I517b674303c5decc19313d9de51d04ddef36b421 Reviewed-on: https://gerrit.openafs.org/14247 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-06-19 12:00:28 -04:00
Andrew Deason	cbc5c4b51f	tests: Modernize writekeyfile.c tests/auth/writekeyfile.c contains some code used to generate tests/auth/KeyFile, which is used to test code interpreting the old-style KeyFile format. This code currently has a few problems: - We don't check the results of afstest_mkdtemp, which could allow symlink attacks from other users on the system. - We duplicate some logic from afstest_BuildTestConfig, in order to build a temporary config dir. - writekeyfile isn't built or run by default (it only exists to generate KeyFile, so it's almost never run), so eventual bitrot is quite likely, and the existing code already generates warnings. To avoid this, change writekeyfile.c to use the existing afstest_BuildTestConfig to generate a local config dir. To ensure we avoid bitrot, build writekeyfile by default, and create a test to run it, to make sure it can generate a KeyFile as expected. Note that the KeyFile.short we test against is different than the KeyFile currently in the tree. The existing KeyFile was generated from an older OpenAFS release, which always generated 100-byte KeyFiles, even if we only have a few keys. The current codebase only writes out as much key data as needed, so the generated KeyFiles are shorter (but still understandable by older OpenAFS releases). Keep the old 100-byte KeyFile around, since that's what older OpenAFS would generate, and create a new KeyFile.short to test against, to make sure our code for generating KeyFiles doesn't change any further. Change-Id: Ibe9246c6dd808ed2b2225dd7be2b27bbdee072fd Reviewed-on: https://gerrit.openafs.org/14246 Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com>	2020-06-19 11:48:57 -04:00
Cheyenne Wills	22a66e7b7e	tests: Use usleep instead of nanosleep Commit "Build tests by default" `68f406436c` changes the build so tests are always built. On Solaris 10 the build fails because nanosleep is in librt, which we do not link against. Replace nanosleep with usleep. This avoids introducing extra configure tests just for Solaris 10. Note that with Solaris 11 nanosleep was moved from librt to libc, the standard C library. Change-Id: I6639f32bb8c8ace438e0092a866f06561dad54f1 Reviewed-on: https://gerrit.openafs.org/14244 Reviewed-by: Andrew Deason <adeason@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-06-19 10:54:13 -04:00
Cheyenne Wills	5f4a681eeb	tests: Emulate mkdtemp when not available Commit "Build tests by default" `68f406436c` changes the build so tests are always built. On Solaris 10 Update 10 and earlier the build fails because the mkdtemp function is not available. Introduce a wrapper 'afstest_mkdtemp' that uses mkdtemp if available, otherwise uses mktemp/mkdir. Change-Id: I0118f838ed9a89927e2ddac4cad822574601558a Reviewed-on: https://gerrit.openafs.org/14243 Reviewed-by: Andrew Deason <adeason@sinenomine.net> Tested-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-06-19 10:54:07 -04:00
Michael Meffie	188ca8bf52	make-release: Run git describe once Run git describe once at the beginning of make-release to find the version information used to derive the tarball file names and saved in the .version file. This is a cleanup and refactoring change to prepare for a future commit. Change-Id: I0debeeffa5d2c63ab1498588766cb36424d15cd5 Reviewed-on: https://gerrit.openafs.org/14150 Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-06-18 21:15:15 -04:00
Michael Meffie	d0753c0ace	make-release: Create output directory if needed Automatically create the --dir directory if it does not already exist, which makes this script slightly easier to use. Remove the now unneeded mkdir from the top-level makefile. Change-Id: I1f4561120a70263b0b2b194e65fec55fb5666f40 Reviewed-on: https://gerrit.openafs.org/14115 Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-06-18 20:57:54 -04:00
Michael Meffie	d20d392091	make-release: Remove unused optional version argument The make-release help shows an optional version argument, but in fact the version info is always generated from the git tag name argument, which makes sense when creating releases. Continue to throw away the second positional argument just in case someone is still passing a second argument, but issue a warning if they do. Change-Id: Ie4c6e6efb7693e53a02fd009eecd64b47250c848 Reviewed-on: https://gerrit.openafs.org/14149 Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Benjamin Kaduk <kaduk@mit.edu>	2020-06-12 02:07:14 -04:00
Michael Meffie	46eb00ffa1	make-release: Clean up whitespace and spelling Fix whitespace errors, convert tabs to spaces, fix spelling errors, and fix pod markup in the make-release script. Change-Id: I24ede59d44a8818d89de454c0935586fccbd5d9a Reviewed-on: https://gerrit.openafs.org/14148 Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-06-12 01:06:06 -04:00
Andrew Deason	c9eab4b1ee	afs: Remove osi_GetuTime osi_GetuTime has always been #define'd to be the same thing as osi_GetTime, ever since OpenAFS 1.0. Get rid of this redundant macro, and just use osi_GetTime instead. Change-Id: Ic826aeaa17314019b79cfb2df04a79309aa31db5 Reviewed-on: https://gerrit.openafs.org/14236 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-06-03 20:10:42 -04:00
Jeffrey Altman	dedb1aed97	afs/viced: New UAE (unified_afs) error codes The following registrations were submitted to registrar@central.org as [rt.central.org #135105]. UAECANCELED, "Operation canceled" (49733499L) UAENOTRECOVERABLE, "State not recoverable" (49733500L) UAENOTSUP, "Not supported" (49733501L) UAEOTHER, "Other" (49733502L) UAEOWNERDEAD, "Owner dead" (49733503L) UAEPROCLIM, "Too many processes" (49733504L) UAEDISCON, "Graceful shutdown in progress" (49733505L) Change-Id: I1458b8a9441b3826756ca67af70eee5e835d989f Reviewed-on: https://gerrit.openafs.org/14235 Reviewed-by: Jeffrey Hutzelman <jhutz@cmu.edu> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Benjamin Kaduk <kaduk@mit.edu>	2020-06-03 14:15:08 -04:00
Cheyenne Wills	ed9a3b7165	util: Fix segfault in the func ConstructLocalPath The function ConstructLocalPath will segfault if passed a NULL for the command path parameter. Update ConstructLocalPath to test the passed command path for a NULL and return ENOENT. The segfault can be triggered by setting up a BosConfig with a dafs bnode that does not contain all the required parms. This setup results in bosserver segfaulting. With the fix, bosserver now logs an error and exits cleanly. Change-Id: I26015c8accd829f3101b073964777b41d16b07f7 Reviewed-on: https://gerrit.openafs.org/14223 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-05-30 14:40:38 -04:00
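The defensive check added in commit ed9a3b7165 (return ENOENT for a NULL command path rather than dereferencing it) follows a common pattern, sketched below. `demo_construct_path` is a hypothetical, much-simplified stand-in: it only copies its input, whereas the real ConstructLocalPath does path translation that is not modeled here.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Copy cmd into a newly allocated string stored in *out.
 * Returns 0 on success, ENOENT if cmd is NULL, ENOMEM if allocation fails.
 */
int demo_construct_path(const char *cmd, char **out)
{
    assert(out != NULL);
    *out = NULL;

    if (cmd == NULL)
        return ENOENT;   /* report the missing path instead of crashing */

    *out = malloc(strlen(cmd) + 1);
    if (*out == NULL)
        return ENOMEM;
    strcpy(*out, cmd);
    return 0;
}
```

A caller can then log the error and exit cleanly when the return value is nonzero, which is the behavior the commit describes for bosserver.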
Mark Vitale	336f5d91c6	DARWIN: ensure OpenAFS.pkg is signed Installation fails because the OpenAFS.pkg was inadvertently omitted from the codesign logic. Ensure that the package is signed. Change-Id: I0745146bc523750912dd6ee95fc16a70572be175 Reviewed-on: https://gerrit.openafs.org/14221 Reviewed-by: Marcio Brito Barbosa <mbarbosa@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-05-29 00:50:54 -04:00
Mark Vitale	d3f8d81228	DARWIN: ensure PrefPane materials are properly signed Notarization fails because some prefPane materials were inadvertently omitted by the codesign logic. Ensure that these objects are properly signed. Change-Id: Ifc58e6f834a3237b7991257ee85de4e90fc3da12 Reviewed-on: https://gerrit.openafs.org/14220 Reviewed-by: Marcio Brito Barbosa <mbarbosa@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-05-29 00:48:23 -04:00
Andrew Deason	80afdc2ada	vol: Avoid building devname.c on AFS_NAMEI_ENV Everything in devname.c is for the inode vol backend, so skip building it when AFS_NAMEI_ENV is defined. While we're doing this, alter the #ifdefs inside this file to assume that we're not on XBSD, DARWIN, or LINUX, since those platforms are all namei-only. Change-Id: I3a46568940e1a865a381c1ac7e98aea94df9f3ef Reviewed-on: https://gerrit.openafs.org/13995 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-05-29 00:34:41 -04:00
Andrew Deason	99eedfdb16	vol: Indent ifdef maze in devname.c Change-Id: I371eb1d79ae9fb3f07af993be834af6f6b59c100 Reviewed-on: https://gerrit.openafs.org/13994 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-05-29 00:29:14 -04:00
Tim Creech	71ce9fff8e	FBSD: Add support for FreeBSD 12.1 Change-Id: I5779c586b6b1255de0ee0dea66b09f3a5dffddc1 Reviewed-on: https://gerrit.openafs.org/13982 Reviewed-by: Andrew Deason <adeason@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-05-29 00:24:34 -04:00
Andrew Deason	20dc283226	FBSD: Ignore VI_DOOMED vnodes Currently on FreeBSD, osi_TryEvictVCache calls vgone() for our vnode after checking if the given vcache is in use. vgone() then calls our VOP_RECLAIM operation, which calls afs_vop_reclaim, which calls afs_FlushVCache to finally actually flush the vcache. The current approach has at least the following major issues: - In afs_vop_reclaim, we return success even if afs_FlushVCache() fails. This allows FreeBSD to reuse the vnode for another file, but the vnode is still being referenced by our vcache, which is referenced by the global VLRU and various other structures. This causes all kinds of weird errors, since we try to use the underlying vnode for different files. - After the relevant checks in osi_TryEvictVCache are done, another thread can acquire a new reference to our vcache (this can happen while vgone() is running up until the vnode is locked). This new reference will cause afs_FlushVCache to fail. - Our afs_vop_reclaim callback is called while the vnode is locked, and can acquire afs_xvcache. Other code locks the vnode while afs_xvcache is already held (such as afs_PutVCache -> vrele). This can lead to deadlocks if two threads try to run these codepaths for the same vnode at the same time. - afs_vop_reclaim optionally acquires afs_xvcache based on the return value of CheckLock(&afs_xvcache). However, CheckLock just returns if that lock is locked by anyone, not if the current thread holds the lock. This can result in the rest of the function running without afs_xvcache actually being held if we drop AFS_GLOCK at any point. - osi_TryEvictVCache() tries to vn_lock() the target vnode, but we may already have another vnode locked in the current thread. If the vnode we're trying to evict is a descendant of a vnode we already have locked, this can deadlock. To fix these issues, make some changes to how our vcache management works on FreeBSD: - Do not allow anyone to hold a new reference on a VI_DOOMED vnode. 
We do this by checking for VI_DOOMED in osi_vnhold, and returning an error if VI_DOOMED is set. - In afs_vop_reclaim, panic if afs_FlushVCache fails. With the new VI_DOOMED check, afs_FlushVCache should now never fail; and if it somehow does, panic'ing immediately is better than corrupting various structures and panic'ing later on. - Move around some of the relevant locking in afs_vop_reclaim to fix the lock-related issues. - In osi_TryEvictVCache, don't wait for the vnode lock (LK_NOWAIT); treat the vnode as "in use" if we can't immediately obtain the lock. Thanks to tcreech@tcreech.com and kaduk@mit.edu for insight and help investigating the relevant issues. FIXES 135041 Change-Id: I23e94ecebbddc8c68a8f4ea918d64efd0f9f9dfd Reviewed-on: https://gerrit.openafs.org/13972 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-05-29 00:21:09 -04:00
Mark Vitale	145c90bdbe	DARWIN: remove vestigial etap_event_t typedefs These typedefs have been present since commit `a41175cfbb` "initial-darwin-support-20010327"; at least some of this material was obtained directly from IBM after the initial code import. Based on research of old Darwin source code and kernel documentation, the Event Trace Analysis Package (ETAP) was a lock-profiling interface provided in older versions of Mach and xnu. ETAP was not enabled by default; the kernel had to be recompiled with certain options to enable it. Support for ETAP was removed from the xnu tree sometime between xnu-517 (10.3 Panther) and xnu-792 (10.4 Tiger), although some references remain in the latter under PPC support (osfmk/ppc/hw_lock.s). All remaining references to etap_event_t disappeared when PPC support was removed, some time between xnu-1456.1.26 (10.6 Snow Leopard) and xnu-1699.24.8 (10.7.2 Lion). Therefore, it is possible that these typedefs were needed in the past by (IBM/Transarc) AFS to support use of some lock APIs (e.g., simple_lock_init, usimple_lock_init) after the ETAP code was withdrawn from xnu. However, these typedefs have probably always been vestigial for OpenAFS, because OpenAFS has never used any lock API that took etap_event_t as an argument. Regardless, OpenAFS does not need these definitions to build and run on any currently supported version of macOS. Remove the vestigial code. No functional change should be incurred by this commit. Change-Id: I39b3f82a8933d15ef5b5de5eb92366c0a31f8bb6 Reviewed-on: https://gerrit.openafs.org/14219 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Marcio Brito Barbosa <mbarbosa@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-05-28 23:22:59 -04:00
Mark Vitale	f065706fed	DARWIN: remove errant typedef for etap_event_t This code has been dead since its introduction, because XAFS_DARWIN_ENV is a typo for AFS_DARWIN_ENV. Introduced from day 1 of DARWIN support with commit `a41175cfbb` "initial-darwin-support-20010327". No functional change should be incurred by this commit. Change-Id: I6b74f01b4dd1230559ac8d75f0644071357f38b7 Reviewed-on: https://gerrit.openafs.org/14218 Reviewed-by: Marcio Brito Barbosa <mbarbosa@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com>	2020-05-28 23:22:52 -04:00
Mark Vitale	c6eff25be9	Convert all osi_timeval_t to osi_timeval32_t Since commit `130144850c` "xstat: cm xstat time values are 32 bit", OpenAFS has had two timeval definitions: osi_timeval_t and osi_timeval32_t. Since they are functionally equivalent, convert all references to osi_timeval_t to osi_timeval32_t. This makes clear that this struct is always expected to contain 32-bit members for tv_sec and tv_usec. There are still a few platforms where osi_timeval32_t is mistakenly defined with 64-bit members; these will be addressed in future commits. No functional change should be incurred by this commit. Change-Id: I3e8e44235e813571723fcd114194f6cb83de90e4 Reviewed-on: https://gerrit.openafs.org/14215 Reviewed-by: Andrew Deason <adeason@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-05-28 23:09:05 -04:00
Mark Vitale	d610112866	UKERNEL: remove dead code osi_SetTime osi_SetTime has been dead code since the original IBM code import. Remove it from the tree. No functional change is incurred by this commit. Change-Id: I25612a044ad550d798003979afc6845e502ebe3b Reviewed-on: https://gerrit.openafs.org/14191 Reviewed-by: Andrew Deason <adeason@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-05-28 23:01:48 -04:00
Mark Vitale	03f4417218	UKERNEL: remove redundant declaration of osi_GetTime Commit `c861bb0d77` "Additional UKERNEL headers, prototyping and other fixes" added the following lines to src/rx/rx_prototypes.h: #if defined(UKERNEL) && !defined(osi_GetTime) extern int osi_GetTime(struct timeval tv); #endif However, this appears to be redundant with the declaration in src/afs/afs_prototypes.h: #ifdef UKERNEL ... extern int osi_GetTime(struct timeval tv); ... #endif which was added much earlier with commit `8f2df21ffe` "pull-prototypes-to-head-20020821". Remove the redundant declaration in rx/rx_prototypes.h. No functional change is incurred by this commit. Change-Id: I2032d302e862eed47250357e604cba4f26e89814 Reviewed-on: https://gerrit.openafs.org/14192 Reviewed-by: Andrew Deason <adeason@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-05-27 20:40:24 -04:00
Mark Vitale	3ab022fda9	afs: remove commented xstats externs Extern declarations for the xstats recording areas have been commented out since `8f2df21ffe` "pull-prototypes-to-head-20020821". Remove the vestigial comments. No functional change is incurred by this commit. Change-Id: Ieef9a4b21e78db8d5427bed7b621ba043663b1d1 Reviewed-on: https://gerrit.openafs.org/14197 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com>	2020-05-27 20:31:16 -04:00
Mark Vitale	4caadf71f5	afs: remove stats dead code afs_GetCMSTats, afs_AddToMean, and macro AFS_MEANCNT have been dead code since the original IBM code import. Remove them from the tree. No functional change is incurred by this commit. Change-Id: Icd6aeff7896d69a4d334531b5e0c632d807457ce Reviewed-on: https://gerrit.openafs.org/14196 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com>	2020-05-27 20:30:43 -04:00
Mark Vitale	9a5790cfbb	LINUX 5.6: define osi_timeval32_t for 32-bit Linux For 32-bit Linux (e.g., arch i586), AFS_LINUX_64BIT_KERNEL is not defined, so osi_timeval32_t is defined as a typedef of the native 'timeval'. However, as of commit c766d1472c70d25ad475cf56042af1652e792b23 "y2038: hide timeval/timespec/itimerval/itimerspec types" (Linux 5.6), the native timeval struct is no longer available. On such a kernel, the OpenAFS build will fail because osi_timeval32_t is not properly defined. Instead, add new conditionals to properly define osi_timeval32_t for this platform. Change-Id: I1eddeeb3651dcd3c55920ab1d2ad2838f4729bdd Reviewed-on: https://gerrit.openafs.org/14216 Reviewed-by: Cheyenne Wills <cwills@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-05-27 20:28:52 -04:00
Andrew Deason	13e44b2b20	afs: Refactor osi_vnhold/AFS_FAST_HOLD Make a few changes to osi_vnhold and AFS_FAST_HOLD: - Currently, the second argument of osi_vnhold ("retry") is never used by any implementation. Get rid of it. - AFS_FAST_HOLD() is the same as osi_vnhold(). Get rid of AFS_FAST_HOLD, and just have all callers use osi_vnhold instead. - Allow osi_vnhold to return an error, and adjust callers to handle it. - Change osi_vnhold to be a real function, instead of a macro, to make nontrivial implementations less cumbersome. Most platforms never return an error from osi_vnhold(), so the added code paths to check the return value of osi_vnhold() will not trigger. However, this lets us add future commits that do make osi_vnhold() return an error. Change-Id: Id2f3717be6c305d06305685247ac789815e1ebf7 Reviewed-on: https://gerrit.openafs.org/13971 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-05-22 16:27:18 -04:00
Andrew Deason	d013987315	vlserver: Return error when growing beyond 2 GiB In the vlserver, when we add a new vlentry or extent block, we grow the VLDB by doing something like this: vital_header.eofPtr += sizeof(item); Since we don't check for overflow, and all of our offset-related variables are signed 32-bit integers, this can cause some odd behavior if we try to grow the database to be over 2 GiB in size. To avoid this, change the two places in vlserver code that grow the database to use a new function, grow_eofPtr(), which checks for 31-bit overflow. If we are about to overflow, log a message and return an error. See the following for a specific example of our "odd behavior" when we overflow the 2 GiB limit in the VLDB: With 1 extent block, we can create 14509076 vlentries successfully. On the 14509077th vlentry, we'll attempt to write the entry to offset 2147483560 (0x7FFFFFA8). Since a vlentry is 148 bytes long, we'll write all the way through offset 2147483707 (0x8000003B), which is over the 31-bit limit. In the udisk subsystem, this results in writing to page numbers 2097151, and -2097152 (since our ubik pages are 1k, and going over the 31-bit limit causes us to treat offsets as negative). These pages start at physical offsets 2147482688 (0x7FFFFC40) and -2147483584 (-0x7FFFFFC0) in our vldb.DB0 (where offset is page*1024+64). Modifying each of these pages involves reading in the existing page first, modifying the parts we are changing, and writing it back. This works just fine for 2097151, but of course fails for -2097152. The latter fails in DReadBuffer when eventually our pread() fails with EINVAL, and causes ubik to log the message: Ubik: Error reading database file: errno=22 But when DReadBuffer fails, DReadBufferForWrite assumes this is due to EOF, and just creates a new buffer for the given page (DNewBuffer). So, the udisk_write() call ultimately succeeds. 
When we go to flush the dirty data to disk when committing the transaction, after we have successfully written the transaction log, DFlush() fails for the -2097152 page when the pwrite() call eventually fails with EINVAL, causing ubik to panic, logging the messages: Ubik PANIC: Writing Ubik DB modifications When the vlserver gets restarted by bosserver, we then process the transaction log, and perform the operations in the log before starting up (ReplayLog). The log records the actual data we wrote, not split into pages, and the log-replaying code writes directly to the db using uphys_write instead of udisk_write. So, because of this, the write actually succeeds when replaying the log, since we just write 148 bytes to offset 2147483624 (0x7FFFFFE8), and no negative offsets are used. The vlserver will then be able to run, but will be unable to read that newly-created vlentry, since it involves reading a ubik page beyond the 31-bit boundary. That means trying to look up that entry will fail with i/o errors, as will looking up any entry on the same hash chains as the new entry (since the new entry will be added to the head of the hash chain). Listing all entries in the database will also just show an empty database, since our vital_header.eofPtr will be negative, and we determine EOF by comparing our current blockindex to the value in eofPtr. Change-Id: Ie0b7ac61f9121fa265686449efbae8e18edb1896 Reviewed-on: https://gerrit.openafs.org/14180 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>	2020-05-22 12:48:10 -04:00
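The overflow check and the page arithmetic from the vlserver commit above can be reproduced in a short sketch. This is an assumption-laden illustration, not the actual grow_eofPtr() from the OpenAFS tree: the function names, the return convention, and the `SKETCH_*` constants are hypothetical. The 1k page size, the 64-byte offset (page*1024+64), and the 148-byte vlentry size are taken from the commit message.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 1024 /* ubik pages are 1k, per the commit message */
#define SKETCH_HDR_SIZE  64   /* physical offset = page*1024 + 64 */

/*
 * Hypothetical sketch: advance a signed 32-bit EOF pointer by 'bump'
 * bytes, refusing to go past the 31-bit limit.
 * Returns 0 on success, -1 if the addition would overflow (in which
 * case *eof_ptr is left unchanged).
 */
int sketch_grow_eof(int32_t *eof_ptr, int32_t bump)
{
    if (bump < 0 || *eof_ptr < 0 || *eof_ptr > INT32_MAX - bump)
        return -1;
    *eof_ptr += bump;
    return 0;
}

/* Page number that contains a (non-negative) logical offset. */
int32_t sketch_page_of(int32_t offset)
{
    return offset / SKETCH_PAGE_SIZE;
}

/* Physical file offset at which a page starts; 64-bit to show the sign. */
int64_t sketch_phys_offset(int32_t page)
{
    return (int64_t)page * SKETCH_PAGE_SIZE + SKETCH_HDR_SIZE;
}
```

With an EOF pointer of 2147483560 and a 148-byte entry, the check rejects the growth because 2147483560 + 148 exceeds INT32_MAX. The helpers also reproduce the numbers quoted in the message: page 2097151 starts at physical offset 2147482688, and page -2097152 maps to -2147483584.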
Cheyenne Wills	d73680c5f7	vol: Fix format-truncation warning with gcc-10.1 Building with gcc-10.1 produces a warning (error if --enable-checking) in vol-salvage.c error: ‘%s’ directive output may be truncated writing up to 755 bytes into a region of size 255 [-Werror=format-truncation=] 809 \| snprintf(inodeListPath, 255, "%s" OS_DIRSEP "salvage.inodes.%s.%d", tdir, name, Use strdup/asprintf to allocate the buffer dynamically instead of using a buffer with a hardcoded size. Change-Id: Ib2f01c2eb73c7abc162be2b1939e55688a81f812 Reviewed-on: https://gerrit.openafs.org/14207 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-05-22 12:11:56 -04:00
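The idea behind the format-truncation fix above, sizing the buffer to the formatted output rather than hardcoding 255 bytes, can be sketched as follows. The commit itself uses strdup/asprintf; this sketch swaps in a portable two-pass snprintf, which has the same effect without relying on the asprintf extension. The helper name and the literal `/` separator (in place of OS_DIRSEP) are assumptions for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: build "<dir>/salvage.inodes.<name>.<pid>" in a
 * heap buffer that is exactly large enough, so nothing can be truncated.
 * Returns NULL on failure; the caller frees the result.
 */
char *sketch_inode_list_path(const char *dir, const char *name, int pid)
{
    char *path;
    /* First pass: measure the formatted length without writing anything. */
    int need = snprintf(NULL, 0, "%s/salvage.inodes.%s.%d", dir, name, pid);

    if (need < 0)
        return NULL;

    path = malloc((size_t)need + 1);
    if (path == NULL)
        return NULL;

    /* Second pass: format into the correctly sized buffer. */
    snprintf(path, (size_t)need + 1, "%s/salvage.inodes.%s.%d", dir, name, pid);
    return path;
}
```

Because the buffer length is derived from the arguments, there is no fixed size for the compiler to warn about, at the cost of one allocation the caller must free.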
Andrew Deason	c81579dc7b	auth: Close fd on SetExtendedCellInfo write error Currently, and since OpenAFS 1.0, if write() fails here, we leak the file descriptor. A write() failure should be very unlikely, but close the fd to make sure we avoid the leak. Change-Id: I4e8ed4216c4aa5041232fc798a7bc59f6a5570d9 Reviewed-on: https://gerrit.openafs.org/14213 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>	2020-05-18 13:57:06 -04:00
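The descriptor leak fixed by the SetExtendedCellInfo commit above follows a common pattern: an early return after a failed write() skips the close(). A minimal sketch of the leak-free shape, using a hypothetical `sketch_write_file` helper rather than the actual auth code, might look like this:

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not the OpenAFS function): write 'len' bytes to
 * 'path', making sure the descriptor is closed on every exit path.
 * Returns 0 on success or an errno value on failure.
 */
int sketch_write_file(const char *path, const void *buf, size_t len)
{
    int code = 0;
    ssize_t n;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return errno;

    n = write(fd, buf, len);
    if (n < 0)
        code = errno;   /* remember the error, but fall through to close() */
    else if ((size_t)n != len)
        code = EIO;     /* treat a short write as an I/O error */

    /* Single close for both the success and the failure path. */
    if (close(fd) < 0 && code == 0)
        code = errno;

    return code;
}
```

Recording the error in a local variable and falling through to one close() call keeps the cleanup in a single place, so a later edit cannot reintroduce the leak by adding another early return.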
Andrew Deason	85df3e3d43	afs: Free rx/rxevent resources during shutdown Call shutdown_rx() and shutdown_rxevent() near the end of our shutdown sequence, in order to free various Rx resources and avoid memory leaks. Change-Id: Id2e912295cf760b5ad83057487e6c4c4fadda11b Reviewed-on: https://gerrit.openafs.org/13719 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com>	2020-05-15 12:36:25 -04:00


13328 Commits