openafs/doc
Andrew Deason 4498bd8179 volser: Don't NUL-pad failed pread()s in dumps
Currently, the volserver SAFSVolDump RPC and the 'voldump' utility
handle short reads from pread() for vnode payloads by padding the
missing data with NUL bytes. That is, if we request 4k of data for our
pread() call, and we only get back 1k of data, we'll write 1k of data
to the volume dump stream followed by 3k of NUL bytes, and log
messages like this:

    1 Volser: DumpFile: Error reading inode 1234 for vnode 5678
    1 Volser: DumpFile: Null padding file: 3072 bytes at offset 40960

This can happen if we hit EOF on the underlying file sooner than
expected, or if the OS just responds with fewer bytes than requested
for any reason.

The same code path tries to do the same NUL-padding if pread() returns
an error (for example, EIO), padding the entire block (4k in this
example) with NULs. However, in this case, the "padding" code often
doesn't work as intended, because we compare 'n' (set to -1) with
'howMany' (set to 4k in this example), like so:

    if (n < howMany)

Here, 'n' is signed (ssize_t) and 'howMany' is unsigned (size_t), so
the compiler converts 'n' to the unsigned type. A value of -1 becomes
a very large positive number, so the condition is false when n is -1.
As a result, all of the relevant log messages are skipped, and the
data in the dump stream gets corrupted (we skip a block of data, and
our 'howFar' offset goes back by 1). So this can result in rare silent
data corruption in volume dumps, which can occur during volume
releases, moves, etc.
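
The comparison problem described above can be reproduced in isolation.
The following is a minimal sketch, not the volserver code: the function
names short_read_buggy() and short_read_fixed() are hypothetical, and
the cast in the first one only spells out the conversion the compiler
would otherwise apply implicitly (assuming ssize_t and size_t have the
same width, as on common platforms).

```c
#include <stddef.h>
#include <sys/types.h>   /* ssize_t */

/*
 * Hypothetical stand-in for the original "if (n < howMany)" check.
 * The explicit cast makes the implicit signed-to-unsigned conversion
 * visible: -1 turns into the largest size_t value.
 */
int short_read_buggy(ssize_t n, size_t howMany)
{
    return (size_t)n < howMany;
}

/*
 * Hypothetical corrected check: test for the error return first, so a
 * negative value is never converted to an unsigned type.
 */
int short_read_fixed(ssize_t n, size_t howMany)
{
    return n < 0 || (size_t)n < howMany;
}
```

With n set to -1 and howMany set to 4096, the first check reports "not
short" and the second reports "short or failed", which is the case the
padding code was meant to catch.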

To fix all of this, remove this bizarre NUL-padding behavior in the
volserver. Instead:

- For actual errors from pread(), return an error, like we do for I/O
  errors in most other code paths.

- For short reads, just write out the amount of data we actually read,
  and keep going.

- For premature EOF, treat it like a pread() error, but log a slightly
  different message.
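
As a rough illustration of the three cases above, here is a minimal
sketch of a read loop. It is not the actual DumpFile() change: the
function copy_payload(), its parameters, the buffer size, and the log
messages are all assumptions made for this example.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy 'size' bytes from the start of 'fd' to
 * 'out'. Returns 0 on success, -1 on a read error, premature EOF, or
 * write failure.
 */
int copy_payload(int fd, size_t size, FILE *out)
{
    char buf[4096];
    size_t howFar = 0;

    while (howFar < size) {
        size_t howMany = size - howFar;
        ssize_t n;

        if (howMany > sizeof(buf))
            howMany = sizeof(buf);

        n = pread(fd, buf, howMany, (off_t)howFar);
        if (n < 0) {
            /* Actual error from pread(): fail, do not pad. */
            fprintf(stderr, "read error at offset %lu\n",
                    (unsigned long)howFar);
            return -1;
        }
        if (n == 0) {
            /* Premature EOF: also fail, with a different message. */
            fprintf(stderr, "premature EOF at offset %lu\n",
                    (unsigned long)howFar);
            return -1;
        }

        /* Short read: write only what we actually got and keep going. */
        if (fwrite(buf, 1, (size_t)n, out) != (size_t)n)
            return -1;
        howFar += (size_t)n;
    }
    return 0;
}
```

Because the error and EOF checks run before 'n' is ever compared with
an unsigned value, the signed/unsigned pitfall does not arise in this
sketch.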

For the 'voldump' utility, the padding behavior can make sense if a
user is trying to recover volume data offline in a disaster recovery
scenario. So for voldump, add a new switch (-pad-errors) to enable the
padding behavior, but change the default behavior to bail out on
errors.

Change-Id: Ibd6e76c5ea0dea95e3354d9b34536296f81b4f67
Reviewed-on: https://gerrit.openafs.org/14255
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
2020-07-24 12:03:44 -04:00
doxygen    Cleanup vestiges of old shared library build directories  2020-03-05 21:53:26 -05:00
man-pages  volser: Don't NUL-pad failed pread()s in dumps            2020-07-24 12:03:44 -04:00
pdf        initial-pdf-with-embedded-cmr-fonts-20010606              2001-06-06 18:58:13 +00:00
protocol   lwp: remove preemption support                            2016-05-05 12:51:14 -04:00
txt        rx: fix out-of-range value for RX_CONN_NAT_PING           2020-07-23 23:06:14 -04:00
xml        doc: the last partition name is /vicepiu                  2018-09-14 08:35:26 -04:00
LICENSE    Rework the Kerberos Autoconf probes                       2010-06-15 16:30:04 -07:00
README     doc: relocate notes from arch to txt                      2017-08-03 20:44:28 -04:00

What's in the "doc" subdirectory

** doc/man-pages
POD sources for man pages (converted from original IBM HTML source).

** doc/xml
XML sources for manuals (converted from original IBM HTML source).
Note: doc/xml/AdminRef uses doc/xml/AdminRef/pod2refentry to convert the
POD man pages to XML for printing. POD converts directly to HTML just fine.

** doc/pdf
Old Transarc (and possibly pre-Transarc) protocol and API documentation for
which we have no other source.

** doc/txt
Technical notes, Windows notes, and examples.

** doc/doxygen
Configuration files for the doxygen tool to generate documentation from
the annotated sources. See the 'dox' target in the top-level Makefile.