Commit 33cb9b3c3a replaced a g_raid3_destroy_device() call with a
g_raid3_free_device() call, which was incorrect and could lead to a
panic if a RAID3 GEOM failed to start (e.g., due to missing disks).
Reported by: graid3 tests
Fixes: 33cb9b3c3a ("graid3: Fix teardown races")
MFC after: 3 days
Sponsored by: Klara, Inc.
Apply the following automated changes to try to eliminate
no-longer-needed sys/cdefs.h includes as well as now-empty
blank lines in a row.
Remove /^#if.*\n#endif.*\n#include\s+<sys/cdefs.h>.*\n/
Remove /\n+#include\s+<sys/cdefs.h>.*\n+#if.*\n#endif.*\n+/
Remove /\n+#if.*\n#endif.*\n+/
Remove /^#if.*\n#endif.*\n/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/types.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/param.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/capsicum.h>/
Sponsored by: Netflix
Ensure they are all panic/debugger safe.
Most handlers for this event are for disk drivers/geom modules. There
are a mix of checks being used here (or not), so let's standardize on
checking the presence of the RB_NOSYNC flag.
This flag is set whenever:
1. The kernel has panicked and kern.sync_on_panic=0*
2. We reboot from within the kernel debugger (the "reset" command)
3. Userspace requested it, e.g. by 'reboot -n'
Name the functions consistently.
*This sysctl is tuned to zero by default, but its existence means that
these handlers can be executed after a panic, at the user's discretion.
IMO this use-case is implicitly understood to be risky, and we'd be
better off eliminating it altogether.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D42337
This is not exhaustive, just done ahead of some upcoming changes to
these files.
Don't include sys/cdefs.h explicitly. No functional change intended.
Reviewed by: imp, jhb
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D42335
Port commit dc399583ba from g_mirror, which has an effectively
identical startup sequence.
This fixes a race that was occasionally causing panics during GEOM test
suite runs on riscv.
MFC after: 1 month
The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.
Discussed with: pfg
MFC After: 3 days
Sponsored by: Netflix
As in commit 2f1cfb7f63 ("gmirror: Pre-allocate the timeout event
structure"), graid3 must avoid M_WAITOK allocations in callout handlers.
Reported by: graid3 regression tests
MFC after 2 weeks
All I/O requests through the taste consumers are synchronous, done
with g_read_data() and without any locks held. It makes no sense
to delegate the I/O to g_down/g_up threads.
This removes many of context switches during disk retaste.
MFC after: 2 weeks
Ensure that string buffers and pad bytes are zero-filled before writing
graid3 metadata.
Reported by: KMSAN
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Replace MAXPHYS by runtime variable maxphys. It is initialized from
MAXPHYS by default, but can be also adjusted with the tunable kern.maxphys.
Make b_pages[] array in struct buf flexible. Size b_pages[] for buffer
cache buffers exactly to atop(maxbcachebuf) (currently it is sized to
atop(MAXPHYS)), and b_pages[] for pbufs is sized to atop(maxphys) + 1.
The +1 for pbufs allow several pbuf consumers, among them vmapbuf(),
to use unaligned buffers still sized to maxphys, esp. when such
buffers come from userspace (*). Overall, we save significant amount
of otherwise wasted memory in b_pages[] for buffer cache buffers,
while bumping MAXPHYS to desired high value.
Eliminate all direct uses of the MAXPHYS constant in kernel and driver
sources, except a place which initialize maxphys. Some random (and
arguably weird) uses of MAXPHYS, e.g. in linuxolator, are converted
straight. Some drivers, which use MAXPHYS to size embeded structures,
get private MAXPHYS-like constant; their convertion is out of scope
for this work.
Changes to cam/, dev/ahci, dev/ata, dev/mpr, dev/mpt, dev/mvs,
dev/siis, where either submitted by, or based on changes by mav.
Suggested by: mav (*)
Reviewed by: imp, mav, imp, mckusick, scottl (intermediate versions)
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D27225
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.
This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.
Mark all obvious cases as MPSAFE. All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT
Approved by: kib (mentor, blanket)
Commented by: kib, gallatin, melifaro
Differential Revision: https://reviews.freebsd.org/D23718
While some geom layers pass unknown commands down, not all do. For the ones that
don't, pass BIO_SPEEDUP down to the providers that constittue the geom, as
applicable. No changes to vinum or virstor because I was unsure how to add this
support, and I'm also unsure how to test these. gvinum doesn't implement
BIO_FLUSH either, so it may just be poorly maintained. gvirstor is for testing
and not supportig BIO_SPEEDUP is fine.
Reviewed by: chs
Differential Revision: https://reviews.freebsd.org/D23183
Similar to what was done for device_printfs in r347229.
Convert g_print_bio() to a thin shim around g_format_bio(), which acts on an
sbuf; documented in g_bio.9.
Reviewed by: markj
Discussed with: rlibby
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D21165
When it comes to megabytes of text, difference between sbuf_printf() and
sbuf_cat() becomes substantial.
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
GEOM ELI may double ask the password during boot. Once at loader time, and
once at init time.
This happens due a module loading bug. By default GEOM ELI caches the
password in the kernel, but without the MODULE_VERSION annotation, the
kernel loads over the kernel module, even if the GEOM ELI was compiled into
the kernel. In this case, the newly loaded module
purges/invalidates/overwrites the GEOM ELI's password cache, which causes
the double asking.
MFC Note: There's a pc98 component to the original submission that is
omitted here due to pc98 removal in head. This part will need to be revived
upon MFC.
Reviewed by: imp
Submitted by: op
Obtained from: opBSD
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D14992
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
for all struct bio you get back from g_{new,alloc}_bio. Temporary
bios that you create on the stack or elsewhere should use this before
first use of the bio, and between uses of the bio. At the moment, it
is nothing more than a wrapper around bzero, but that may change in
the future. The wrapper also removes one place where we encode the
size of struct bio in the KBI.
years for head. However, it is continuously misused as the mpsafe argument
for callout_init(9). Deprecate the flag and clean up callout_init() calls
to make them more consistent.
Differential Revision: https://reviews.freebsd.org/D2613
Reviewed by: jhb
MFC after: 2 weeks
These changes prevent sysctl(8) from returning proper output,
such as:
1) no output from sysctl(8)
2) erroneously returning ENOMEM with tools like truss(1)
or uname(1)
truss: can not get etype: Cannot allocate memory
there is an environment variable which shall initialize the SYSCTL
during early boot. This works for all SYSCTL types both statically and
dynamically created ones, except for the SYSCTL NODE type and SYSCTLs
which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to
be used in the case a tunable sysctl has a custom initialisation
function allowing the sysctl to still be marked as a tunable. The
kernel SYSCTL API is mostly the same, with a few exceptions for some
special operations like iterating childrens of a static/extern SYSCTL
node. This operation should probably be made into a factored out
common macro, hence some device drivers use this. The reason for
changing the SYSCTL API was the need for a SYSCTL parent OID pointer
and not only the SYSCTL parent OID list pointer in order to quickly
generate the sysctl path. The motivation behind this patch is to avoid
parameter loading cludges inside the OFED driver subsystem. Instead of
adding special code to the OFED driver subsystem to post-load tunables
into dynamically created sysctls, we generalize this in the kernel.
Other changes:
- Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask"
to "hw.pcic.intr_mask".
- Removed redundant TUNABLE statements throughout the kernel.
- Some minor code rewrites in connection to removing not needed
TUNABLE statements.
- Added a missing SYSCTL_DECL().
- Wrapped two very long lines.
- Avoid malloc()/free() inside sysctl string handling, in case it is
called to initialize a sysctl from a tunable, hence malloc()/free() is
not ready when sysctls from the sysctl dataset are registered.
- Bumped FreeBSD version to indicate SYSCTL API change.
MFC after: 2 weeks
Sponsored by: Mellanox Technologies
as clean on shutdown and move that action from shutdown_pre_sync stage to
shutdown_post_sync to avoid extra flapping.
ZFS tends to not close devices on shutdown, that doesn't allow GEOM RAID
to shutdown gracefully. To handle that, mark volume as clean just when
shutdown time comes and there are no active writes.
MFC after: 2 weeks
bytes syncronized.
The rationale behind this is the following: for large disks the
percent synchronisation counter ticks too seldom, and monitoring
software (as well as human operator) can't tell whether
synchronisation goes on or one of disks got stuck. On an idle
server one can look into gstat and see whether synchronisation goes
on or not, but on a busy server that won't work. Also, new value
monitored can be differentiated obtaining the synchronisation speed
quite precisely.
Submitted by: Konstantin Kukushkin <dark ramtel.ru>
Reviewed by: pjd
The SYSCTL_NODE macro defines a list that stores all child-elements of
that node. If there's no SYSCTL_DECL macro anywhere else, there's no
reason why it shouldn't be static.
No FreeBSD version bump, the userland application to query the features will
be committed last and can serve as an indication of the availablility if
needed.
Sponsored by: Google Summer of Code 2010
Submitted by: kibab
Reviewed by: silence on geom@ during 2 weeks
X-MFC after: to be determined in last commit with code from this project
to kproc_xxx as they actually make whole processes.
Thos makes way for us to add REAL kthread_create() and friends
that actually make theads. it turns out that most of these
calls actually end up being moved back to the thread version
when it's added. but we need to make this cosmetic change first.
I'd LOVE to do this rename in 7.0 so that we can eventually MFC the
new kthread_xxx() calls.
- Use thread_lock() rather than sched_lock for per-thread scheduling
sychronization.
- Use the per-process spinlock rather than the sched_lock for per-process
scheduling synchronization.
Tested by: kris, current@
Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
gmirror and graid3 in a way that it is not resynchronized after a
power failure or system crash.
It is safe when gjournal is running on top of gmirror/graid3.