Commit Graph

246 Commits

Author SHA1 Message Date
Mark Johnston
23732c0fe3 Don't enable interrupts in init_secondary().
The MI kernel assumes that interrupts will not be enabled on APs until
after the first context switch.  In particular, the problem was causing
occasional deadlocks during boot.

Remove an unneeded intr_disable() added in r335005.

Reviewed by:	jhb (previous version)
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18738
2019-01-04 17:14:50 +00:00
Mark Johnston
c1959ba49b Fix dirty bit handling in pmap_remove_write().
Reviewed by:	jhb, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18732
2019-01-04 17:10:16 +00:00
Mark Johnston
b679dc7fee Clear PGA_WRITEABLE in pmap_remove_pages().
Reviewed by:	kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18731
2019-01-04 17:08:45 +00:00
Mark Johnston
7c59ec14e6 Fix a use-after-free in the riscv pmap_release() implementation.
Don't bother zeroing the top-level page before freeing it.  Previously,
the page was freed before being zeroed.

Reviewed by:	jhb, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18720
2019-01-03 16:26:52 +00:00
Mark Johnston
bad66a29d4 Synchronize access to the allpmaps list.
The list will be removed with some future work.

Reviewed by:	jhb
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18721
2019-01-03 16:24:03 +00:00
Mark Johnston
60af34002e Fix some issues with the riscv pmap_protect() implementation.
- Handle VM_PROT_EXECUTE.
- Clear PTE_D and mark the page dirty when removing write access
  from a mapping.
- Atomically clear PTE_W to avoid clobbering a hardware PTE update.

Reviewed by:	jhb, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18719
2019-01-03 16:21:44 +00:00
Mark Johnston
8ccaccd522 Set PTE_U on PTEs created by pmap_enter_quick().
Otherwise prefaulted entries are not accessible from user mode and
end up triggering a fault upon access, so prefaulting has no effect.

Reviewed by:	jhb, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18718
2019-01-03 16:19:32 +00:00
Mark Johnston
619999ff9f Use regular stores to update PTEs in the riscv pmap layer.
There's no need to use atomics when the previous value isn't needed.
No functional change intended.

Reviewed by:	kib
Discussed with:	jhb
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18717
2019-01-03 16:15:28 +00:00
Mark Johnston
7b1e32a5be Configure hz=100 in the QEMU target.
We currently don't have a good way to dynamically detect whether the
kernel is running as a guest.

Reviewed by:	jhb
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18715
2019-01-03 16:11:21 +00:00
Mateusz Guzik
628888f0e0 Remove iBCS2, part2: general kernel
Reviewed by:	kib (previous version)
Sponsored by:	The FreeBSD Foundation
2018-12-19 21:57:58 +00:00
Mark Johnston
53941c0a73 Replace uses of sbadaddr with stval.
The sbadaddr register was renamed in version 1.10 of the privileged
architecture specification.  No functional change intended.

Submitted by:	Mitchell Horne <mhorne063@gmail.com>
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D18594
2018-12-19 17:52:09 +00:00
Mark Johnston
5268e09865 Implement cpu_halt() for RISC-V.
Submitted by:	Mitchell Horne <mhorne063@gmail.com>
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D18595
2018-12-19 17:45:16 +00:00
Mark Johnston
3ec68206f5 Add some more checking to the RISC-V page fault handler.
- Panic immediately if witness says we're holding non-sleepable locks.
  This helps ensure that we don't recurse on the pmap lock in
  pmap_fault_fixup().
- Panic if the kernel faults on a user address without setting an
  onfault handler.
- Panic if the fault occurred in a critical section or interrupt
  handler, like we do on other platforms.
- Fix some style issues in trap_pfault().

Reviewed by:	jhb
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18561
2018-12-14 21:07:12 +00:00
Mark Johnston
105c317166 Avoid needless TLB invalidations in pmap_remove_pages().
pmap_remove_pages() is called during process termination, when it is
guaranteed that no other CPU may access the mappings being torn down.
In particular, it unnecessary to invalidate each mapping individually
since we do a pmap_invalidate_all() at the end of the function.

Also don't call pmap_invalidate_all() while holding a PV list lock, the
global pvh lock is sufficient.

Reviewed by:	jhb
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18562
2018-12-14 21:04:30 +00:00
Mark Johnston
4f86ff4e47 Assume that pmap_l1() will return a PTE.
pmaps on RISC-V always have an L1 page table page, so we don't need to
check for this when performing lookups.

Reviewed by:	jhb
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18563
2018-12-14 21:03:01 +00:00
Mark Johnston
01cd6fba6c Add a QEMU config for RISC-V.
Reviewed by:	jhb
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18560
2018-12-14 21:00:41 +00:00
Mark Johnston
fb50c41448 Enable witness(4) in the RISC-V GENERIC config.
Reviewed by:	jhb
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18559
2018-12-14 20:57:57 +00:00
Mark Johnston
4a02086817 Clean up the riscv pmap_bootstrap() implementation.
- Build up phys_avail[] in a single loop, excluding memory used by
  the loaded kernel.
- Fix an array indexing bug in the aforementioned phys_avail[]
  initialization.[1]
- Remove some unneeded code copied from the arm64 implementation.

PR:		231515 [1]
Reviewed by:	jhb
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18464
2018-12-14 18:50:32 +00:00
Mark Johnston
a64886cef3 Remove an unused malloc(9) type.
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2018-12-11 02:16:27 +00:00
Mark Johnston
e7d46a1d71 Use inline tests for individual PTE bits in the RISC-V pmap.
Inline tests for PTE_* bits are easy to read and don't really require a
predicate function, and predicates which operate on a pt_entry_t are
inconvenient when working with L1 and L2 page table entries.

Reviewed by:	jhb
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18461
2018-12-11 02:15:56 +00:00
Mark Johnston
1a153f42fa Update the description of the address space layout on RISC-V.
This adds more detail and fixes some inaccuracies.

Reviewed by:	jhb
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18463
2018-12-07 15:56:40 +00:00
Mark Johnston
1f5e341b46 Rename sptbr to satp per v1.10 of the privileged architecture spec.
Add a subroutine for updating satp, for use when updating the
active pmap.  No functional change intended.

Reviewed by:	jhb
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18462
2018-12-07 15:55:23 +00:00
Eric van Gyzen
984969cd96 Fix reporting of SS_ONSTACK
Fix reporting of SS_ONSTACK in nested signal delivery when sigaltstack()
is used on some architectures.

Add a unit test for this.  I tested the test by introducing the bug
on amd64.  I did not test it on other architectures.

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D18347
2018-11-30 22:44:33 +00:00
Eric van Gyzen
4d5a108409 Prevent kernel stack disclosure in signal delivery
On arm64 and riscv platforms, sendsig() failed to zero the signal
frame before copying it out to userspace.  Zero it.

On arm, I believe all the contents of the frame were initialized,
so there was no disclosure.  However, explicitly zero the whole frame
because that fact could inadvertently change in the future,
it's more clear to the reader, and I could be wrong in the first place.

MFC after:	2 days
Security:	similar to FreeBSD-EN-18:12.mem and CVE-2018-17155
Sponsored by:	Dell EMC Isilon
2018-11-26 20:52:53 +00:00
Mark Johnston
6f8ba91638 RISC-V: Implement get_cyclecount(9).
Add the missing implementation for get_cyclecount(9) on RISC-V by
reading the cycle CSR.

Submitted by:	Mitchell Horne <mhorne063@gmail.com>
Reviewed by:	jhb
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D17953
2018-11-13 18:20:27 +00:00
Mark Johnston
1e2ceeb16a RISC-V: Add macros for reading performance counter CSRs.
The RISC-V spec defines several performance counter CSRs such as: cycle,
time, instret, hpmcounter(3...31).  They are defined to be 64-bits wide
on all RISC-V architectures.  On RV64 and RV128 they can be read from a
single CSR.  On RV32, additional CSRs (given the suffix "h") are present
which contain the upper 32 bits of these counters, and must be read as
well.  (See section 2.8 in the User ISA Spec for full details.)

This change adds macros for reading these values safely on any RISC-V
ISA length.  Obviously we aren't supporting anything other than RV64
at the moment, but this ensures we won't need to change how we read
these values if we ever do.

Submitted by:	Mitchell Horne <mhorne063@gmail.com>
Reviewed by:	jhb
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D17952
2018-11-13 18:12:06 +00:00
John Baldwin
c5e797a836 Drop the legacy ELF brandinfo for the old rtld from arm64 and riscv.
These architectures never shipped binaries with an rtld path of
/usr/libexec/ld-elf.so.1.

Reviewed by:	markj
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D17876
2018-11-07 18:28:55 +00:00
John Baldwin
274c0a806a Enable use of a global shared page for RISC-V.
machine/vmparam.h already defines the SHAREDPAGE constant.  This
change just enables it for ELF executables.  The only use of the
shared page currently is to hold the signal trampoline.

Reviewed by:	markj, kib
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D17875
2018-11-07 18:27:43 +00:00
John Baldwin
4cbbb74888 Add a KPI for the delay while spinning on a spin lock.
Replace a call to DELAY(1) with a new cpu_lock_delay() KPI.  Currently
cpu_lock_delay() is defined to DELAY(1) on all platforms.  However,
platforms with a DELAY() implementation that uses spin locks should
implement a custom cpu_lock_delay() doesn't use locks.

Reviewed by:	kib
MFC after:	3 days
2018-11-05 21:34:17 +00:00
John Baldwin
ff9738d954 Rework setting PTE_D for kernel mappings.
Rather than unconditionally setting PTE_D for all writeable kernel
mappings, set PTE_D for writable mappings of unmanaged pages (whether
user or kernel).  This matches what amd64 does and also matches what
the RISC-V spec suggests (preset the A and D bits on mappings where
the OS doesn't care about the state).

Suggested by:	alc
Reviewed by:	alc, markj
Sponsored by:	DARPA
2018-11-05 20:00:36 +00:00
John Baldwin
d198cb6d83 Restrict setting PTE execute permissions on RISC-V.
Previously, RISC-V was enabling execute permissions in PTEs for any
readable page.  Now, execute permissions are only enabled if they were
explicitly specified (e.g. via PROT_EXEC to mmap).  The one exception
is that the initial kernel mapping in locore still maps all of the
kernel RWX.

While here, change the fault type passed to vm_fault and
pmap_fault_fixup to only include a single VM_PROT_* value representing
the faulting access to match other architectures rather than passing a
bitmask.

Reviewed by:	markj
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D17783
2018-11-01 22:23:15 +00:00
John Baldwin
6f888020df Set PTE_A and PTE_D for user mappings in pmap_enter().
This assumes that an access according to the prot in 'flags' triggered
a fault and is going to be retried after the fault returns, so the two
flags are set preemptively to avoid refaulting on the retry.

While here, only bother setting PTE_D for kernel mappings in pmap_enter
for writable mappings.

Reviewed by:	markj
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D17782
2018-11-01 22:17:51 +00:00
John Baldwin
a751b25546 SBI calls expect a pointer to a u_long rather than a pointer.
This is just cosmetic.

A weirder issue is that the SBI doc claims the hart mask pointer should
be a physical address, not a virtual address.  However, the implementation
in bbl seems to just dereference the address directly.

Reviewed by:	markj
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D17781
2018-11-01 22:15:25 +00:00
John Baldwin
344adeab18 Don't allow debuggers to modify SSTATUS, only to read it.
Reviewed by:	markj
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D17771
2018-11-01 22:13:22 +00:00
John Baldwin
ada1ceef0b Implement ptrace_set_pc() and fail PT_*STEP requests explicitly.
Reviewed by:	markj
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D17769
2018-11-01 22:11:26 +00:00
Kyle Evans
be352d20d5 Compile in VERBOSE_SYSINIT support by default, remain silent by default
The loader tunable 'debug.verbose_sysinit' may be used to toggle verbosity.
This is added to the debugging section of these kernconfs to be turned off
in stable branches for clarity of intent.

MFC after:	never
2018-10-31 22:38:19 +00:00
Ruslan Bukin
b7b391934d o Add pmap lock around pmap_fault_fixup() to ensure other thread will not
modify l3 pte after we loaded old value and before we stored new value.
o Preset A(accessed), D(dirty) bits for kernel mappings.

Reported by:	kib
Reviewed by:	markj
Discussed with:	jhb
Sponsored by:	DARPA, AFRL
2018-10-26 12:27:07 +00:00
Brooks Davis
c3adaa3305 Consolidate identical ELF auxargs type defintions.
All platforms except powerpc use the same values and powerpc shares a
majority of them.

Go ahead and declare AT_NOTELF, AT_UID, and AT_EUID in favor of the
unused AT_DCACHEBSIZE, AT_ICACHEBSIZE, and AT_UCACHEBSIZE for powerpc.

Reviewed by:	jhb, imp
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D17397
2018-10-22 22:24:32 +00:00
Ruslan Bukin
b977d81946 Support RISC-V implementations that do not manage the A and D bits
(e.g. RocketChip, lowRISC and derivatives).

RISC-V page table entries support A (accessed) and D (dirty) bits. The
spec makes hardware support for these bits optional. Implementations that
do not manage these bits in hardware raise page faults for accesses to a
valid page without A set and writes to a writable page without D set.
Check for these types of faults when handling a page fault and fixup the
PTE without calling vm_fault if they occur.

Reviewed by:	jhb, markj
Approved by:	re (gjb)
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D17424
2018-10-18 15:25:07 +00:00
Ruslan Bukin
3c8efd61f5 Revert r339421 due to unintended files included to commit.
Reported by:	ian
Approved by:	re (gjb)
Sponsored by:	DARPA, AFRL
2018-10-18 15:17:58 +00:00
Ruslan Bukin
53c6ad1d62 Support RISC-V implementations that do not manage the A and D bits
(e.g. RocketChip, lowRISC and derivatives).

RISC-V page table entries support A (accessed) and D (dirty) bits. The
spec makes hardware support for these bits optional. Implementations that
do not manage these bits in hardware raise page faults for accesses to a
valid page without A set and writes to a writable page without D set.
Check for these types of faults when handling a page fault and fixup the
PTE without calling vm_fault if they occur.

Reviewed by:	jhb, markj
Approved by:	re (gjb)
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D17424
2018-10-18 15:08:14 +00:00
Ruslan Bukin
94036a2587 Invalidate TLB on a local hart.
This was missed in r339367 ("Various fixes for TLB management on RISC-V.").

This fixes operation on lowRISC.

Reviewed by:	jhb
Approved by:	re (gjb)
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D17583
2018-10-16 16:03:17 +00:00
John Baldwin
73efa2fbd1 Various fixes for TLB management on RISC-V.
- Remove the arm64-specific cpu_*cache* and cpu_tlb_flush* functions.
  Instead, add RISC-V specific inline functions in cpufunc.h for the
  fence.i and sfence.vma instructions.
- Catch up to changes in the arm64 pmap and remove all the cpu_dcache_*
  calls, pmap_is_current, pmap_l3_valid_cacheable, and PTE_NEXT bits from
  pmap.
- Remove references to the unimplemented riscv_setttb().
- Remove unused cpu_nullop.
- Add a link to the SBI doc to sbi.h.
- Add support for a 4th argument in SBI calls.  It's not documented but
  it seems implied for the asid argument to SBI_REMOVE_SFENCE_VMA_ASID.
- Pass the arguments from sbi_remote_sfence*() to the SEE.  BBL ignores
  them so this is just cosmetic.
- Flush icaches on other CPUs when they resume from kdb in case the
  debugger wrote any breakpoints while the CPUs were paused in the IPI_STOP
  handler.
- Add SMP vs UP versions of pmap_invalidate_* similar to amd64.  The
  UP versions just use simple fences.  The SMP versions use the
  sbi_remove_sfence*() functions to perform TLB shootdowns.  Since we
  don't have a valid pm_active field in the riscv pmap, just IPI all
  CPUs for all invalidations for now.
- Remove an extraneous TLB flush from the end of pmap_bootstrap().
- Don't do a TLB flush when writing new mappings in pmap_enter(), only if
  modifying an existing mapping.  Note that for COW faults a TLB flush is
  only performed after explicitly clearing the old mapping as is done in
  other pmaps.
- Sync the i-cache on all harts before updating the PTE for executable
  mappings in pmap_enter and pmap_enter_quick.  Previously the i-cache was
  only sync'd after updating the PTE in pmap_enter.
- Use sbi_remote_fence() instead of smp_rendezvous in pmap_sync_icache().

Reviewed by:	markj
Approved by:	re (gjb, kib)
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D17414
2018-10-15 18:56:54 +00:00
Ruslan Bukin
05efeb8430 Initialize interrupt priority to 0 on all sources.
Without this hardware raises an interrupt regardless of any
pending bits set.

This fixes operation on RocketChip and derivatives (e.g. lowRISC).

Approved by:	re (kib)
Sponsored by:	DARPA, AFRL
2018-10-12 15:51:41 +00:00
Ruslan Bukin
053ec0508e Add support for the UART device found in lowRISC system-on-a-chip.
The only source of documentation for this device is verilog,
so driver is minimalistic.

Reviewed by:	Dr Jonathan Kimmitt <jrrk2@cam.ac.uk>
Approved by:	re (kib)
Sponsored by:	DARPA, AFRL
2018-10-12 15:19:41 +00:00
John Baldwin
7a102e0463 Implement pmap_sync_icache().
This invokes "fence" on the hart performing the write followed by an IPI
to execute "fence.i" on all harts.

This is required to support userland debuggers setting breakpoints in
user processes.

Reviewed by:	br (earlier version), markj
Approved by:	re (gjb)
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D17139
2018-09-24 17:41:29 +00:00
John Baldwin
232d0b87e0 Various fixes for floating point on RISC-V.
- Explicitly load an empty initial state into FP registers when taking
  the fault on the first FP instruction in a thread.  Setting
  SSTATE.FS to INITIAL is just a marker to let context switch restore
  code know that it can load FP registers with zeroes instead of
  memory loads.  It does not imply that the hardware will reset all
  registers to zero on first access.  In addition, set the state to
  CLEAN instead of INITIAL after the first FP instruction.
  cpu_switch() doesn't do anything for INITIAL and only restores from
  the pcb if the state is CLEAN.  We could perhaps change cpu_switch
  to call fpe_state_clear if the state was INITIAL and leave SSTATE.FS
  set to INITIAL instead of CLEAN after the first FP instruction.
  However, adding this complexity to cpu_switch() doesn't seem worth
  the supposed gain.
- Only save the current FPU registers in fill_fpregs() if the request
  is made to save the current thread's registers.  Previously if a
  debugger requested FP registers via ptrace() it was getting a copy
  of the debugger's FP registers rather than the debugee's.
- Zero the entire FP register set structure returned for ptrace() if a
  thread hasn't used FP registers rather than leaking garbage in the
  fp_fcsr field.
- If a debugger writes FP registers via ptrace(), always mark the pcb
  as having valid FP registers and set SSTATUS.FS_MASK to CLEAN so
  that the registers will be restored when the debugged thread
  resumes.
- Be more explicit about clearing the SSTATUS.FS field before setting
  it to CLEAN on the first FP instruction trap.

Submitted by:	br, markj
Approved by:	re (rgrimes)
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D17141
2018-09-19 23:45:18 +00:00
Ruslan Bukin
bd528a398e Enable VIMAGE support for RISC-V.
Approved by:	re (gjb)
Sponsored by:	DARPA, AFRL
2018-09-12 08:13:54 +00:00
Ruslan Bukin
752a8ea48e Use elf_relocaddr() to find the address for R_RISCV_RELATIVE
relocation.

elf_relocaddr() has a hook to handle VIMAGE data addresses.

This fixes VIMAGE support for RISC-V when built as a module.

Approved by:	re (gjb)
Sponsored by:	DARPA, AFRL
2018-09-12 08:12:34 +00:00
Ruslan Bukin
157654d0c4 Permit supervisor to access user VA space for certain functions only.
This is done by setting SUM (permit Supervisor User Memory access)
bit in sstatus register.

The functions we allow access for are routines in assembly that
explicitly handle crossing the user kernel boundary.

Approved by:	re (kib)
Sponsored by:	DARPA, AFRL
2018-09-05 11:34:58 +00:00
Ruslan Bukin
93952a8b1b Fix bug: compare uaddr to VM_MAXUSER_ADDRESS, not to a tmp value
left by SET_FAULT_HANDLER().

Approved by:	re (kib)
Sponsored by:	DARPA, AFRL
2018-09-05 09:53:55 +00:00
Ruslan Bukin
378a495661 Add support for 'C'-compressed ISA extension to DTrace FBT provider.
Approved by:	re (kib)
Sponsored by:	DARPA, AFRL
2018-09-03 14:34:09 +00:00
Ruslan Bukin
0f669630ac Fix an integer overflow while setting the kernel address (MODINFO_ADDR).
This eliminates build warning and makes kldstat happy.

Approved by:	re (marius)
2018-08-31 16:15:46 +00:00
Konstantin Belousov
f0165b1ca6 Remove {max/min}_offset() macros, use vm_map_{max/min}() inlines.
Exposing max_offset and min_offset defines in public headers is
causing clashes with variable names, for example when building QEMU.

Based on the submission by:	royger
Reviewed by:	alc, markj (previous version)
Sponsored by:	The FreeBSD Foundation (kib)
MFC after:	1 week
Approved by:	re (marius)
Differential revision:	https://reviews.freebsd.org/D16881
2018-08-29 12:24:19 +00:00
Mark Johnston
36716fe2e6 Prepare the kernel linker to handle PC-relative ifunc relocations.
The boot-time ifunc resolver assumes that it only needs to apply
IRELATIVE relocations to PLT entries.  With an upcoming optimization,
this assumption no longer holds, so add the support required to handle
PC-relative relocations targeting GNU_IFUNC symbols.
- Provide a custom symbol lookup routine that can be used in early boot.
  The default lookup routine uses kobj, which is not functional at that
  point.
- Apply all existing relocations during boot rather than filtering
  IRELATIVE relocations.
- Ensure that we continue to apply ifunc relocations in a second pass
  when loading a kernel module.

Reviewed by:	kib
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D16749
2018-08-22 20:44:30 +00:00
Alan Cox
83a90bffd8 Eliminate kmem_malloc()'s unused arena parameter. (The arena parameter
became unused in FreeBSD 12.x as a side-effect of the NUMA-related
changes.)

Reviewed by:	kib, markj
Discussed with:	jeff, re@
Differential Revision:	https://reviews.freebsd.org/D16825
2018-08-21 16:43:46 +00:00
John Baldwin
8cd385fda0 Make 'device crypto' lines more consistent.
- In configurations with a pseudo devices section, move 'device crypto'
  into that section.
- Use a consistent comment.  Note that other things common in kernel
  configs such as GELI also require 'device crypto', not just IPSEC.

Reviewed by:	rgrimes, cem, imp
Differential Revision:	https://reviews.freebsd.org/D16775
2018-08-18 20:32:08 +00:00
Conrad Meyer
08d77c0178 Riscv: Include crypto for IPSec
Similar to r337944.  I think this is the last configuration that includes IPsec
but not crypto.
2018-08-17 01:08:22 +00:00
Ruslan Bukin
9aa2d5e4fa Remove unused code.
Sponsored by:	DARPA, AFRL
2018-08-14 16:22:14 +00:00
Ruslan Bukin
2cfd37def0 Rewrite RISC-V disassembler:
- Use macroses from encoding.h generated by riscv-opcodes.
- Add support for C-compressed ISA extension.

Sponsored by:	DARPA, AFRL
2018-08-14 16:03:03 +00:00
Ruslan Bukin
c1d0e057d8 Add RISC-V instructions encoding.
This is the output of
$ cat opcodes opcodes-rvc-pseudo opcodes-rvc opcodes-custom |
    ./parse-opcodes -c

It is confirmed by author that the output of parse-opcodes is
in the public domain.

This will be required for DDB disassembler.

Discussed with: Andrew Waterman <waterman@eecs.berkeley.edu>
Obtained from:	https://github.com/riscv/riscv-opcodes
Sponsored by:	DARPA, AFRL
2018-08-13 16:07:18 +00:00
Ruslan Bukin
6371d0bd64 Implement uma_small_alloc(), uma_small_free().
Reviewed by:	markj
Obtained from:	arm64
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D16628
2018-08-08 16:08:38 +00:00
Marius Strobl
13a10f3414 Implement atomic_swap_{int,long,ptr}(9). 2018-08-07 18:56:51 +00:00
Ruslan Bukin
c50c8f642c Return ENAMETOOLONG if the latest copied character
is not null terminator.

Sponsored by:	DARPA, AFRL
2018-08-03 16:44:56 +00:00
Ruslan Bukin
385a185b43 Don't overwrite tp in set_mcontext().
This makes libthr/swapcontext_test:swapcontext1 happy.

Sponsored by:	DARPA, AFRL
2018-08-02 12:13:52 +00:00
Ruslan Bukin
84154f4b9e o Don't overwrite tp in fork_trampoline().
o Save and restore tp in cpu_switch().
o Restore tp in cpu_throw().
o Save tp in savectx().

This makes libthr tests happy. In particular fork_test:fork.

Sponsored by:	DARPA, AFRL
2018-08-02 12:12:13 +00:00
Ruslan Bukin
7bb4a84ad3 o Correctly set user tls base: consider TP_OFFSET.
o Ensure tp (thread pointer) saved before copying the pcb.

Sponsored by:	DARPA, AFRL
2018-08-02 12:08:52 +00:00
Konstantin Belousov
e45b89d23d Add pmap_is_valid_memattr(9).
Discussed with:	alc
Sponsored by:	The FreeBSD Foundation, Mellanox Technologies
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D15583
2018-08-01 18:45:51 +00:00
Ruslan Bukin
a304bc9729 Disable VIMAGE on RISC-V.
Similar to r326179 ("Temporarily disable VIMAGE on arm64") creation of
if_lagg or epair on RISC-V results a kernel panic.

Sponsored by:	DARPA, AFRL
2018-07-30 12:22:49 +00:00
Ruslan Bukin
b51092c7ec Use SPP (Supervisor Previous Privilege) bit in the sstatus
register to determine if trap is from userspace.

Otherwise if we jump to kernel address from userspace, then
TRAPF_USERMODE failed to detect usermode and then do_ast
triggers a panic "ast in kernel mode".

Reviewed by:	markj@
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D16469
2018-07-27 16:13:06 +00:00
Mark Johnston
a3055a5e47 Implement pmap_mincore() for riscv.
Reviewed by:	alc, br
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D16444
2018-07-26 16:08:26 +00:00
Ruslan Bukin
87f9acc9e1 Remove unused string.
Reported by:	markj@
Sponsored by:	DARPA, AFRL
2018-07-25 15:44:49 +00:00
Mark Johnston
c03590f5a2 Embed a simplebus_softc in struct soc_softc.
This is required by the definition of the soc driver.

Reviewed by:	br
Sponsored by:	The FreeBSD Foundation
2018-07-24 21:02:11 +00:00
Ruslan Bukin
8eca6e4855 Fix setjmp for RISC-V:
o The correct value for _JB_SIGMASK is 27.
o The storage size for double-precision floating
  point register is 8 bytes.

Submitted by:	"James Clarke" <jrtc4@cam.ac.uk>
Reviewed by:	markj@
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D16344
2018-07-23 09:54:28 +00:00
Warner Losh
9ecd7fdebe Remove VM_FREELIST_ISADMA. It's not needed on these architectures.
Differential Review: https://reviews.freebsd.org/D16290
2018-07-17 21:07:53 +00:00
Alan Cox
afeed44dc5 Invalidate the mapping before updating its physical address.
Doing so ensures that all threads sharing the pmap have a consistent
view of the mapping.  This fixes the problem described in the commit
log message for r329254 without the overhead of an extra page fault
in the common case.  (Now that all pmap_enter() implementations are
similarly modified, the workaround added in r329254 can be removed,
reducing the overhead of COW faults.)

With this change we can reuse the PV entry from the old mapping,
potentially avoiding a call to reclaim_pv_chunk().  Otherwise, there is
nothing preventing the old PV entry from being reclaimed.  In rare
cases this could result in the PTE's page table page being freed,
leading to a use-after-free of the page when the updated PTE is written
following the allocation of the PV entry for the new mapping.

Reviewed by:	br, markj
MFC after:	3 weeks
Differential Revision:	https://reviews.freebsd.org/D16261
2018-07-14 20:14:00 +00:00
Matt Macy
ab3059a8e7 Back pcpu zone with domain correct pages
- Change pcpu zone consumers to use a stride size of PAGE_SIZE.
  (defined as UMA_PCPU_ALLOC_SIZE to make future identification easier)

- Allocate page from the correct domain for a given cpu.

- Don't initialize pc_domain to non-zero value if NUMA is not defined
  There are some misconceptions surrounding this field. It is the
  _VM_ NUMA domain and should only ever correspond to valid domain
  values as understood by the VM.

The former slab size of sizeof(struct pcpu) was somewhat arbitrary.
The new value is PAGE_SIZE because that's the smallest granularity
which the VM can allocate a slab for a given domain. If you have
fewer than PAGE_SIZE/8 counters on your system there will be some
memory wasted, but this is obviously something where you want the
cache line to be coming from the correct domain.

Reviewed by: jeff
Sponsored by: Limelight Networks
Differential Revision:  https://reviews.freebsd.org/D15933
2018-07-06 02:06:03 +00:00
Sean Bruno
096da4de18 riscv: Remove unused variable "code"
gcc found that the variabl "code", while being assigned a value, isn't
be used for anything.

Reviewed by:	br
Differential Revision:	https://reviews.freebsd.org/D16114
2018-07-05 17:26:44 +00:00
Sean Bruno
96744f0225 Make ZSTD a real option via ZSTDIO.
It looks like the intent was to allow ZSTD support to be
compiled into the kernel with options ZSTDIO. But it doesn't look
like that was ever implemented or I'm missing how to do it.

I did a cursory audit of kernel config files and made a decision to
enable ZSTDIO in riscv GENERIC and mips MALTA configurations.  All other
kernel configurations already had this option in their kernel configs
but they didn't do anything useful as the feature was declared as
"standard" prior to this.

Reviewed by:	cem allanjude
Differential Revision:	https://reviews.freebsd.org/D16007
2018-07-05 17:07:23 +00:00
Ruslan Bukin
a08232301d Include UART driver since it is now provided in QEMU.
Sponsored by:	DARPA, AFRL
2018-06-29 10:55:42 +00:00
Ruslan Bukin
c43e3c8659 PLIC driver was sponsored by ECATS contract, not CTSRD one. 2018-06-21 11:52:09 +00:00
Ruslan Bukin
b626c976dc Don't jump to VA space until kernel is ready.
This fixes the race when first core sets up the pagetables, while
secondary cores do translating the address of __riscv_boot_ap.

This now allows us to smpboot in QEMU with 8 cores just fine.

Sponsored by:	DARPA, AFRL
2018-06-13 10:32:21 +00:00
Ruslan Bukin
6fdc57357e Include VirtIO devices to the GENERIC configuration file.
These are now available in QEMU/RISC-V.

Sponsored by:	DARPA, AFRL
2018-06-12 17:55:40 +00:00
Ruslan Bukin
2d53a67c2c o Add driver for PLIC (Platform-Level Interrupt Controller) device.
o Convert interrupt machdep support to use INTRNG code.

Sponsored by:	DARPA, AFRL
2018-06-12 17:45:15 +00:00
Ruslan Bukin
ebdf0baf3a Add simplebus-like RISC-V SoC bus.
This is required in order to probe and attach devices described under
"riscv-virtio-soc" node of DTS.

Sponsored by:	DARPA, AFRL
2018-06-12 17:07:30 +00:00
Ruslan Bukin
f2e299880a Release secondary cores from WFI (wait for interrupt) by sending them
an IPI.

This does not work however yet in QEMU. As a temporary workaround set
software interrupt pending bit manually on a local core to ensure WFI
doesn't halt the hart.

This is required to smpboot in QEMU.

Sponsored by:	DARPA, AFRL
2018-06-12 16:47:33 +00:00
Ruslan Bukin
a9063ba1d7 Align virtual addressing entries.
This is required due to C-compressed ISA extension option being turned on.

This fixes SMP operation in QEMU.

Sponsored by:	DARPA, AFRL
2018-06-12 16:19:27 +00:00
John Baldwin
ca75fa17ee Export a breakpoint() function to userland for riscv.
As a result, enable tests using breakpoint() on riscv.

Reviewed by:	br
Differential Revision:	https://reviews.freebsd.org/D15191
2018-05-16 16:56:35 +00:00
Warner Losh
794af7cfdc Remove extra copy of bcopy.c now that we're using the libkern version
of this file.
2018-05-12 01:43:32 +00:00
Brooks Davis
9c11d8d483 Remove the unused fuwintr() and suiwintr() functions.
Half of implementations always failed (returned (-1)) and they were
previously used in only one place.

Reviewed by:	kib, andrew
Obtained from:	CheriBSD
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D15102
2018-04-17 18:04:28 +00:00
Ed Maste
fc2a8776a2 Rename assym.s to assym.inc
assym is only to be included by other .s files, and should never
actually be assembled by itself.

Reviewed by:	imp, bdrewery (earlier)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D14180
2018-03-20 17:58:51 +00:00
Konstantin Belousov
8c8ee2ee1c Unify bulk free operations in several pmaps.
Submitted by:	Yoshihiro Ota
Reviewed by:	markj
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D13485
2018-03-04 20:53:20 +00:00
Warner Losh
ef1fcaf0f5 Do not include float interfaces when using libsa.
We don't support float in the boot loaders, so don't include
interfaces for float or double in systems headers. In addition, take
the unusual step of spiking double and float to prevent any more
accidental seepage.
2018-02-23 04:04:25 +00:00
Konstantin Belousov
2c0f13aa59 vm_wait() rework.
Make vm_wait() take the vm_object argument which specifies the domain
set to wait for the min condition pass.  If there is no object
associated with the wait, use curthread' policy domainset.  The
mechanics of the wait in vm_wait() and vm_wait_domain() is supplied by
the new helper vm_wait_doms(), which directly takes the bitmask of the
domains to wait for passing min condition.

Eliminate pagedaemon_wait().  vm_domain_clear() handles the same
operations.

Eliminate VM_WAIT and VM_WAITPFAULT macros, the direct functions calls
are enough.

Eliminate several control state variables from vm_domain, unneeded
after the vm_wait() conversion.

Scetched and reviewed by:	jeff
Tested by:	pho
Sponsored by:	The FreeBSD Foundation, Mellanox Technologies
Differential revision:	https://reviews.freebsd.org/D14384
2018-02-20 10:13:13 +00:00
Jeff Roberson
e958ad4cf3 Make v_wire_count a per-cpu counter(9) counter. This eliminates a
significant source of cache line contention from vm_page_alloc().  Use
accessors and vm_page_unwire_noq() so that the mechanism can be easily
changed in the future.

Reviewed by:	markj
Discussed with:	kib, glebius
Tested by:	pho (earlier version)
Sponsored by:	Netflix, Dell/EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D14273
2018-02-12 22:53:00 +00:00
Warner Losh
62bca77843 Move __va_list and related defines to sys/sys/_types.h
__va_list and related defines are identical in all the
ARCH/include/_types.h files. Move them to sys/sys/_types.h

Sponsored by: Netflix
2018-02-12 14:48:20 +00:00
Warner Losh
33e959abab Use standard pattern for stdargs.h
We don't support older compilers. Most of the code in these files is
for pre-3.0 gcc, which is at least 15 years obsolete. Move to using
phk's sys/_stdargs.h for all these platforms.

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D14323
2018-02-12 14:48:05 +00:00
Mark Johnston
ab7c09f121 Use vm_page_unwire_noq() instead of directly modifying page wire counts.
No functional change intended.

Reviewed by:	alc, kib (previous revision)
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D14266
2018-02-08 19:28:51 +00:00
Nathan Whitehorn
9a8196ce19 Remove SFBUF_OPTIONAL_DIRECT_MAP and such hacks, replacing them across the
kernel by PHYS_TO_DMAP() as previously present on amd64, arm64, riscv, and
powerpc64. This introduces a new MI macro (PMAP_HAS_DMAP) that can be
evaluated at runtime to determine if the architecture has a direct map;
if it does not (or does) unconditionally and PMAP_HAS_DMAP is either 0 or
1, the compiler can remove the conditional logic.

As part of this, implement PHYS_TO_DMAP() on sparc64 and mips64, which had
similar things but spelled differently. 32-bit MIPS has a partial direct-map
that maps poorly to this concept and is unchanged.

Reviewed by:		kib
Suggestions from:	marius, alc, kib
Runtime tested on:	amd64, powerpc64, powerpc, mips64
2018-01-19 17:46:31 +00:00
Jeff Roberson
ab3185d15e Implement NUMA support in uma(9) and malloc(9). Allocations from specific
domains can be done by the _domain() API variants.  UMA also supports a
first-touch policy via the NUMA zone flag.

The slab layer is now segregated by VM domains and is precise.  It handles
iteration for round-robin directly.  The per-cpu cache layer remains
a mix of domains according to where memory is allocated and freed.  Well
behaved clients can achieve perfect locality with no performance penalty.

The direct domain allocation functions have to visit the slab layer and
so require per-zone locks which come at some expense.

Reviewed by:	Attilio (a slightly older version)
Tested by:	pho
Sponsored by:	Netflix, Dell/EMC Isilon
2018-01-12 23:25:05 +00:00