This reduces the overhead of TLB invalidations by ensuring that we
only interrupt CPUs which are using the given pmap. Tracking is
performed in pmap_activate(), which gets called during context switches:
from cpu_throw(), if a thread is exiting or an AP is starting, or
cpu_switch() for a regular context switch.
For now, pmap_sync_icache() still must interrupt all CPUs.
Reviewed by: kib (earlier version), jhb
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D18874
This includes support for pmap_enter(..., psind=1) as described in the
commit log message for r321378.
The changes are largely modelled after amd64. arm64 has more stringent
requirements around superpage creation to avoid the possibility of TLB
conflict aborts, and these requirements do not apply to RISC-V, which
like amd64 permits simultaneous caching of 4KB and 2MB translations for
a given page. RISC-V's PTE format includes only two software bits, and
as these are already consumed we do not have an analogue for amd64's
PG_PROMOTED. Instead, pmap_remove_l2() always invalidates the entire
2MB address range.
pmap_ts_referenced() is modified to clear PTE_A, now that we support
both hardware- and software-managed reference and dirty bits. Also
fix pmap_fault_fixup() so that it does not set PTE_A or PTE_D on kernel
mappings.
Reviewed by: kib (earlier version)
Discussed with: jhb
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D18863
Differential Revision: https://reviews.freebsd.org/D18864
Differential Revision: https://reviews.freebsd.org/D18865
Differential Revision: https://reviews.freebsd.org/D18866
Differential Revision: https://reviews.freebsd.org/D18867
Differential Revision: https://reviews.freebsd.org/D18868
The existing copyin(9) and copyout(9) routines on RISC-V perform only a
simple byte-by-byte copy. Improve their performance by performing
word-sized copies where possible.
Submitted by: Mitchell Horne <mhorne063@gmail.com>
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D18851
The sbadaddr register was renamed in version 1.10 of the privileged
architecture specification. No functional change intended.
Submitted by: Mitchell Horne <mhorne063@gmail.com>
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D18594
This adds more detail and fixes some inaccuracies.
Reviewed by: jhb
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D18463
Add a subroutine for updating satp, for use when updating the
active pmap. No functional change intended.
Reviewed by: jhb
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D18462
The RISC-V spec defines several performance counter CSRs such as: cycle,
time, instret, hpmcounter(3...31). They are defined to be 64-bits wide
on all RISC-V architectures. On RV64 and RV128 they can be read from a
single CSR. On RV32, additional CSRs (given the suffix "h") are present
which contain the upper 32 bits of these counters, and must be read as
well. (See section 2.8 in the User ISA Spec for full details.)
This change adds macros for reading these values safely on any RISC-V
ISA length. Obviously we aren't supporting anything other than RV64
at the moment, but this ensures we won't need to change how we read
these values if we ever do.
Submitted by: Mitchell Horne <mhorne063@gmail.com>
Reviewed by: jhb
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D17952
Replace a call to DELAY(1) with a new cpu_lock_delay() KPI. Currently
cpu_lock_delay() is defined to DELAY(1) on all platforms. However,
platforms with a DELAY() implementation that uses spin locks should
implement a custom cpu_lock_delay() doesn't use locks.
Reviewed by: kib
MFC after: 3 days
Previously, RISC-V was enabling execute permissions in PTEs for any
readable page. Now, execute permissions are only enabled if they were
explicitly specified (e.g. via PROT_EXEC to mmap). The one exception
is that the initial kernel mapping in locore still maps all of the
kernel RWX.
While here, change the fault type passed to vm_fault and
pmap_fault_fixup to only include a single VM_PROT_* value representing
the faulting access to match other architectures rather than passing a
bitmask.
Reviewed by: markj
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D17783
All platforms except powerpc use the same values and powerpc shares a
majority of them.
Go ahead and declare AT_NOTELF, AT_UID, and AT_EUID in favor of the
unused AT_DCACHEBSIZE, AT_ICACHEBSIZE, and AT_UCACHEBSIZE for powerpc.
Reviewed by: jhb, imp
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D17397
(e.g. RocketChip, lowRISC and derivatives).
RISC-V page table entries support A (accessed) and D (dirty) bits. The
spec makes hardware support for these bits optional. Implementations that
do not manage these bits in hardware raise page faults for accesses to a
valid page without A set and writes to a writable page without D set.
Check for these types of faults when handling a page fault and fixup the
PTE without calling vm_fault if they occur.
Reviewed by: jhb, markj
Approved by: re (gjb)
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D17424
(e.g. RocketChip, lowRISC and derivatives).
RISC-V page table entries support A (accessed) and D (dirty) bits. The
spec makes hardware support for these bits optional. Implementations that
do not manage these bits in hardware raise page faults for accesses to a
valid page without A set and writes to a writable page without D set.
Check for these types of faults when handling a page fault and fixup the
PTE without calling vm_fault if they occur.
Reviewed by: jhb, markj
Approved by: re (gjb)
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D17424
- Remove the arm64-specific cpu_*cache* and cpu_tlb_flush* functions.
Instead, add RISC-V specific inline functions in cpufunc.h for the
fence.i and sfence.vma instructions.
- Catch up to changes in the arm64 pmap and remove all the cpu_dcache_*
calls, pmap_is_current, pmap_l3_valid_cacheable, and PTE_NEXT bits from
pmap.
- Remove references to the unimplemented riscv_setttb().
- Remove unused cpu_nullop.
- Add a link to the SBI doc to sbi.h.
- Add support for a 4th argument in SBI calls. It's not documented but
it seems implied for the asid argument to SBI_REMOVE_SFENCE_VMA_ASID.
- Pass the arguments from sbi_remote_sfence*() to the SEE. BBL ignores
them so this is just cosmetic.
- Flush icaches on other CPUs when they resume from kdb in case the
debugger wrote any breakpoints while the CPUs were paused in the IPI_STOP
handler.
- Add SMP vs UP versions of pmap_invalidate_* similar to amd64. The
UP versions just use simple fences. The SMP versions use the
sbi_remove_sfence*() functions to perform TLB shootdowns. Since we
don't have a valid pm_active field in the riscv pmap, just IPI all
CPUs for all invalidations for now.
- Remove an extraneous TLB flush from the end of pmap_bootstrap().
- Don't do a TLB flush when writing new mappings in pmap_enter(), only if
modifying an existing mapping. Note that for COW faults a TLB flush is
only performed after explicitly clearing the old mapping as is done in
other pmaps.
- Sync the i-cache on all harts before updating the PTE for executable
mappings in pmap_enter and pmap_enter_quick. Previously the i-cache was
only sync'd after updating the PTE in pmap_enter.
- Use sbi_remote_fence() instead of smp_rendezvous in pmap_sync_icache().
Reviewed by: markj
Approved by: re (gjb, kib)
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D17414
- Explicitly load an empty initial state into FP registers when taking
the fault on the first FP instruction in a thread. Setting
SSTATE.FS to INITIAL is just a marker to let context switch restore
code know that it can load FP registers with zeroes instead of
memory loads. It does not imply that the hardware will reset all
registers to zero on first access. In addition, set the state to
CLEAN instead of INITIAL after the first FP instruction.
cpu_switch() doesn't do anything for INITIAL and only restores from
the pcb if the state is CLEAN. We could perhaps change cpu_switch
to call fpe_state_clear if the state was INITIAL and leave SSTATE.FS
set to INITIAL instead of CLEAN after the first FP instruction.
However, adding this complexity to cpu_switch() doesn't seem worth
the supposed gain.
- Only save the current FPU registers in fill_fpregs() if the request
is made to save the current thread's registers. Previously if a
debugger requested FP registers via ptrace() it was getting a copy
of the debugger's FP registers rather than the debugee's.
- Zero the entire FP register set structure returned for ptrace() if a
thread hasn't used FP registers rather than leaking garbage in the
fp_fcsr field.
- If a debugger writes FP registers via ptrace(), always mark the pcb
as having valid FP registers and set SSTATUS.FS_MASK to CLEAN so
that the registers will be restored when the debugged thread
resumes.
- Be more explicit about clearing the SSTATUS.FS field before setting
it to CLEAN on the first FP instruction trap.
Submitted by: br, markj
Approved by: re (rgrimes)
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D17141
This is done by setting SUM (permit Supervisor User Memory access)
bit in sstatus register.
The functions we allow access for are routines in assembly that
explicitly handle crossing the user kernel boundary.
Approved by: re (kib)
Sponsored by: DARPA, AFRL
This is the output of
$ cat opcodes opcodes-rvc-pseudo opcodes-rvc opcodes-custom |
./parse-opcodes -c
It is confirmed by author that the output of parse-opcodes is
in the public domain.
This will be required for DDB disassembler.
Discussed with: Andrew Waterman <waterman@eecs.berkeley.edu>
Obtained from: https://github.com/riscv/riscv-opcodes
Sponsored by: DARPA, AFRL
register to determine if trap is from userspace.
Otherwise if we jump to kernel address from userspace, then
TRAPF_USERMODE failed to detect usermode and then do_ast
triggers a panic "ast in kernel mode".
Reviewed by: markj@
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D16469
o The correct value for _JB_SIGMASK is 27.
o The storage size for double-precision floating
point register is 8 bytes.
Submitted by: "James Clarke" <jrtc4@cam.ac.uk>
Reviewed by: markj@
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D16344
- Change pcpu zone consumers to use a stride size of PAGE_SIZE.
(defined as UMA_PCPU_ALLOC_SIZE to make future identification easier)
- Allocate page from the correct domain for a given cpu.
- Don't initialize pc_domain to non-zero value if NUMA is not defined
There are some misconceptions surrounding this field. It is the
_VM_ NUMA domain and should only ever correspond to valid domain
values as understood by the VM.
The former slab size of sizeof(struct pcpu) was somewhat arbitrary.
The new value is PAGE_SIZE because that's the smallest granularity
which the VM can allocate a slab for a given domain. If you have
fewer than PAGE_SIZE/8 counters on your system there will be some
memory wasted, but this is obviously something where you want the
cache line to be coming from the correct domain.
Reviewed by: jeff
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D15933
We don't support float in the boot loaders, so don't include
interfaces for float or double in systems headers. In addition, take
the unusual step of spiking double and float to prevent any more
accidental seepage.
We don't support older compilers. Most of the code in these files is
for pre-3.0 gcc, which is at least 15 years obsolete. Move to using
phk's sys/_stdargs.h for all these platforms.
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D14323
kernel by PHYS_TO_DMAP() as previously present on amd64, arm64, riscv, and
powerpc64. This introduces a new MI macro (PMAP_HAS_DMAP) that can be
evaluated at runtime to determine if the architecture has a direct map;
if it does not (or does) unconditionally and PMAP_HAS_DMAP is either 0 or
1, the compiler can remove the conditional logic.
As part of this, implement PHYS_TO_DMAP() on sparc64 and mips64, which had
similar things but spelled differently. 32-bit MIPS has a partial direct-map
that maps poorly to this concept and is unchanged.
Reviewed by: kib
Suggestions from: marius, alc, kib
Runtime tested on: amd64, powerpc64, powerpc, mips64
They provide relaxed-ordered atomic access semantic. Due to the
FreeBSD memory model, the operations are syntaxical wrappers around
the volatile accesses. The volatile qualifier is used to ensure that
the access not optimized out and in turn depends on the volatile
semantic as implemented by supported compilers.
The motivation for adding the operation is to help people coming from
other systems or knowing the C11/C++ standards where atomics have
special type and require use of the special access operations. It is
still the case that FreeBSD requires plain load and stores of aligned
integer types to be atomic.
Suggested by: jhb
Reviewed by: alc, jhb
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D13534
- allocate value for new AT_HWCAP2 auxiliary vector on all platforms.
- expand 'struct sysentvec' by new 'u_long *sv_hwcap2', in exactly
same way as for AT_HWCAP.
MFC after: 1 month
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D12699
A new 'u_long *sv_hwcap' field is added to 'struct sysentvec'. A
process ABI can set this field to point to a value holding a mask of
architecture-specific CPU feature flags. If an ABI does not wish to
supply AT_HWCAP to processes the field can be left as NULL.
The support code for AT_EHDRFLAGS was already present on all systems,
just the #define was not present. This is a step towards unifying the
AT_* constants across platforms.
Reviewed by: kib
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D12290
New version is not compatible on supervisor mode with v1.9.1
(previous version).
Highlights:
o BBL (Berkeley Boot Loader) provides no initial page tables
anymore allowing us to choose VM, to build page tables manually
and enable MMU in S-mode.
o SBI interface changed.
o GENERIC kernel.
FDT is now chosen standard for RISC-V hardware description.
DTB is now provided by Spike (golden model simulator). This
allows us to introduce GENERIC kernel. However, description
for console and timer devices is not provided in DTB, so move
these devices temporary to nexus bus.
o Supervisor can't access userspace by default. Solution is to
set SUM (permit Supervisor User Memory access) bit in sstatus
register.
o Compressed extension is now turned on by default.
o External GCC 7.1 compiler used.
o _gp renamed to __global_pointer$
o Compiler -march= string is now in use allowing us to choose
required extensions (compressed, FPU, atomic, etc).
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D11800
--Remove special-case handling of sparc64 bus_dmamap* functions.
Replace with a more generic mechanism that allows MD busdma
implementations to generate inline mapping functions by
defining WANT_INLINE_DMAMAP in <machine/bus_dma.h>. This
is currently useful for sparc64, x86, and arm64, which all
implement non-load dmamap operations as simple wrappers
around map objects which may be bus- or device-specific.
--Remove NULL-checked bus_dmamap macros. Implement the
equivalent NULL checks in the inlined x86 implementation.
For non-x86 platforms, these checks are a minor pessimization
as those platforms do not currently allow NULL maps. NULL
maps were originally allowed on arm64, which appears to have
been the motivation behind adding arm[64]-specific barriers
to bus_dma.h, but that support was removed in r299463.
--Simplify the internal interface used by the bus_dmamap_load*
variants and move it to bus_dma_internal.h
--Fix some drivers that directly include sys/bus_dma.h
despite the recommendations of bus_dma(9)
Reviewed by: kib (previous revision), marius
Differential Revision: https://reviews.freebsd.org/D10729
from machine/proc.h, consistently on all architectures.
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
X-Differential revision: https://reviews.freebsd.org/D11080
The MFC will include a compat definition of smp_no_rendevous_barrier()
that calls smp_no_rendezvous_barrier().
Reviewed by: gnn, kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D10313
I fixed this in 1997, but the fix was over-engineered and fragile and
was broken in 2003 if not before. i386 parameters were copied to 8
other arches verbatim, mostly after they stopped working on i386, and
mostly without the large comment saying how the values were chosen on
i386. powerpc has a non-verbatim copy which just changes the uncritical
parameter and seems to add a sign extension bug to it.
Just treat negative offsets as offsets if they are no more negative than
-db_offset_max (default -64K), and remove all the broken parameters.
-64K is not very negative, but it is enough for frame and stack pointer
offsets since kernel stacks are small.
The over-engineering was mainly to go more negative than -64K for the
negative offset format, without affecting printing for more than a
single address.
Addresses in the top 64K of a (full 32-bit or 64-bit) address space
are now printed less well, but there aren't many interesting ones.
For arches that have many interesting ones very near the top (e.g.,
68k has interrupt vectors there), there would be no good limit for
the negative offset format and -64K is a good as anything.
The types are for the byte offset and page index in vm object. They
are similar to off_t, which is defined as 64bit MI integer. Using MI
definitions will allow to provide consistent MD values of vm
object-related maximum sizes.
Reviewed by: alc
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Hardfloat is now default (use riscv64sf as TARGET_ARCH
for softfloat).
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D8529