-
Notifications
You must be signed in to change notification settings - Fork 653
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
arch/rv32i: separate kernel and app trap handlers; specify trap handler "interface" #3864
Conversation
f86592b
to
93f7e72
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, just comments.
I'm not fond of comments before lines of assembly, as there is no way to mark which assembly instructions pertain to that comment. Commenting on the same line removes that potential confusion.
csrrw t0, mscratch, t0 | ||
|
||
// If mscratch contained 0, invoke the kernel trap handler. | ||
beq t0, x0, _start_kernel_trap |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
beq t0, x0, _start_kernel_trap | |
beqz t0, _start_kernel_trap // If t0 == 0 then we faulted in kernel. |
// trap into machine mode. Therefore, this can only happen | ||
// when causing an exception in the trap handler itself. | ||
// No registers other than t0 and the mscratch CSR are to be | ||
// clobberd before contiuing execution at the address loaded into |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// clobberd before contiuing execution at the address loaded into | |
// clobbered before continuing execution at the address loaded into |
// stated above. | ||
|
||
// Atomically swap t0 and mscratch: | ||
csrrw t0, mscratch, t0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
csrrw t0, mscratch, t0 | |
csrrw t0, mscratch, t0 // mscratch=t0; t0=mscratch |
|
||
// Else, invoke the trap handler at the address that was loaded in | ||
// the mscratch CSR. | ||
jr t0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
jr t0 | |
jr t0 // Jump to address in t0. |
// tracking that the kernel is executing). | ||
csrrw t0, 0x340, zero // t0=mscratch, mscratch=0 | ||
// Restore t0. We reset mscratch to 0 (kernel trap handler mode) | ||
csrrw t0, mscratch, 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
csrrw t0, mscratch, 0 | |
csrrw t0, mscratch, 0 // t0=mscratch, mscratch=0 |
// to use a temporary register to get that address. So, we save `a1` | ||
// to the kernel stack (in t0) before we can move it to the proper | ||
// spot in the per-process stored state. | ||
sw a1, 0*4(t0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sw a1, 0*4(t0) | |
sw a1, 0*4(t0) // Save a1 on kernel stack. |
// | ||
// Save the PC to the stored state struct | ||
csrr t1, mepc | ||
sw t1, 31*4(a1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sw t1, 31*4(a1) | |
sw t1, 31*4(a1) // Save the PC to the stored state struct |
// | ||
// Save mtval to the stored state struct | ||
csrr t1, mtval | ||
sw t1, 33*4(a1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sw t1, 33*4(a1) | |
sw t1, 33*4(a1) // Save mtval to the stored state struct |
// Save mcause and leave it loaded into a0, as we call a function | ||
// with it below: | ||
csrr a0, mcause | ||
sw a0, 32*4(a1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sw a0, 32*4(a1) | |
sw a0, 32*4(a1) // Save mcause to the stored state struct |
// trap handler so that it does not fire again. If mcause is greater | ||
// than or equal to zero this was not an interrupt (i.e. the most | ||
// significant bit is not 1). | ||
bge a0, zero, 200f |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bge a0, zero, 200f | |
bgez a0, 200f // if a0>0 branch to _start_app_trap_continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good other than a lingering TODO comment
/// need to. If the trap happens while and application was executing, we have to | ||
/// save the application state and then resume the `switch_to()` function to | ||
/// correctly return back to the kernel. | ||
/// TODO |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the TODO here? Is this a comment to yourself to write documentation before converting from a draft to non-draft PR?
I appreciate the reviews and agree with @bradjc's suggestions around ASM syntax. The reason this is draft still is that this new version is mostly untested yet (and per LiteX CI seems to be broken currently). The current state of this (why its open now) is because I'm convinced what we want to achieve here is possible through the general approach proposed here, and as a forcing function for me to test and carefully review it again, especially a couple days after I wrote the initial version. |
K, @lschuermann I'm traiging this as waiting-for-author. If you need/want help thinking through this or testing, you know where to find me. |
Superseded by #4009 |
Pull Request Overview
This is a second, slightly less elegant but working(?) stab at #3847.
Tock's core kernel design permits process implementations with system call interfaces other than the ones shipped upstream in the
arch/cortex-m
andarch/rv32i
crates. However, in practice, theUserspaceKernelBoundary
implementation for RV32I has always been tightly coupled to the generic kernel RISC-V trap handler. This has made it difficult to build an alternative process implementation (e.g., as part of Encapsulated Functions), while reusing much of Tock's RISC-V implementation.By defining an "interface" that the global trap handler exposes (i.e., specifying how a custom trap handler can be registered, and which registers get clobbered), we allow foreign process implementations to hook into the
rv32i
arch crate's assembly, and can move all process-specific assembly intoSysCall::switch_to_process
. All process-specific logic is now contained in one assembly block that can be read top-to-bottom. The global trap handler no longer relies on the stack layout of that function, or other data structures insyscall.rs
.Because the kernel trap handler no longer contains any application logic and expects the app to set its own trap handler, implementing a new syscall ABI becomes as simple as temporarily swapping out the trap handler.
In-depth Walkthrough
Currently context switches on RISC-V work as follows:
For traps arriving in kernel mode, the kernel trap handler does an intricate dance to compare the
mscratch
value to0
without clobbering any registers, and then takes a_from_kernel
branch in_start_trap
.When switching to an app we disable interrupts and then swap the current kernel stack pointer into
mscratch
, making it non-zero. We re-enable interrupts when switching into user-mode.When a trap arrives during app execution we perform the same dance as in
0.
, but then branch into_from_app
.The
_from_app
branch saves all of the application state through two levels of pointer indirection:mscratch
Riscv32iStoredState
We then proceed to dump all registers into this state struct, and other core-local state (CSRs) onto known offsets from the stack pointer
This hard-codes many assumptions about Tock's process model, and may preclude implementations that want additional state stored, or do not want to buy into storing the whole register file (e.g., for single function invocations as with Encapsulated Functions, or cooperative userspace apps).
_from_app
proceeds to switch to Machine mode.This may be undesirable for process models which defer interrupt handling, e.g., for cooperative apps.
_from_app
returns from the interrupt handler to an address stored at a well-known offset on the stack.As is obvious from the above, using this trap handler has a significant amount of "buy in" concerning both the process execution semantics and the layout of stack pointer and the necessary
Riscv32iStoredState
.I propose changing the routing of traps caught while executing processes by using
mscratch
not to branch between "kernel" and "app" trap handlers, but between "kernel" and "custom" trap handlers. Ifmscratch != 0
, the global trap handler simply jumps to the address contained in this CSR. This is a minor architectural change and mirrors the semantics of our current approach, but has some advantages:a. storing clobbered registers on the stack,
b. storing the stack pointer in a static variable,
b. disabling interrupts,
d. registering the app trap handler in
mscratch
f. switching to userspace, re-enabling interrupts
a. swap mscratch and
t0
,b. check if
t0 == 0
, in that case execute the kernel trap handler,c. if not, jump to the address in
t0
(_start_app_trap
) entry._start_app_trap
stores the CPU state and register file in theRiscv32iStoredState
and stack layout defined in the very same file / assembly block.a. if this trap was caused by an interrupt, disable it.
_return_to_kernel
symbol directly below via an immediate, and jumping to it viamret
la
, expanding toauipc
orlui
andaddi
)Drawbacks:
mscratch
and need to save it to a static variable. This does not change the net load/stores requires, at previously we'd store and load the address for continuing theswitch_to_process
assembly after the trap handler on the stack.In short, this change increases cohesion and reduces coupling in the
rv32i
crate, and allows foreign process implementations.Testing Strategy
This pull request was tested by CI and needs some careful reviews.
This change-set modifies more parts about the
switch_to_process
assembly than I'd initially hoped for. For instance, it changes the registers that some assembly arguments are loaded in, to reduce the number of clobbers and/or moves that we have. I tried picking these changes apart into multiple, individually working commits, but that turned out to be very tricky.I performed an initial, very unscientific analysis of how the old assembly compares to the new one in terms of instruction count. This is going through an end-to-end context switch, starting and ending with the inline assembly in
switch_to_process
, excluding all user-mode instructions:mepc
as part ofmret
_start_app_trap
handlerswitch_to_process
when exiting trap handler withmret
mepc
as part ofmret
switch_to_process
when exiting trap handler withmret
TODO or Help Wanted
One issue on RISC-V is the presence of trap & interrupt handlers for different modes. A RISC-V platform with supervisor- and/or user-mode will feature the
stvec
andutvec
CSRs. These will not be consulted for disabled interrupts / trap routing, however, and many more of Tock's assumptions, at least around theSysCall
implementation inrv32i
, would break if these were used by enabling interrupts in thesie
/uie
CSRs.Because we have unclear semantics around them, I documented that right now, we expect all traps to be routed to the global trap handler. I removed the
PermissionMode
argument forrv32i::configure_trap_handler
, as only theMachine
branch was ever used.Looking forward to other's opinions on this!
Documentation Updated
/docs
, or no updates are required.Formatting
make prepush
.