|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v1 03/15] xen/riscv: implement vcpu_csr_init()
On 1/7/26 9:46 AM, Jan Beulich wrote: On 24.12.2025 18:03, Oleksii Kurochko wrote:Implement function to initialize VCPU's CSR registers to delegate handling of some traps to VS-mode ( guest ), enable vstimecmp for VS-mode, and allow some AIA-related register (thier vs* copies ) for VS-mode.The henvcfg setting isn't covered here at all, unless I'm failing to make the respective association. Nor is the setting of SMSTATEEN0_HSENVCFG in hstateen0. Overall it feels like the description here is too terse anyway, as the bits set (or not) are a pretty crucial thing for running guests. Then again maybe this is just me, for not being a RISC-V person ... I will add more details to commit message then. --- a/xen/arch/riscv/domain.c +++ b/xen/arch/riscv/domain.c @@ -3,6 +3,67 @@ #include <xen/mm.h> #include <xen/sched.h>+#include <asm/cpufeature.h> Maybe that would be better, but I don’t see much difference, especially if we use the following define: #define HEDELEG_DEFAULT ( BIT(CAUSE_MISALIGNED_FETCH, U) | ... ) It would still be just one instruction to write the value to|hedeleg|. (I think the compiler will likely produce the same optimization with the current implementation.) Then again I'm not quite sure whether the set of CAUSE_* in the header file is actually complete: MCAUSE also can hold the values 16, 18, and 19. Then 14 and 17 could be added as well. I see the sense in adding 18 and 19, since they are defined as "software check" and "hardware error." However, I don’t see much value in providing|CAUSE_*| for 14 and 16–17, as they are just reserved and have no specific meaning. I could add something like|CAUSE_RES_14|,|CAUSE_RES_16|,|CAUSE_RES_17|, but since we aren’t actually handling them, I think it’s fine to update|CAUSE_* |only when there is a real use for them, like with 18 and 19. (Otoh you have CAUSE_MACHINE_ECALL, which I don't think can ever be observed outside of M-mode.) Good point, It seems like you are right and M-ecall can't be observed outside of M-mode and even more it is marked as read only 0 so it is not expected to be delegated to lower privilige mode, but then I don't know why it was added to "Table 29 Bits of hedeleg that must be writable or must be read-only zero.". Also, while it may seem to not matter much, sorting the above by their numeric values would ease comparison against the full set. I will move "hedeleg |= (1U << CAUSE_BREAKPOINT);" up; all others seems are sorted properly. + hstatus = HSTATUS_SPV | HSTATUS_SPVP; + v->arch.hstatus = hstatus;Why would these (or in fact any) bits need setting here?
It could be moved to continue_new_vcpu() where now (in downstream) I have:
csr_write(CSR_HSTATUS, vcpu_guest_cpu_user_regs(current)->hstatus);
reset_stack_and_jump(return_to_new_vcpu);
But I put it here to have vCPU state (all or as much as possible) initialized
in one place.
Isn't hstatus written upon exit from guest context? Setting these bits manually is necessary for the initial entry into a guest context. While it is true that hardware updates hstatus during a trap from a guest, software must set these bits to define the destination state for the subsequent SRET instruction. When a hypervisor prepares to run a guest for the first time, there has been no previous "exit" from that guest to automatically populate the CSRs. Setting these bits is essential for the following reasons: - Defining the target Virtualization Mode (SPV): The SPV bit is used by the SRET instruction to determine the new virtualization mode. If the hypervisor is in HS-mode (V=0) and executes SRET, the hardware sets the new V to the current value of hstatus.SPV. Without manually setting HSTATUS_SPV, the SRET would return the hart to V=0 instead of entering the guest (V=1). - Defining the target Privilege Level (SPVP): The SPVP bit tracks the nominal privilege level (S or U) of the guest. When V=1, this determines if the guest is in VS-mode or VU-mode. - Controlling Hypervisor Load/Store Instructions: SPVP specifically controls the effective privilege of explicit memory accesses made by hypervisor virtual-machine load/store instructions (HLV, HLVX, and HSV). If the hypervisor needs to use these instructions to access guest memory as if it were the guest supervisor, SPVP must be set to 1. But maybe there is no too much sense in this instructions before guest is ran. + hideleg = MIP_VSTIP | MIP_VSEIP | MIP_VSSIP; + v->arch.hideleg = hideleg;Again I think having MIP_VSTIP in the middle (to establish numeric sorting) would be slightly better. Also there's a stray blank after the first |.+ /* + * VS should access only the time counter directly. + * Everything else should trap. + */ + v->arch.hcounteren |= HCOUNTEREN_TM;Why are this and ...+ if ( riscv_isa_extension_available(NULL, RISCV_ISA_EXT_svpbmt) ) + v->arch.henvcfg |= ENVCFG_PBMTE;... this using |= but the earlier ones simply = ? Unless there is a specific reason, consistency is likely preferable. This was overlooked during refactoring; it seems I simply used|=| instead of||=|. The idea is that if it’s the first initialization,|=| should be used; otherwise, for subsequent writes,||=| is used to avoid clearing previous values. I will update this part to use the same pattern consistently throughout this function.
This is how OpenSBI called this bit from where riscv_encoding.h was taken. SVSLCT stands for Supervisor Virtual Select, referring to the access control of the siselect and vsiselect registers. Also, wouldn't you better keep internal state in line with what hardware actually supports? CSRIND may be read-only-zero in the real register, in which case having the bit set in the "cached" copy can be misleading.
According to the AIA spec:
If extension Smstateen is implemented together with the Advanced Interrupt
Architecture (AIA),
three bits of state-enable register mstateen0 control access to AIA-added
state from privilege modes
less privileged than M-mode:
bit 60 CSRs siselect, sireg, vsiselect, and vsireg
bit 59 all other state added by the AIA and not controlled by bits 60 and 58
bit 58 all IMSIC state, including CSRs stopei and vstopei
What I read as if Smstateen is supported then all the bits are supported by
hardware, and that is why it is enough to check if Smstateen is supported.
But I decided to check what KVM does in the similar case and it seems that I
incorrectly read
the first line of the mentioned about AIA's spec and it is need another one
if-condition:
if (riscv_has_extension_unlikely(RISCV_ISA_EXT_SMSTATEEN)) {
cfg->hstateen0 |= SMSTATEEN0_HSENVCFG;
if (riscv_isa_extension_available(isa, SSAIA))
cfg->hstateen0 |= SMSTATEEN0_AIA_IMSIC |
SMSTATEEN0_AIA |
SMSTATEEN0_AIA_ISEL;
if (riscv_isa_extension_available(isa, SMSTATEEN))
cfg->hstateen0 |= SMSTATEEN0_SSTATEEN0;
}
(This may similarly apply to at least hedeleg and hideleg, btw.)
Regarding the previous bits, I can understand that it would be an issue:
if SSAIA isn’t supported, then it is incorrect to update the corresponding
bits of|hstateen0|.
However, I’m not really sure I understand what the issue is with|h{i,e}deleg|.
All writable bits there don’t depend on hardware support. Am I missing
something?
As to consistency: Further up you use local helper variables (for imo no real reason), when here you don't. Instead this line ends up being too long. I will update the code to have consistency. Thanks. ~ Oleksii
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |