|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v4] xen/riscv: dump GPRs and CSRs on unexpected traps
On 28.01.2026 12:01, Oleksii Kurochko wrote:
>
> On 1/28/26 11:34 AM, Oleksii Kurochko wrote:
>>
>> On 1/27/26 10:27 AM, Jan Beulich wrote:
>>> On 27.01.2026 10:14, Jan Beulich wrote:
>>>> On 26.01.2026 12:43, Oleksii Kurochko wrote:
>>>>> Provide additional context when an unexpected exception occurs by
>>>>> dumping
>>>>> the relevant Supervisor, Virtual Supervisor (VS), and Hypervisor CSRs,
>>>>> along with the general-purpose registers associated with the trap.
>>>>>
>>>>> Dumping VS-mode CSRs in addition to host CSRs is beneficial when
>>>>> analysing
>>>>> VS-mode traps. VSCAUSE, VSEPC, VSTVAL, and related VS state are
>>>>> required to
>>>>> properly diagnose unexpected guest traps and potential hypervisor
>>>>> misconfiguration.
>>>>> For example, on an illegal-instruction exception the hardware may
>>>>> record
>>>>> the faulting instruction in VSTVAL. If VSTVAL is zero, VSEPC should
>>>>> always
>>>>> be inspected, and can be used together with objdump to identify the
>>>>> faulting instruction. Dumping VSCAUSE is also useful when the guest
>>>>> does
>>>>> not report it, or when the hypervisor redirects a trap to the guest
>>>>> using
>>>>> VSCAUSE, VSTATUS, and VSTVEC, allowing verification that a
>>>>> subsequent trap
>>>>> is not caused by incorrect VS state.
>>>>>
>>>>> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@xxxxxxxxx>
>>>> Acked-by: Jan Beulich <jbeulich@xxxxxxxx>
>>> Hmm, wait, there's another anomaly:
>>>
>>>> I still have a question though, which can be addressed incrementally.
>>>>
>>>>> --- a/xen/arch/riscv/traps.c
>>>>> +++ b/xen/arch/riscv/traps.c
>>>>> @@ -99,12 +99,70 @@ static const char *decode_cause(unsigned long
>>>>> cause)
>>>>> return decode_trap_cause(cause);
>>>>> }
>>>>> +static void dump_general_regs(const struct cpu_user_regs *regs)
>>>>> +{
>>>>> +#define X(regs, name, delim) \
>>>>> + printk("%-4s: %016lx" delim, #name, (regs)->name)
>>>>> +
>>>>> + X(regs, ra, " "); X(regs, sp, "\n");
>>>>> + X(regs, gp, " "); X(regs, tp, "\n");
>>>>> + X(regs, t0, " "); X(regs, t1, "\n");
>>>>> + X(regs, t2, " "); X(regs, s0, "\n");
>>>>> + X(regs, s1, " "); X(regs, a0, "\n");
>>>>> + X(regs, a1, " "); X(regs, a2, "\n");
>>>>> + X(regs, a3, " "); X(regs, a4, "\n");
>>>>> + X(regs, a5, " "); X(regs, a6, "\n");
>>>>> + X(regs, a7, " "); X(regs, s2, "\n");
>>>>> + X(regs, s3, " "); X(regs, s4, "\n");
>>>>> + X(regs, s5, " "); X(regs, s6, "\n");
>>>>> + X(regs, s7, " "); X(regs, s8, "\n");
>>>>> + X(regs, s9, " "); X(regs, s10, "\n");
>>>>> + X(regs, s11, " "); X(regs, t3, "\n");
>>>>> + X(regs, t4, " "); X(regs, t5, "\n");
>>>>> + X(regs, t6, " "); X(regs, sepc, "\n");
>>>> Does this sepc value differ from ...
>>>>
>>>>> +static void dump_csrs(unsigned long cause)
>>> What is this function parameter for?
>>>
>>>>> +{
>>>>> +#define X(name, csr, fmt, ...) \
>>>>> + v = csr_read(csr); \
>>>>> + printk("%-10s: %016lx" fmt, #name, v, ##__VA_ARGS__)
>>>>> +
>>>>> + unsigned long v;
>>>>> +
>>>>> + X(htval, CSR_HTVAL, " "); X(htinst, CSR_HTINST, "\n");
>>>>> + X(hedeleg, CSR_HEDELEG, " "); X(hideleg, CSR_HIDELEG, "\n");
>>>>> + X(hstatus, CSR_HSTATUS, " [%s%s%s%s%s%s ]\n",
>>>>> + (v & HSTATUS_VTSR) ? " VTSR" : "",
>>>>> + (v & HSTATUS_VTVM) ? " VTVM" : "",
>>>>> + (v & HSTATUS_HU) ? " HU" : "",
>>>>> + (v & HSTATUS_SPVP) ? " SPVP" : "",
>>>>> + (v & HSTATUS_SPV) ? " SPV" : "",
>>>>> + (v & HSTATUS_GVA) ? " GVA" : "");
>>>>> + X(hgatp, CSR_HGATP, "\n");
>>>>> + X(hstateen0, CSR_HSTATEEN0, "\n");
>>>>> + X(stvec, CSR_STVEC, " "); X(vstvec, CSR_VSTVEC, "\n");
>>>>> + X(sepc, CSR_SEPC, " "); X(vsepc, CSR_VSEPC, "\n");
>>>> ... the one logged here? Nothing changes the register between entry
>>>> into the hypervisor and coming here?
>>> Down below here you have
>>>
>>> X(scause, CSR_SCAUSE, " [%s]\n", decode_cause(v));
>>>
>>> which actually (largely) duplicates what do_unexpected_trap() has
>>> already
>>> logged.
>>
>> Missed that, then it would be better to remove this duplication and leave
>> only printing of CSR_SCAUSE in dump_csrs().
>>
>>> If dump_csrs() gained other uses, the dumping of scause likely is
>>> wanted, but then likely no scause value would be available to pass
>>> in? So
>>> maybe its dumping actually wants to be conditional (and the parameter
>>> wants to be a boolean)?
>>
>> Good point. Honestly speaking, I don't know if it will be any other
>> usages
>> except this one. But to keep things generic I think it is good idea to
>> add conditional dumping of scause.
>
> As an alternative, we could simply remove the dump_csrs() argument and always
> print SCAUSE. When running in hypervisor mode, SCAUSE should contain the
> reason
> for the trap. Even it is lets say just hypercall and not something faulty, it
> will contain "Environment call from S-mode" what looks okay to be printed.
>
> I tend to prefer this approach slightly. What do you think?
It means that it'll be logged twice for an unexpected trap. As said, I can
only recommend to limit the volume of the output in such situations, as
sometimes all people may be able to get you is screenshots.
Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |