[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH 21/22] x86/traps: Introduce FRED entrypoints
On 14/08/2025 4:57 pm, Jan Beulich wrote: > On 08.08.2025 22:23, Andrew Cooper wrote: >> --- a/xen/arch/x86/include/asm/asm_defns.h >> +++ b/xen/arch/x86/include/asm/asm_defns.h >> @@ -315,6 +315,71 @@ static always_inline void stac(void) >> subq $-(UREGS_error_code-UREGS_r15+\adj), %rsp >> .endm >> >> +/* >> + * Push and clear GPRs >> + */ >> +.macro PUSH_AND_CLEAR_GPRS >> + push %rdi >> + xor %edi, %edi >> + push %rsi >> + xor %esi, %esi >> + push %rdx >> + xor %edx, %edx >> + push %rcx >> + xor %ecx, %ecx >> + push %rax >> + xor %eax, %eax >> + push %r8 >> + xor %r8d, %r8d >> + push %r9 >> + xor %r9d, %r9d >> + push %r10 >> + xor %r10d, %r10d >> + push %r11 >> + xor %r11d, %r11d >> + push %rbx >> + xor %ebx, %ebx >> + push %rbp >> +#ifdef CONFIG_FRAME_POINTER >> +/* Indicate special exception stack frame by inverting the frame pointer. */ >> + mov %rsp, %rbp >> + notq %rbp >> +#else >> + xor %ebp, %ebp >> +#endif >> + push %r12 >> + xor %r12d, %r12d >> + push %r13 >> + xor %r13d, %r13d >> + push %r14 >> + xor %r14d, %r14d >> + push %r15 >> + xor %r15d, %r15d >> +.endm >> + >> +/* >> + * POP GPRs from a UREGS_* frame on the stack. Does not modify flags. >> + * >> + * @rax: Alternative destination for the %rax value on the stack. >> + */ >> +.macro POP_GPRS rax=%rax >> + pop %r15 >> + pop %r14 >> + pop %r13 >> + pop %r12 >> + pop %rbp >> + pop %rbx >> + pop %r11 >> + pop %r10 >> + pop %r9 >> + pop %r8 >> + pop \rax >> + pop %rcx >> + pop %rdx >> + pop %rsi >> + pop %rdi >> +.endm > Hmm, yes, differences are apparently large enough to warrant the redundancy > with SAVE_ALL / RESTORE_ALL. You may recall this construct from prior attempts I've had to remove SAVE_ALL/RESTORE_ALL, even with the \rax parameter for SVM. I still intend to complete that work at some point. > >> --- a/xen/arch/x86/include/asm/msr.h >> +++ b/xen/arch/x86/include/asm/msr.h >> @@ -202,9 +202,9 @@ static inline unsigned long read_gs_base(void) >> >> static inline unsigned long read_gs_shadow(void) >> { >> - unsigned long base; >> + unsigned long base, cr4 = read_cr4(); >> >> - if ( read_cr4() & X86_CR4_FSGSBASE ) >> + if ( !(cr4 & X86_CR4_FRED) && (cr4 & X86_CR4_FSGSBASE) ) >> { >> asm volatile ( "swapgs" ); >> base = __rdgsbase(); >> @@ -234,7 +234,9 @@ static inline void write_gs_base(unsigned long base) >> >> static inline void write_gs_shadow(unsigned long base) >> { >> - if ( read_cr4() & X86_CR4_FSGSBASE ) >> + unsigned long cr4 = read_cr4(); >> + >> + if ( !(cr4 & X86_CR4_FRED) && (cr4 & X86_CR4_FSGSBASE) ) >> { >> asm volatile ( "swapgs\n\t" >> "wrgsbase %0\n\t" > I don't quite get how these changes fit into this patch. Without the change, read_registers() suffers #UD because of the SWAPGS. This recurses until hitting the guard page, then repeats the same on the #DF stack. And because stacks work nicely under FRED, you eventually hit #DF's guard page. Strictly speaking it's only read_gs_shadow() which we need to change to make exception handling work, but I fixed both at the same time. That said, I have actually cleaned this codepath up with the MSR work because the code gen in read_registers() is terrible. Due to no-strict-aliasing, every store into state-> forces a recalculation of get_cpu_info(), meaning that read_cr4() cannot be hoisted, and there's a branch in every helper. I'm still not sure how best to fit it into this series. > >> --- a/xen/arch/x86/traps.c >> +++ b/xen/arch/x86/traps.c >> @@ -1013,6 +1013,32 @@ void show_execution_state_nmi(const cpumask_t *mask, >> bool show_all) >> printk("Non-responding CPUs: {%*pbl}\n", >> CPUMASK_PR(&show_state_mask)); >> } >> >> +static const char *x86_et_name(unsigned int type) >> +{ >> + static const char *const names[] = { >> + [X86_ET_EXT_INTR] = "EXT_INTR", >> + [X86_ET_NMI] = "NMI", >> + [X86_ET_HW_EXC] = "HW_EXC", >> + [X86_ET_SW_INT] = "SW_INT", >> + [X86_ET_PRIV_SW_EXC] = "PRIV_SW_EXEC", >> + [X86_ET_SW_EXC] = "SW_EXEC", >> + [X86_ET_OTHER] = "OTHER", >> + }; >> + >> + return (type < ARRAY_SIZE(names) && names[type]) ? names[type] : "???"; >> +} >> + >> +static const char *x86_et_other_name(unsigned int vec) > This isn't really a vector, is it? Well - you are decoding the field name regs->fred_ss.vector. Also I see I've typo'd EXEC in two of those names. > >> +{ >> + static const char *const names[] = { >> + [0] = "MTF", >> + [1] = "SYSCALL", >> + [2] = "SYSENTER", >> + }; >> + >> + return (vec < ARRAY_SIZE(names) && names[vec][0]) ? names[vec] : "???"; > Did you mean to check names[ves] for being NULL? Or is this a leftover > from the array being something like names[][10]? Oh, bad copy&paste. Will fix. > >> --- a/xen/arch/x86/x86_64/Makefile >> +++ b/xen/arch/x86/x86_64/Makefile >> @@ -1,6 +1,7 @@ >> obj-$(CONFIG_PV32) += compat/ >> >> obj-bin-y += entry.o >> +obj-bin-y += entry-fred.o > For the ordering here, ... > >> --- /dev/null >> +++ b/xen/arch/x86/x86_64/entry-fred.S >> @@ -0,0 +1,35 @@ >> +/* SPDX-License-Identifier: GPL-2.0-or-later */ >> + >> + .file "x86_64/entry-fred.S" >> + >> +#include <asm/asm_defns.h> >> +#include <asm/page.h> >> + >> + .section .text.entry, "ax", @progbits >> + >> + /* The Ring3 entry point is required to be 4k aligned. */ >> + >> +FUNC(entry_FRED_R3, 4096) > ... doesn't this 4k-alignment requirement suggest we want to put > entry-fred.o first? Perhaps, but that is quite subtle. I did also consider a .text.entry.page_aligned section, but .text.entry only matters for XPTI which (as agreed), I'm not intending to implement in FRED mode unless it proves to be necessary. Also IIRC there's still a symbol bug where _sentrytext takes priority over entry_FRED_R3, so the backtrace is effectively wrong. (These are all bad excuses, but some parts of this series are rather old.) > Also, might it be more natural to use PAGE_SIZE > here? I did debate that, but the spec uses 0xfff, not pages, even if the pipline surely does have an optimisation for chopping 12 metadata bits off the bottom of a pointer. ~Andrew
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |