Re: [PATCH v2 23/23] x86/pv: Adjust eflags handling for FRED mode


  • To: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Mon, 1 Sep 2025 16:17:42 +0200
  • Cc: Roger Pau Monné <roger.pau@xxxxxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Mon, 01 Sep 2025 14:18:03 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 28.08.2025 17:04, Andrew Cooper wrote:
> ERETU, unlike IRET, requires the sticky-1 bit (bit 1) to be set, and reserved
> bits to be clear.  Notably this means that dom0_construct() must set
> X86_EFLAGS_MBS in order for a PV dom0 to start.
> 
> Adjust arch_set_info_guest*() and hypercall_iret() which consume flags to
> clamp the reserved bits.
> 
> This is a minor ABI change, but by the same argument as commit
> 9f892f84c279 ("x86/domctl: Stop using XLAT_cpu_user_regs()"), this change will
> happen naturally when the vCPU schedules.

It's not that similar, is it? MBS will be observed set once guest context is
entered, irrespective of any scheduling. So it's entirely benign if we set
it up-front, except of course for a back-to-back set/get.

> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> ---
> CC: Jan Beulich <JBeulich@xxxxxxxx>
> CC: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> 
> v2:
>  * New
> 
> The handling of VM is complicated.
> 
> It turns out that it's simply ignored by IRET in Long Mode (i.e. clearing it in
> commit 0e47f92b0725 ("x86: force EFLAGS.IF on when exiting to PV guests")
> wasn't actually necessary) but ERETU does care.
> 
> But, it's unclear how to handle this in arch_set_info().  We must preserve
> it for HVM guests (which can use vm86 mode).  PV32 has special handling but
> only in hypercall_iret(), not in arch_set_info().

I think we need to either reject or clear VM, NT, IOPL, and whatever else
would make ERETU unhappy (for IOPL we already do so). It simply is of no
use to ...

> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -1273,7 +1273,7 @@ int arch_set_info_guest(
>          v->arch.user_regs.rax               = c.nat->user_regs.rax;
>          v->arch.user_regs.rip               = c.nat->user_regs.rip;
>          v->arch.user_regs.cs                = c.nat->user_regs.cs;
> -        v->arch.user_regs.rflags            = c.nat->user_regs.rflags;
> +        v->arch.user_regs.rflags            = (c.nat->user_regs.rflags & X86_EFLAGS_ALL) | X86_EFLAGS_MBS;
>          v->arch.user_regs.rsp               = c.nat->user_regs.rsp;
>          v->arch.user_regs.ss                = c.nat->user_regs.ss;
>          v->arch.pv.es                       = c.nat->user_regs.es;
> @@ -1297,7 +1297,7 @@ int arch_set_info_guest(
>          v->arch.user_regs.eax               = c.cmp->user_regs.eax;
>          v->arch.user_regs.eip               = c.cmp->user_regs.eip;
>          v->arch.user_regs.cs                = c.cmp->user_regs.cs;
> -        v->arch.user_regs.eflags            = c.cmp->user_regs.eflags;
> +        v->arch.user_regs.eflags            = (c.cmp->user_regs.eflags & X86_EFLAGS_ALL) | X86_EFLAGS_MBS;
>          v->arch.user_regs.esp               = c.cmp->user_regs.esp;
>          v->arch.user_regs.ss                = c.cmp->user_regs.ss;
>          v->arch.pv.es                       = c.cmp->user_regs.es;

... accept the bits here, just for the first exit to guest mode to fault on
the ERETU. The guest would have a hard time to recover from that, I expect.
Yet perhaps we should do this only conditionally when FRED is active. Then
again a VM migrating from a pre-FRED host to a FRED one might observe the
(minor) behavioral change later on.
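
To illustrate the "clear" option discussed above, here is a minimal sketch, not code from the patch or from Xen. The SKETCH_ macro names and the helper sketch_sanitise_pv_eflags() are hypothetical; the bit values are the architectural EFLAGS positions. The set of bits cleared (VM, NT, IOPL) follows the list given in this reply and has not been checked against the full set of conditions ERETU enforces.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* Architectural EFLAGS bit values; the SKETCH_ names are local to this example. */
#define SKETCH_EFLAGS_MBS     0x00000002u /* bit 1: must be set */
#define SKETCH_EFLAGS_IOPL    0x00003000u /* bits 12-13 */
#define SKETCH_EFLAGS_NT      0x00004000u /* bit 14 */
#define SKETCH_EFLAGS_VM      0x00020000u /* bit 17 */

/* All architecturally defined flag bits (CF ... ID), excluding MBS. */
#define SKETCH_EFLAGS_DEFINED 0x003f7fd5u

static_assert(!(SKETCH_EFLAGS_DEFINED & SKETCH_EFLAGS_MBS),
              "MBS is forced separately, not taken from the guest value");

/*
 * Hypothetical helper: turn a guest-supplied flags value into one that
 * has reserved bits clear, VM/NT/IOPL clear, and the must-be-set bit set.
 */
static inline uint64_t sketch_sanitise_pv_eflags(uint64_t flags)
{
    flags &= SKETCH_EFLAGS_DEFINED;                 /* drop reserved bits */
    flags &= ~(uint64_t)(SKETCH_EFLAGS_VM |
                         SKETCH_EFLAGS_NT |
                         SKETCH_EFLAGS_IOPL);       /* drop VM, NT, IOPL */
    return flags | SKETCH_EFLAGS_MBS;               /* force bit 1 */
}
```

The "reject" alternative would instead compare the input against the sanitised value and fail the call when they differ in anything other than MBS.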

> --- a/xen/arch/x86/hvm/domain.c
> +++ b/xen/arch/x86/hvm/domain.c
> @@ -194,7 +194,7 @@ int arch_set_info_hvm_guest(struct vcpu *v, const struct vcpu_hvm_context *ctx)
>          uregs->rsi    = regs->esi;
>          uregs->rdi    = regs->edi;
>          uregs->rip    = regs->eip;
> -        uregs->rflags = regs->eflags;
> +        uregs->rflags = (regs->eflags & X86_EFLAGS_ALL) | X86_EFLAGS_MBS;
>  
>          v->arch.hvm.guest_cr[0] = regs->cr0;
>          v->arch.hvm.guest_cr[3] = regs->cr3;
> @@ -245,7 +245,7 @@ int arch_set_info_hvm_guest(struct vcpu *v, const struct vcpu_hvm_context *ctx)
>          uregs->rsi    = regs->rsi;
>          uregs->rdi    = regs->rdi;
>          uregs->rip    = regs->rip;
> -        uregs->rflags = regs->rflags;
> +        uregs->rflags = (regs->rflags & X86_EFLAGS_ALL) | X86_EFLAGS_MBS;

Why would the HVM code need changing at all? We never ERETU there.

Jan



 

