[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [RFC PATCH for-4.22] x86/hvm: Introduce force_x2apic flag


  • To: Teddy Astie <teddy.astie@xxxxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Thu, 30 Oct 2025 09:24:27 +0100
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=citrix.com; dmarc=pass action=none header.from=citrix.com; dkim=pass header.d=citrix.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=Isn9i84sEB/9/zrmh85TkuFwHbfLi8sDWH9laFnBZdA=; b=caA+6/TDB45beRUii7eJdW1kkwC0Tqwgf2EB3KJ/mfKiO7B9cuA05xIGdcsmRXmlo1/7IKGdUEoi7G76oUWFHIDLpk551B1SJEVWwnB5qy9TFWFSVED+Rr7sEAHwEsjysyjcYCSVpANwACWCAbZHjwOnxrfDRu0uY3MXs3T6OT6vK8RaN4nOXnIuxgQ5f9a7LIbsmJKiBFJDsumPU24Sw4uT5sakuuBVEStZSLn9manb18Oa+XaZBoWDoY+HlnKfaR1foQBNLFAneIeyYH2VVFZim3tuvy3jfiVAbPbc+sliBaz0KjmV7RvIvKIVYL9y6OCsBnmz4llKgR7GIYo96w==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=kaz42UlKVbS221XXQx37A/SBF23rxlRCsKUuN3gp5HxNhPnR5mfszFOn1RbgAUAQOyTR+EiHkz0anzAPrgAeWdYm4/IH8OrzjkQyNaXPtoDrNnqCMx88Hql5dlcYMrw36+URjGnSMLWODVctfp4/jGqA8kTLBJg26dk6pcEyn+wtlDmvcOcDCXeJbNI+Brrp8LYpKpL7PRkude/6lngq8vJALZw8oSPIwjxbl2lBlqBTxpr9RHv5wpcnC6Z86rzt6njaDcg4aUmx5NdVEmVdeGQd9PfxwycIcwlvO9cvylUWKdGuSfer67DCnEMqDgz4pYNmRl5Ss/oy7OTW1es+EQ==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=citrix.com;
  • Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx, Anthony PERARD <anthony.perard@xxxxxxxxxx>, Juergen Gross <jgross@xxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Oleksii Kurochko <oleksii.kurochko@xxxxxxxxx>
  • Delivery-date: Thu, 30 Oct 2025 08:24:57 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Wed, Oct 29, 2025 at 08:06:36PM +0000, Teddy Astie wrote:
> Le 29/10/2025 à 19:55, Roger Pau Monné a écrit :
> > On Wed, Oct 29, 2025 at 06:26:14PM +0000, Teddy Astie wrote:
> >> Introduce a new flag to force the x2APIC enabled and preventing a
> >> guest from switching back LAPIC to xAPIC mode.
> >>
> >> The semantics of this mode are based IA32_XAPIC_DISABLE_STATUS
> >> architectural MSR of Intel specification.
> >>
> >> Signed-off-by: Teddy Astie <teddy.astie@xxxxxxxxxx>
> > 
> > Thanks, some initial comments below.
> > 
> >> ---
> >> This feature can be useful for various reasons, starting with SEV as
> >> it is complicated (especially with SEV-ES) to handle MMIO, and legacy
> >> xAPIC is one thing that needs MMIO intercepts (and Linux uses it during
> >> boot unless x2APIC is initially enabled, even if it switches to
> >> x2apic afterward). It could also be interesting to reduce the attack
> >> surface of the hypervisor (by only exposing x2apic to the guest).
> >>
> >> As it can allow to have MMIO-less guest (using PVH), perhaps it can
> >> be enough for avoiding the problematic cases of virtualized INVLPGB
> >> (when we have it).
> >>
> >> In my testing, Linux, FreeBSD and PV-shim works fine with it; OVMF
> >> freezes for some reason, NetBSD doesn't support it (no x2apic support
> >> as Xen guest). HVM BIOS gets stuck at SeaBIOS as it expects booting
> >> with xAPIC.
> >>
> >> On Intel platforms, it would be better to expose the
> >> IA32_XAPIC_DISABLE_STATUS architectural MSR to advertise this to
> >> guest, but it's non-trivial as it needs to be properly exposed
> >> through IA32_ARCH_CAPABILITIES which is currently passed-through.
> >>
> >>   docs/man/xl.cfg.5.pod.in              |  7 +++++++
> >>   tools/libs/light/libxl_types.idl      |  1 +
> >>   tools/libs/light/libxl_x86.c          |  4 ++++
> >>   tools/xl/xl_parse.c                   |  1 +
> >>   xen/arch/x86/domain.c                 |  2 +-
> >>   xen/arch/x86/hvm/hvm.c                |  2 ++
> >>   xen/arch/x86/hvm/vlapic.c             | 23 ++++++++++++++++++++++-
> >>   xen/arch/x86/include/asm/domain.h     |  2 ++
> >>   xen/arch/x86/include/asm/hvm/domain.h |  3 +++
> >>   xen/include/public/arch-x86/xen.h     | 12 +++++++++++-
> >>   10 files changed, 54 insertions(+), 3 deletions(-)
> > 
> > Seeing there are no changes to the ACPI tables exposed to the guest,
> > do we want to start exposing X2APIC MADT entries instead of the plain
> > APIC entries?
> > 
> > The ACPI spec seems to suggest that you can expose APIC entries for
> > APICs below 255, for compatibility reasons.  But given that we would
> > force the guest to use X2APIC mode it would certainly need to
> > understand how to process X2APIC MADT entries anyway.
> > 
> > Not sure it makes much of a difference, but wondering whether OSes
> > expect X2APIC MADT entries if the mode is locked to X2APIC.
> > 
> 
> In all OS I checked, they see x2APIC MADT entries as a different format 
> for LAPIC entries and don't really link it with whether x2APIC is used 
> or not.
> 
> But I think it's safe to assume all OS that supports x2APIC has support 
> for x2APIC MADT entries, which could make ACPI table generation simpler 
> (especially for dealing with LAPIC IDs over 255)
> 
> >>
> >> diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in
> >> index ad1553c5e9..01b41d93c0 100644
> >> --- a/docs/man/xl.cfg.5.pod.in
> >> +++ b/docs/man/xl.cfg.5.pod.in
> >> @@ -3198,6 +3198,13 @@ option.
> >>   
> >>   If using this option is necessary to fix an issue, please report a bug.
> >>   
> >> +=item B<force_x2apic=BOOLEAN>
> >> +
> >> +Force the LAPIC in x2APIC mode and prevent the guest from disabling
> >> +it or switching to xAPIC mode.
> >> +
> >> +This option is disabled by default.
> >> +
> >>   =back
> >>   
> >>   =head1 SEE ALSO
> >> diff --git a/tools/libs/light/libxl_types.idl 
> >> b/tools/libs/light/libxl_types.idl
> >> index d64a573ff3..b95278007e 100644
> >> --- a/tools/libs/light/libxl_types.idl
> >> +++ b/tools/libs/light/libxl_types.idl
> >> @@ -738,6 +738,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
> >>                                  ("arm_sci", libxl_arm_sci),
> >>                                 ])),
> >>       ("arch_x86", Struct(None, [("msr_relaxed", libxl_defbool),
> >> +                               ("force_x2apic", libxl_defbool)
> > 
> > This addition needs a new define in libxl.h to signal it's presence,
> > see LIBXL_HAVE_* defines in there.
> > 
> 
> Something like LIBXL_HAVE_FORCE_X2APIC ?

Yes, something like that.  Not sure we want to add X86 somewhere in
there, but X2APIC is already x86-specific so unlikely to have any
meaning for other arches.

> >>                                 ])),
> >>       # Alternate p2m is not bound to any architecture or guest type, as 
> >> it is
> >>       # supported by x86 HVM and ARM support is planned.
> >> diff --git a/tools/libs/light/libxl_x86.c b/tools/libs/light/libxl_x86.c
> >> index 60d4e8661c..2e0205d2a2 100644
> >> --- a/tools/libs/light/libxl_x86.c
> >> +++ b/tools/libs/light/libxl_x86.c
> >> @@ -26,6 +26,9 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
> >>       if (libxl_defbool_val(d_config->b_info.arch_x86.msr_relaxed))
> >>           config->arch.misc_flags |= XEN_X86_MSR_RELAXED;
> >>   
> >> +    if (libxl_defbool_val(d_config->b_info.arch_x86.force_x2apic))
> >> +        config->arch.misc_flags |= XEN_X86_FORCE_X2APIC;
> >> +
> >>       if (libxl_defbool_val(d_config->b_info.trap_unmapped_accesses)) {
> >>               LOG(ERROR, "trap_unmapped_accesses is not supported on 
> >> x86\n");
> >>               return ERROR_FAIL;
> >> @@ -818,6 +821,7 @@ int libxl__arch_domain_build_info_setdefault(libxl__gc 
> >> *gc,
> >>   {
> >>       libxl_defbool_setdefault(&b_info->acpi, true);
> >>       libxl_defbool_setdefault(&b_info->arch_x86.msr_relaxed, false);
> >> +    libxl_defbool_setdefault(&b_info->arch_x86.force_x2apic, false);
> >>       libxl_defbool_setdefault(&b_info->trap_unmapped_accesses, false);
> >>   
> >>       if (b_info->type == LIBXL_DOMAIN_TYPE_HVM) {
> >> diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c
> >> index af86d3186d..d84ab7c823 100644
> >> --- a/tools/xl/xl_parse.c
> >> +++ b/tools/xl/xl_parse.c
> >> @@ -3041,6 +3041,7 @@ skip_usbdev:
> >>                       "If it fixes an issue you are having please report 
> >> to "
> >>                       "xen-devel@xxxxxxxxxxxxxxxxxxxx.\n");
> >>   
> >> +    xlu_cfg_get_defbool(config, "force_x2apic", 
> >> &b_info->arch_x86.force_x2apic, 0);
> >>       xlu_cfg_get_defbool(config, "vpmu", &b_info->vpmu, 0);
> >>   
> >>       xlu_cfg_destroy(config);
> >> diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
> >> index 19fd86ce88..02f650a614 100644
> >> --- a/xen/arch/x86/domain.c
> >> +++ b/xen/arch/x86/domain.c
> >> @@ -704,7 +704,7 @@ int arch_sanitise_domain_config(struct 
> >> xen_domctl_createdomain *config)
> >>           return -EINVAL;
> >>       }
> >>   
> >> -    if ( config->arch.misc_flags & ~XEN_X86_MSR_RELAXED )
> >> +    if ( config->arch.misc_flags & ~(XEN_X86_MSR_RELAXED | 
> >> XEN_X86_FORCE_X2APIC) )
> >>       {
> >>           dprintk(XENLOG_INFO, "Invalid arch misc flags %#x\n",
> >>                   config->arch.misc_flags);
> >> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> >> index 0c60faa39d..73cbac0f22 100644
> >> --- a/xen/arch/x86/hvm/hvm.c
> >> +++ b/xen/arch/x86/hvm/hvm.c
> >> @@ -616,6 +616,8 @@ int hvm_domain_initialise(struct domain *d,
> >>       INIT_LIST_HEAD(&d->arch.hvm.mmcfg_regions);
> >>       INIT_LIST_HEAD(&d->arch.hvm.msix_tables);
> >>   
> >> +    d->arch.hvm.force_x2apic = config->arch.misc_flags & 
> >> XEN_X86_FORCE_X2APIC;
> >> +
> >>       rc = create_perdomain_mapping(d, PERDOMAIN_VIRT_START, 0, NULL, 
> >> NULL);
> >>       if ( rc )
> >>           goto fail;
> >> diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
> >> index 993e972cd7..ae8df70d2e 100644
> >> --- a/xen/arch/x86/hvm/vlapic.c
> >> +++ b/xen/arch/x86/hvm/vlapic.c
> >> @@ -1116,6 +1116,20 @@ int guest_wrmsr_apic_base(struct vcpu *v, uint64_t 
> >> val)
> >>       if ( !has_vlapic(v->domain) )
> >>           return X86EMUL_EXCEPTION;
> >>   
> >> +    if ( has_force_x2apic(v->domain) )
> >> +    {
> >> +        /*
> >> +        * We implement the same semantics as 
> >> MSR_IA32_XAPIC_DISABLE_STATUS:
> >> +        * LEGACY_XAPIC_DISABLED which rejects any attempt at clearing
> >> +        * IA32_APIC_BASE.EXTD, thus forcing the LAPIC in x2APIC mode.
> >> +        */
> >> +        if ( !(val & APIC_BASE_EXTD) )
> >> +        {
> >> +            gprintk(XENLOG_WARNING, "tried to disable x2APIC while forced 
> >> on\n");
> >> +            return X86EMUL_EXCEPTION;
> >> +        }
> >> +    }
> >> +
> >>       /* Attempting to set reserved bits? */
> >>       if ( val & ~(APIC_BASE_ADDR_MASK | APIC_BASE_ENABLE | APIC_BASE_BSP |
> >>                    (cp->basic.x2apic ? APIC_BASE_EXTD : 0)) )
> >> @@ -1474,7 +1488,14 @@ void vlapic_reset(struct vlapic *vlapic)
> >>       if ( v->vcpu_id == 0 )
> >>           vlapic->hw.apic_base_msr |= APIC_BASE_BSP;
> >>   
> >> -    vlapic_set_reg(vlapic, APIC_ID, (v->vcpu_id * 2) << 24);
> >> +    if ( has_force_x2apic(v->domain) )
> >> +    {
> >> +        vlapic->hw.apic_base_msr |= APIC_BASE_EXTD;
> >> +        set_x2apic_id(vlapic);
> >> +    }
> >> +    else
> >> +        vlapic_set_reg(vlapic, APIC_ID, (v->vcpu_id * 2) << 24);
> >> +
> >>       vlapic_do_init(vlapic);
> >>   }
> >>   
> >> diff --git a/xen/arch/x86/include/asm/domain.h 
> >> b/xen/arch/x86/include/asm/domain.h
> >> index 5df8c78253..771992d156 100644
> >> --- a/xen/arch/x86/include/asm/domain.h
> >> +++ b/xen/arch/x86/include/asm/domain.h
> >> @@ -509,6 +509,8 @@ struct arch_domain
> >>   #define has_pirq(d)        (!!((d)->arch.emulation_flags & 
> >> X86_EMU_USE_PIRQ))
> >>   #define has_vpci(d)        (!!((d)->arch.emulation_flags & X86_EMU_VPCI))
> >>   
> >> +#define has_force_x2apic(d) ((d)->arch.hvm.force_x2apic)
> >> +
> >>   #define gdt_ldt_pt_idx(v) \
> >>         ((v)->vcpu_id >> (PAGETABLE_ORDER - GDT_LDT_VCPU_SHIFT))
> >>   #define pv_gdt_ptes(v) \
> >> diff --git a/xen/arch/x86/include/asm/hvm/domain.h 
> >> b/xen/arch/x86/include/asm/hvm/domain.h
> >> index 333501d5f2..b56fa08b73 100644
> >> --- a/xen/arch/x86/include/asm/hvm/domain.h
> >> +++ b/xen/arch/x86/include/asm/hvm/domain.h
> >> @@ -108,6 +108,9 @@ struct hvm_domain {
> >>       /* Compatibility setting for a bug in x2APIC LDR */
> >>       bool bug_x2apic_ldr_vcpu_id;
> >>   
> >> +    /* LAPIC is forced in x2APIC mode */
> >> +    bool force_x2apic;
> > 
> > This should be a field in the vlapic struct, but seeing this I wonder
> > whether we want to virtualize MSR_IA32_XAPIC_DISABLE_STATUS MSR and
> > set the bit there.  This would also help with migrating the option, as
> > you could then migrate the "locked" status easily by just migrating
> > the contents of the MSR_IA32_XAPIC_DISABLE_STATUS MSR.
> > 
> 
> One issue with MSR_IA32_XAPIC_DISABLE_STATUS is that it is only 
> meaningful on Intel platforms (unless we also virtualize it on AMD ?), 
> and I haven't found a AMD-specific mecanism for exposing it.
> Most operating systems don't try to disable x2apic (unless told to do 
> it) if it is initially enabled ("enabled by firmware").

Yeah, I see the availability of MSR_IA32_XAPIC_DISABLE_STATUS is
exposed in MSR_ARCH_CAPABILITIES, which is only present on Intel
platforms.

I also haven't been able to find a way to expose the APIC is locked to
x2apic mode in any AMD manuals, which is a shame.

For Intel we should expose this when possible in
MSR_IA32_XAPIC_DISABLE_STATUS.  And we need to migrate the selection
in the stream as part of the APIC data.

Thanks, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.