|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [RFC PATCH for-4.22] x86/hvm: Introduce force_x2apic flag
On Wed, Oct 29, 2025 at 08:06:36PM +0000, Teddy Astie wrote:
> Le 29/10/2025 à 19:55, Roger Pau Monné a écrit :
> > On Wed, Oct 29, 2025 at 06:26:14PM +0000, Teddy Astie wrote:
> >> Introduce a new flag to force the x2APIC enabled and preventing a
> >> guest from switching back LAPIC to xAPIC mode.
> >>
> >> The semantics of this mode are based IA32_XAPIC_DISABLE_STATUS
> >> architectural MSR of Intel specification.
> >>
> >> Signed-off-by: Teddy Astie <teddy.astie@xxxxxxxxxx>
> >
> > Thanks, some initial comments below.
> >
> >> ---
> >> This feature can be useful for various reasons, starting with SEV as
> >> it is complicated (especially with SEV-ES) to handle MMIO, and legacy
> >> xAPIC is one thing that needs MMIO intercepts (and Linux uses it during
> >> boot unless x2APIC is initially enabled, even if it switches to
> >> x2apic afterward). It could also be interesting to reduce the attack
> >> surface of the hypervisor (by only exposing x2apic to the guest).
> >>
> >> As it can allow to have MMIO-less guest (using PVH), perhaps it can
> >> be enough for avoiding the problematic cases of virtualized INVLPGB
> >> (when we have it).
> >>
> >> In my testing, Linux, FreeBSD and PV-shim works fine with it; OVMF
> >> freezes for some reason, NetBSD doesn't support it (no x2apic support
> >> as Xen guest). HVM BIOS gets stuck at SeaBIOS as it expects booting
> >> with xAPIC.
> >>
> >> On Intel platforms, it would be better to expose the
> >> IA32_XAPIC_DISABLE_STATUS architectural MSR to advertise this to
> >> guest, but it's non-trivial as it needs to be properly exposed
> >> through IA32_ARCH_CAPABILITIES which is currently passed-through.
> >>
> >> docs/man/xl.cfg.5.pod.in | 7 +++++++
> >> tools/libs/light/libxl_types.idl | 1 +
> >> tools/libs/light/libxl_x86.c | 4 ++++
> >> tools/xl/xl_parse.c | 1 +
> >> xen/arch/x86/domain.c | 2 +-
> >> xen/arch/x86/hvm/hvm.c | 2 ++
> >> xen/arch/x86/hvm/vlapic.c | 23 ++++++++++++++++++++++-
> >> xen/arch/x86/include/asm/domain.h | 2 ++
> >> xen/arch/x86/include/asm/hvm/domain.h | 3 +++
> >> xen/include/public/arch-x86/xen.h | 12 +++++++++++-
> >> 10 files changed, 54 insertions(+), 3 deletions(-)
> >
> > Seeing there are no changes to the ACPI tables exposed to the guest,
> > do we want to start exposing X2APIC MADT entries instead of the plain
> > APIC entries?
> >
> > The ACPI spec seems to suggest that you can expose APIC entries for
> > APICs below 255, for compatibility reasons. But given that we would
> > force the guest to use X2APIC mode it would certainly need to
> > understand how to process X2APIC MADT entries anyway.
> >
> > Not sure it makes much of a difference, but wondering whether OSes
> > expect X2APIC MADT entries if the mode is locked to X2APIC.
> >
>
> In all OS I checked, they see x2APIC MADT entries as a different format
> for LAPIC entries and don't really link it with whether x2APIC is used
> or not.
>
> But I think it's safe to assume all OS that supports x2APIC has support
> for x2APIC MADT entries, which could make ACPI table generation simpler
> (especially for dealing with LAPIC IDs over 255)
>
> >>
> >> diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in
> >> index ad1553c5e9..01b41d93c0 100644
> >> --- a/docs/man/xl.cfg.5.pod.in
> >> +++ b/docs/man/xl.cfg.5.pod.in
> >> @@ -3198,6 +3198,13 @@ option.
> >>
> >> If using this option is necessary to fix an issue, please report a bug.
> >>
> >> +=item B<force_x2apic=BOOLEAN>
> >> +
> >> +Force the LAPIC in x2APIC mode and prevent the guest from disabling
> >> +it or switching to xAPIC mode.
> >> +
> >> +This option is disabled by default.
> >> +
> >> =back
> >>
> >> =head1 SEE ALSO
> >> diff --git a/tools/libs/light/libxl_types.idl
> >> b/tools/libs/light/libxl_types.idl
> >> index d64a573ff3..b95278007e 100644
> >> --- a/tools/libs/light/libxl_types.idl
> >> +++ b/tools/libs/light/libxl_types.idl
> >> @@ -738,6 +738,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
> >> ("arm_sci", libxl_arm_sci),
> >> ])),
> >> ("arch_x86", Struct(None, [("msr_relaxed", libxl_defbool),
> >> + ("force_x2apic", libxl_defbool)
> >
> > This addition needs a new define in libxl.h to signal it's presence,
> > see LIBXL_HAVE_* defines in there.
> >
>
> Something like LIBXL_HAVE_FORCE_X2APIC ?
Yes, something like that. Not sure we want to add X86 somewhere in
there, but X2APIC is already x86-specific so unlikely to have any
meaning for other arches.
> >> ])),
> >> # Alternate p2m is not bound to any architecture or guest type, as
> >> it is
> >> # supported by x86 HVM and ARM support is planned.
> >> diff --git a/tools/libs/light/libxl_x86.c b/tools/libs/light/libxl_x86.c
> >> index 60d4e8661c..2e0205d2a2 100644
> >> --- a/tools/libs/light/libxl_x86.c
> >> +++ b/tools/libs/light/libxl_x86.c
> >> @@ -26,6 +26,9 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
> >> if (libxl_defbool_val(d_config->b_info.arch_x86.msr_relaxed))
> >> config->arch.misc_flags |= XEN_X86_MSR_RELAXED;
> >>
> >> + if (libxl_defbool_val(d_config->b_info.arch_x86.force_x2apic))
> >> + config->arch.misc_flags |= XEN_X86_FORCE_X2APIC;
> >> +
> >> if (libxl_defbool_val(d_config->b_info.trap_unmapped_accesses)) {
> >> LOG(ERROR, "trap_unmapped_accesses is not supported on
> >> x86\n");
> >> return ERROR_FAIL;
> >> @@ -818,6 +821,7 @@ int libxl__arch_domain_build_info_setdefault(libxl__gc
> >> *gc,
> >> {
> >> libxl_defbool_setdefault(&b_info->acpi, true);
> >> libxl_defbool_setdefault(&b_info->arch_x86.msr_relaxed, false);
> >> + libxl_defbool_setdefault(&b_info->arch_x86.force_x2apic, false);
> >> libxl_defbool_setdefault(&b_info->trap_unmapped_accesses, false);
> >>
> >> if (b_info->type == LIBXL_DOMAIN_TYPE_HVM) {
> >> diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c
> >> index af86d3186d..d84ab7c823 100644
> >> --- a/tools/xl/xl_parse.c
> >> +++ b/tools/xl/xl_parse.c
> >> @@ -3041,6 +3041,7 @@ skip_usbdev:
> >> "If it fixes an issue you are having please report
> >> to "
> >> "xen-devel@xxxxxxxxxxxxxxxxxxxx.\n");
> >>
> >> + xlu_cfg_get_defbool(config, "force_x2apic",
> >> &b_info->arch_x86.force_x2apic, 0);
> >> xlu_cfg_get_defbool(config, "vpmu", &b_info->vpmu, 0);
> >>
> >> xlu_cfg_destroy(config);
> >> diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
> >> index 19fd86ce88..02f650a614 100644
> >> --- a/xen/arch/x86/domain.c
> >> +++ b/xen/arch/x86/domain.c
> >> @@ -704,7 +704,7 @@ int arch_sanitise_domain_config(struct
> >> xen_domctl_createdomain *config)
> >> return -EINVAL;
> >> }
> >>
> >> - if ( config->arch.misc_flags & ~XEN_X86_MSR_RELAXED )
> >> + if ( config->arch.misc_flags & ~(XEN_X86_MSR_RELAXED |
> >> XEN_X86_FORCE_X2APIC) )
> >> {
> >> dprintk(XENLOG_INFO, "Invalid arch misc flags %#x\n",
> >> config->arch.misc_flags);
> >> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> >> index 0c60faa39d..73cbac0f22 100644
> >> --- a/xen/arch/x86/hvm/hvm.c
> >> +++ b/xen/arch/x86/hvm/hvm.c
> >> @@ -616,6 +616,8 @@ int hvm_domain_initialise(struct domain *d,
> >> INIT_LIST_HEAD(&d->arch.hvm.mmcfg_regions);
> >> INIT_LIST_HEAD(&d->arch.hvm.msix_tables);
> >>
> >> + d->arch.hvm.force_x2apic = config->arch.misc_flags &
> >> XEN_X86_FORCE_X2APIC;
> >> +
> >> rc = create_perdomain_mapping(d, PERDOMAIN_VIRT_START, 0, NULL,
> >> NULL);
> >> if ( rc )
> >> goto fail;
> >> diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
> >> index 993e972cd7..ae8df70d2e 100644
> >> --- a/xen/arch/x86/hvm/vlapic.c
> >> +++ b/xen/arch/x86/hvm/vlapic.c
> >> @@ -1116,6 +1116,20 @@ int guest_wrmsr_apic_base(struct vcpu *v, uint64_t
> >> val)
> >> if ( !has_vlapic(v->domain) )
> >> return X86EMUL_EXCEPTION;
> >>
> >> + if ( has_force_x2apic(v->domain) )
> >> + {
> >> + /*
> >> + * We implement the same semantics as
> >> MSR_IA32_XAPIC_DISABLE_STATUS:
> >> + * LEGACY_XAPIC_DISABLED which rejects any attempt at clearing
> >> + * IA32_APIC_BASE.EXTD, thus forcing the LAPIC in x2APIC mode.
> >> + */
> >> + if ( !(val & APIC_BASE_EXTD) )
> >> + {
> >> + gprintk(XENLOG_WARNING, "tried to disable x2APIC while forced
> >> on\n");
> >> + return X86EMUL_EXCEPTION;
> >> + }
> >> + }
> >> +
> >> /* Attempting to set reserved bits? */
> >> if ( val & ~(APIC_BASE_ADDR_MASK | APIC_BASE_ENABLE | APIC_BASE_BSP |
> >> (cp->basic.x2apic ? APIC_BASE_EXTD : 0)) )
> >> @@ -1474,7 +1488,14 @@ void vlapic_reset(struct vlapic *vlapic)
> >> if ( v->vcpu_id == 0 )
> >> vlapic->hw.apic_base_msr |= APIC_BASE_BSP;
> >>
> >> - vlapic_set_reg(vlapic, APIC_ID, (v->vcpu_id * 2) << 24);
> >> + if ( has_force_x2apic(v->domain) )
> >> + {
> >> + vlapic->hw.apic_base_msr |= APIC_BASE_EXTD;
> >> + set_x2apic_id(vlapic);
> >> + }
> >> + else
> >> + vlapic_set_reg(vlapic, APIC_ID, (v->vcpu_id * 2) << 24);
> >> +
> >> vlapic_do_init(vlapic);
> >> }
> >>
> >> diff --git a/xen/arch/x86/include/asm/domain.h
> >> b/xen/arch/x86/include/asm/domain.h
> >> index 5df8c78253..771992d156 100644
> >> --- a/xen/arch/x86/include/asm/domain.h
> >> +++ b/xen/arch/x86/include/asm/domain.h
> >> @@ -509,6 +509,8 @@ struct arch_domain
> >> #define has_pirq(d) (!!((d)->arch.emulation_flags &
> >> X86_EMU_USE_PIRQ))
> >> #define has_vpci(d) (!!((d)->arch.emulation_flags & X86_EMU_VPCI))
> >>
> >> +#define has_force_x2apic(d) ((d)->arch.hvm.force_x2apic)
> >> +
> >> #define gdt_ldt_pt_idx(v) \
> >> ((v)->vcpu_id >> (PAGETABLE_ORDER - GDT_LDT_VCPU_SHIFT))
> >> #define pv_gdt_ptes(v) \
> >> diff --git a/xen/arch/x86/include/asm/hvm/domain.h
> >> b/xen/arch/x86/include/asm/hvm/domain.h
> >> index 333501d5f2..b56fa08b73 100644
> >> --- a/xen/arch/x86/include/asm/hvm/domain.h
> >> +++ b/xen/arch/x86/include/asm/hvm/domain.h
> >> @@ -108,6 +108,9 @@ struct hvm_domain {
> >> /* Compatibility setting for a bug in x2APIC LDR */
> >> bool bug_x2apic_ldr_vcpu_id;
> >>
> >> + /* LAPIC is forced in x2APIC mode */
> >> + bool force_x2apic;
> >
> > This should be a field in the vlapic struct, but seeing this I wonder
> > whether we want to virtualize MSR_IA32_XAPIC_DISABLE_STATUS MSR and
> > set the bit there. This would also help with migrating the option, as
> > you could then migrate the "locked" status easily by just migrating
> > the contents of the MSR_IA32_XAPIC_DISABLE_STATUS MSR.
> >
>
> One issue with MSR_IA32_XAPIC_DISABLE_STATUS is that it is only
> meaningful on Intel platforms (unless we also virtualize it on AMD ?),
> and I haven't found a AMD-specific mecanism for exposing it.
> Most operating systems don't try to disable x2apic (unless told to do
> it) if it is initially enabled ("enabled by firmware").
Yeah, I see the availability of MSR_IA32_XAPIC_DISABLE_STATUS is
exposed in MSR_ARCH_CAPABILITIES, which is only present on Intel
platforms.
I also haven't been able to find a way to expose the APIC is locked to
x2apic mode in any AMD manuals, which is a shame.
For Intel we should expose this when possible in
MSR_IA32_XAPIC_DISABLE_STATUS. And we need to migrate the selection
in the stream as part of the APIC data.
Thanks, Roger.
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |