Re: [PATCH v3] x86/dom0: delay setting SMAP after dom0 build is done
On Thu, Aug 01, 2024 at 12:28:06PM +0200, Jan Beulich wrote:
> On 01.08.2024 11:52, Roger Pau Monne wrote:
> > @@ -1907,16 +1890,25 @@ void asmlinkage __init noreturn __start_xen(unsigned long mbi_p)
> > if ( cpu_has_smep && opt_smep != SMEP_HVM_ONLY )
> > setup_force_cpu_cap(X86_FEATURE_XEN_SMEP);
> > if ( boot_cpu_has(X86_FEATURE_XEN_SMEP) )
> > + {
> > set_in_cr4(X86_CR4_SMEP);
> > + cr4_pv32_mask = mmu_cr4_features & XEN_CR4_PV32_BITS;
>
> Could be just "cr4_pv32_mask = X86_CR4_SMEP" now?
Yes, indeed, same below then.
> > + }
> >
> > if ( !opt_smap )
> > setup_clear_cpu_cap(X86_FEATURE_SMAP);
> > if ( cpu_has_smap && opt_smap != SMAP_HVM_ONLY )
> > setup_force_cpu_cap(X86_FEATURE_XEN_SMAP);
> > if ( boot_cpu_has(X86_FEATURE_XEN_SMAP) )
> > - set_in_cr4(X86_CR4_SMAP);
> > -
> > - cr4_pv32_mask = mmu_cr4_features & XEN_CR4_PV32_BITS;
> > +    /*
> > +     * Set SMAP on the %cr4 mask so that it's set for APs on bringup, but
> > +     * don't set for the BSP until domain building is done.
> > +     *
> > +     * Don't set it in cr4_pv32_mask either, until it's also set on the
> > +     * BSP. Otherwise the BUG in cr4_pv32_restore would trigger for events
> > +     * received on the BSP.
> > +     */
> > +    mmu_cr4_features |= X86_CR4_SMAP;
>
> Don't you put APs at risk this way of triggering the BUG in
> cr4_pv32_restore()?
> They'll have the bit set in %cr4, but the bit remains clear in cr4_pv32_mask
> until much later.
As long as the bit is set in %cr4 but not in cr4_pv32_mask, the BUG in
cr4_pv32_restore won't trigger.
> > @@ -2048,6 +2040,18 @@ void asmlinkage __init noreturn __start_xen(unsigned long mbi_p)
> > if ( !dom0 )
> > panic("Could not set up DOM0 guest OS\n");
> >
> > +    /*
> > +     * Enable SMAP only after being done with the domain building phase, as
> > +     * the PV builder switches to the domain page-tables and must be run
> > +     * with SMAP disabled.
> > +     */
> > +    if ( boot_cpu_has(X86_FEATURE_XEN_SMAP) )
> > +    {
> > +        ASSERT(mmu_cr4_features & X86_CR4_SMAP);
> > +        write_cr4(read_cr4() | X86_CR4_SMAP);
> > +        cr4_pv32_mask = mmu_cr4_features & XEN_CR4_PV32_BITS;
> > +    }
>
> Similarly for the BSP here: If we take an NMI between setting CR4.SMAP and
> setting the bit in cr4_pv32_mask, cr4_pv32_restore() would hit the BUG
> there if I'm not mistaken.
No, having extra bits set in %cr4 that are not present in cr4_pv32_mask
won't trigger the BUG. It's the other way around: the BUG fires when bits
are set in cr4_pv32_mask but not in %cr4. As long as %cr4 is always
updated ahead of setting the bit in cr4_pv32_mask, that's fine.
See the logic in cr4_pv32_restore:
    /* Check that _all_ of the bits intended to be set actually are. */
    mov   %cr4, %rax
    and   cr4_pv32_mask(%rip), %rax
    cmp   cr4_pv32_mask(%rip), %rax
    je    1f
    [...]
    BUG
Otherwise none of this would be safe, as it's impossible to set %cr4
and cr4_pv32_mask atomically in a single instruction.
> I further fear that switching things around won't
> help either. The code you remove from create_dom0() looks to have the same
> issue. The only NMI-safe sequence looks to be: Clear both bits from %cr4,
> update cr4_pv32_mask as wanted, and then write %cr4 with the bits from
> cr4_pv32_mask ORed in.
No, that won't be safe, as cr4_pv32_mask would contain bits not set in
%cr4, and cr4_pv32_restore would hit the BUG. The update order is the
other way around: first set the bit in %cr4, then in cr4_pv32_mask.
Thanks, Roger.