[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH for-4.14] x86/hap: use get_gfn_type in hap_update_paging_modes



On Wed, Jun 17, 2020 at 8:29 AM Roger Pau Monné <roger.pau@xxxxxxxxxx> wrote:
>
> On Wed, Jun 17, 2020 at 06:49:16AM -0600, Tamas K Lengyel wrote:
> > On Wed, Jun 17, 2020 at 2:25 AM Roger Pau Monné <roger.pau@xxxxxxxxxx> 
> > wrote:
> > >
> > > On Tue, Jun 16, 2020 at 12:31:06PM -0700, Tamas K Lengyel wrote:
> > > > While forking VMs running a small RTOS systems (Zephyr) a Xen crash has 
> > > > been
> > > > observed due to a mm-lock order violation while copying the HVM CPU 
> > > > context
> > > > from the parent. This issue has been identified to be due to
> > > > hap_update_paging_modes getting a lock on the gfn using get_gfn. This 
> > > > call also
> > > > creates a shared entry in the fork's memory map for the cr3 gfn. The 
> > > > function
> > > > later calls hap_update_cr3 while holding the paging_lock, which results 
> > > > in the
> > > > lock-order violation in vmx_load_pdptrs when it tries to unshare the 
> > > > above entry.
> > > >
> > > > This issue has not affected VMs running other OS's as a call to 
> > > > vmx_load_pdptrs
> > > > is benign if PAE is not enabled or if EFER_LMA is set and returns before
> > > > trying to unshare and map the page.
> > > >
> > > > Using get_gfn_type to get a lock on the gfn avoids this problem as we 
> > > > can
> > > > populate the fork's gfn with a copied page instead of a shared entry if 
> > > > its
> > > > needed, thus avoiding the lock order violation while holding 
> > > > paging_lock.
> > > >
> > > > Signed-off-by: Tamas K Lengyel <tamas.lengyel@xxxxxxxxx>
> > > > ---
> > > > The bug seems to have been present since commit 4cb6c4f4941, only 
> > > > discovered
> > > > now due to the heavy use of mem_sharing with VM forks. As this is a 
> > > > simple
> > > > bug-fix it would be nice to include it in the 4.14 release.
> > >
> > > I agree it seems like a candidate bugfix to be included. I've added
> > > Paul to the Cc so he is aware.
> > >
> > > > ---
> > > >  xen/arch/x86/mm/hap/hap.c | 17 ++++++++++++-----
> > > >  1 file changed, 12 insertions(+), 5 deletions(-)
> > > >
> > > > diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
> > > > index 7f84d0c6ea..9ae4c3ae6e 100644
> > > > --- a/xen/arch/x86/mm/hap/hap.c
> > > > +++ b/xen/arch/x86/mm/hap/hap.c
> > > > @@ -748,12 +748,19 @@ static void hap_update_paging_modes(struct vcpu 
> > > > *v)
> > > >      struct domain *d = v->domain;
> > > >      unsigned long cr3_gfn = v->arch.hvm.guest_cr[3] >> PAGE_SHIFT;
> > > >      p2m_type_t t;
> > > > +    p2m_query_t q = P2M_ALLOC;
> > > >
> > > > -    /* We hold onto the cr3 as it may be modified later, and
> > > > -     * we need to respect lock ordering. No need for
> > > > -     * checks here as they are performed by vmx_load_pdptrs
> > > > -     * (the potential user of the cr3) */
> > > > -    (void)get_gfn(d, cr3_gfn, &t);
> > > > +    /*
> > > > +     * We hold onto the cr3 as it may be modified later, and
> > > > +     * we need to respect lock ordering. Unshare here if we have to as 
> > > > to avoid
> > > > +     * a lock-order violation later while we are holding the 
> > > > paging_lock.
> > > > +     * Further checks are performed by vmx_load_pdptrs (the potential 
> > > > user of
> > > > +     * the cr3).
> > > > +     */
> > > > +    if ( hvm_pae_enabled(v) && !hvm_long_mode_active(v) )
> > > > +        q |= P2M_UNSHARE;
> > > > +
> > > > +    (void)get_gfn_type(d, cr3_gfn, &t, q);
> > >
> > > While there I think you can drop the cast to void.
> >
> > Sure.
> >
> > >
> > > In order for this to be more resilient, maybe it would be better to
> > > just use get_gfn_unshare directly and avoid checking what paging mode
> > > the guest is currently using?
> > >
> > > Or would that be too expensive in terms of performance for the not
> > > affected case?
> >
> > That's what I originally considered sending in but yes, in the fuzzing
> > case it would mean a full-page copy for each iteration even on
> > unaffected cases instead of a one-time shared entry setup. That would
> > be a considerable waste.
>
> Right, I'm afraid I don't really like the implementation details of
> vmx_load_pdptrs leaking into hap_update_paging_modes which is a
> generic function, so IMO a cleaner solution would be to always use
> P2M_UNSHARE.

That's not an acceptable route due to the overhead it brings to
unaffected use-cases. I find this reasoning for a purely style
preference we are willing to sacrifice performance. In that case I'm
going to stop sending fixes upstream and simply carry patches
out-of-tree.

Tamas



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.