
Re: [Xen-devel] [PATCH v5 3/7] x86/hap: improve hypervisor assisted guest TLB flush



On Fri, Feb 28, 2020 at 02:58:42PM +0100, Jan Beulich wrote:
> On 19.02.2020 18:43, Roger Pau Monne wrote:
> > --- a/xen/arch/x86/mm/hap/hap.c
> > +++ b/xen/arch/x86/mm/hap/hap.c
> > @@ -669,32 +669,28 @@ static void hap_update_cr3(struct vcpu *v, int do_locking, bool noflush)
> >      hvm_update_guest_cr3(v, noflush);
> >  }
> >  
> > +/*
> > + * NB: doesn't actually perform any flush, used just to clear the CPU from the
> > + * mask and hence signal that the guest TLB flush has been done.
> > + */
> 
> "has been done" isn't correct, as the flush may happen only later
> on (upon next VM entry). I think wording here needs to be as
> precise as possible, however the comment may turn out unnecessary
> altogether:

What about:

/*
 * NB: Dummy function exclusively used as a way to trigger a vmexit,
 * and thus force an ASID/VPID update on vmentry (which will flush the
 * guest TLB).
 */
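
To make the intent concrete, a rough sketch of how the dummy handler could
look, assuming it also clears the calling pCPU from the mask so the initiator
knows the vmexit has been taken (sketch only, not the exact patch code):

static void handle_flush(void *data)
{
    cpumask_t *mask = data;

    /*
     * No flush is performed here: taking the interrupt forces a vmexit,
     * and the ASID/VPID update on the next vmentry does the actual guest
     * TLB flush.  Clearing the bit merely signals to the initiator that
     * this pCPU has taken the vmexit.
     */
    cpumask_clear_cpu(smp_processor_id(), mask);
}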

> > @@ -705,20 +701,22 @@ bool hap_flush_tlb(bool (*flush_vcpu)(void *ctxt, struct vcpu *v),
> >          if ( !flush_vcpu(ctxt, v) )
> >              continue;
> >  
> > -        paging_update_cr3(v, false);
> > +        hvm_asid_flush_vcpu(v);
> >  
> >          cpu = read_atomic(&v->dirty_cpu);
> > -        if ( is_vcpu_dirty_cpu(cpu) )
> > +        if ( cpu != this_cpu && is_vcpu_dirty_cpu(cpu) )
> >              __cpumask_set_cpu(cpu, mask);
> >      }
> >  
> > -    /* Flush TLBs on all CPUs with dirty vcpu state. */
> > -    flush_tlb_mask(mask);
> > -
> > -    /* Done. */
> > -    for_each_vcpu ( d, v )
> > -        if ( v != current && flush_vcpu(ctxt, v) )
> > -            vcpu_unpause(v);
> > +    /*
> > +     * Trigger a vmexit on all pCPUs with dirty vCPU state in order to force an
> > +     * ASID/VPID change and hence accomplish a guest TLB flush. Note that vCPUs
> > +     * not currently running will already be flushed when scheduled because of
> > +     * the ASID tickle done in the loop above.
> > +     */
> > +    on_selected_cpus(mask, handle_flush, mask, 0);
> > +    while ( !cpumask_empty(mask) )
> > +        cpu_relax();
> 
> Why do you pass 0 as the last argument? If you passed 1, you wouldn't
> need the while() here, handle_flush() could be empty, and you'd spare
> a perhaps large number of CPUs from all trying to modify two cache
> lines (the one used by on_selected_cpus() itself and the one here)
> instead of just one. Yes, the latency from the last CPU clearing its
> bit to you being able to exit from here may be a little larger this
> way, but overall latency may shrink with the number of atomic ops
> issued to the bus cut in half.

In fact I think passing 0 as the last argument is fine, and
handle_flush can be empty in that case anyway. We don't really care
whether on_selected_cpus returns before all CPUs have executed the
dummy function, as long as all of them have taken a vmexit. Using 0
already guarantees that AFAICT.
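
For clarity, the two call shapes being debated look roughly like this
(sketch only; handle_flush and mask are the names from the quoted patch):

    /*
     * As in the quoted patch: wait == 0, completion tracked via the mask
     * that handle_flush() clears on each target pCPU.
     */
    on_selected_cpus(mask, handle_flush, mask, 0);
    while ( !cpumask_empty(mask) )
        cpu_relax();

    /*
     * Jan's suggestion: wait == 1, on_selected_cpus() itself waits for
     * every target pCPU to have run the (then empty) handler, so the
     * targets don't need to write a separate completion mask.
     */
    on_selected_cpus(mask, handle_flush, NULL, 1);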

> (Forcing an empty function to be called may seem odd, but sending
>  just some IPI [like the event check one] wouldn't be enough, as
>  you want to be sure the other side has actually received the
>  request.)

Exactly, that's the reason for the empty function.

Thanks, Roger.
