
Re: [Xen-devel] [PATCH v5 7/7] x86/tlb: use Xen L0 assisted TLB flush when available


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Fri, 28 Feb 2020 18:23:51 +0100
  • Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx, Wei Liu <wl@xxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Delivery-date: Fri, 28 Feb 2020 17:24:03 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Fri, Feb 28, 2020 at 06:00:44PM +0100, Jan Beulich wrote:
> On 19.02.2020 18:43, Roger Pau Monne wrote:
> > Use Xen's L0 HVMOP_flush_tlbs hypercall in order to perform flushes.
> > This greatly increases the performance of TLB flushes when running
> > with a high number of vCPUs as a Xen guest, and is especially important
> > when running in shim mode.
> > 
> > The following figures are from a PV guest running `make -j32 xen` in
> > shim mode with 32 vCPUs and HAP.
> > 
> > Using x2APIC and ALLBUT shorthand:
> > real    4m35.973s
> > user    4m35.110s
> > sys     36m24.117s
> > 
> > Using L0 assisted flush:
> > real    1m2.596s
> > user    4m34.818s
> > sys     5m16.374s
> > 
> > The implementation adds a new hook to hypervisor_ops so other
> > enlightenments can also implement such assisted flush just by filling
> > the hook. Note that the Xen implementation completely ignores the
> > dirty CPU mask and the linear address passed in, and always performs a
> > global TLB flush on all vCPUs.
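
For reference, the Xen-side hook described above boils down to issuing
the hypercall; a minimal sketch, assuming the helper name
xen_hypercall_hvm_op from Xen's guest hypercall infrastructure:

    static int flush_tlb(const cpumask_t *mask, const void *va,
                         unsigned int order)
    {
        /*
         * HVMOP_flush_tlbs takes no argument and ignores mask, va and
         * order: L0 always performs a global TLB flush on all vCPUs.
         */
        return xen_hypercall_hvm_op(HVMOP_flush_tlbs, NULL);
    }
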
> 
> This isn't because of an implementation choice of yours, but because
> of how HVMOP_flush_tlbs works. I think the statement should somehow
> express this. I also think it wants clarifying that using the
> hypercall is indeed faster even in the case of single-page, single-
> CPU flush

Are you sure about this? I would expect taking a vmexit to be more
costly than executing a local invlpg.

> (which I suspect may not be the case especially as vCPU
> count grows). The stats above prove a positive overall effect, but
> they don't say whether the effect could be even bigger by being at
> least a little selective.

I assume that being able to provide a bitmap selecting the vCPUs on
which the TLB flush should be performed would gain some more
performance, but I haven't looked into it yet.
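
Purely as an illustration of the idea (hypothetical; no such interface
exists today, and all names below are made up), a selective flush
could take something like:

    /* Hypothetical argument structure for a selective L0 flush. */
    struct hvm_flush_vcpus {
        uint64_t vcpu_bitmap_gaddr; /* guest address of vCPU bitmap */
        uint32_t nr_vcpus;          /* number of bits in the bitmap */
        uint32_t flags;             /* e.g. global vs. per-VA flush */
    };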

> > @@ -73,6 +74,15 @@ void __init hypervisor_e820_fixup(struct e820map *e820)
> >          ops.e820_fixup(e820);
> >  }
> >  
> > +int hypervisor_flush_tlb(const cpumask_t *mask, const void *va,
> > +                         unsigned int order)
> > +{
> > +    if ( ops.flush_tlb )
> > +        return alternative_call(ops.flush_tlb, mask, va, order);
> > +
> > +    return -ENOSYS;
> > +}
> 
> Please no new -ENOSYS anywhere (except in new ports' top level
> hypercall handlers).

Ack. Is EOPNOTSUPP OK?
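
That would presumably make the wrapper:

    int hypervisor_flush_tlb(const cpumask_t *mask, const void *va,
                             unsigned int order)
    {
        if ( ops.flush_tlb )
            return alternative_call(ops.flush_tlb, mask, va, order);

        /* No assisted flush provided by the enlightenment. */
        return -EOPNOTSUPP;
    }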

> > @@ -256,6 +257,16 @@ void flush_area_mask(const cpumask_t *mask, const void *va, unsigned int flags)
> >      if ( (flags & ~FLUSH_ORDER_MASK) &&
> >           !cpumask_subset(mask, cpumask_of(cpu)) )
> >      {
> > +        if ( cpu_has_hypervisor &&
> > +             !(flags & ~(FLUSH_TLB | FLUSH_TLB_GLOBAL | FLUSH_VA_VALID |
> > +                         FLUSH_ORDER_MASK)) &&
> > +             !hypervisor_flush_tlb(mask, va, (flags - 1) & FLUSH_ORDER_MASK) )
> > +        {
> > +            if ( tlb_clk_enabled )
> > +                tlb_clk_enabled = false;
> 
> Why does this need doing here? Couldn't Xen guest setup code
> clear the flag?

Because it's possible that the hypercall fails, and hence the TLB
clock should be kept enabled. There's no reason to disable it until
Xen knows the assisted flush is indeed available.

I don't mind moving it to Xen guest setup code, but I'm not sure I see
why that would be any better than doing it here. The only reason I can
think of is to avoid checking tlb_clk_enabled on every successful
assisted flush?
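
For comparison, a setup-time version would presumably probe the
hypercall once and only then disable the clock; a sketch, with
placement and exact call being assumptions:

    /*
     * In Xen guest setup code: issue a probe flush, and disable the
     * TLB clock only if the assisted flush is known to work.
     */
    if ( !xen_hypercall_hvm_op(HVMOP_flush_tlbs, NULL) )
        tlb_clk_enabled = false;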

Thanks, Roger.


 

