
Re: [Xen-devel] [PATCH v5 7/7] x86/tlb: use Xen L0 assisted TLB flush when available

To: Roger Pau Monne <roger.pau@xxxxxxxxxx>
From: Jan Beulich <jbeulich@xxxxxxxx>
Date: Fri, 28 Feb 2020 18:00:44 +0100
Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx, Wei Liu <wl@xxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
Delivery-date: Fri, 28 Feb 2020 17:00:40 +0000
List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 19.02.2020 18:43, Roger Pau Monne wrote:
> Use Xen's L0 HVMOP_flush_tlbs hypercall in order to perform flushes.
> This greatly increases the performance of TLB flushes when running
> with a high number of vCPUs as a Xen guest, and is especially important
> when running in shim mode.
> 
> The following figures are from a PV guest running `make -j32 xen` in
> shim mode with 32 vCPUs and HAP.
> 
> Using x2APIC and ALLBUT shorthand:
> real  4m35.973s
> user  4m35.110s
> sys   36m24.117s
> 
> Using L0 assisted flush:
> real    1m2.596s
> user    4m34.818s
> sys     5m16.374s
> 
> The implementation adds a new hook to hypervisor_ops so other
> enlightenments can also implement such assisted flush just by filling
> the hook. Note that the Xen implementation completely ignores the
> dirty CPU mask and the linear address passed in, and always performs a
> global TLB flush on all vCPUs.

This isn't an implementation choice of yours; it follows from how
HVMOP_flush_tlbs works, and I think the statement should say so. I
also think it wants clarifying whether using the hypercall is indeed
faster even for a single-page, single-CPU flush (which I suspect may
not be the case, especially as the vCPU count grows). The figures
above show a positive overall effect, but they don't say whether the
effect could be even bigger by being at least a little selective.
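
To illustrate what "a little selective" could mean, here is a rough
sketch, not a proposal for the patch. It is a hypothetical policy
helper that skips the hypercall when a single remote vCPU needs a
single-page flush. The name prefer_assisted_flush(), its parameters,
and the cut-off itself are all assumptions; whether such a cut-off
helps at all would need measuring.

```c
#include <stdbool.h>

/*
 * Hypothetical policy helper, not part of the patch under review.
 *
 * nr_remote: number of remote vCPUs in the flush mask
 * va_valid:  the caller supplied a linear address
 * order:     the flush covers 2^order pages
 */
static bool prefer_assisted_flush(unsigned int nr_remote, bool va_valid,
                                  unsigned int order)
{
    /* Nothing remote to flush: neither a hypercall nor an IPI is needed. */
    if ( nr_remote == 0 )
        return false;

    /*
     * Assumed cut-off: a single-page flush on one remote vCPU is the case
     * where a global flush of all vCPUs looks most wasteful.
     */
    if ( nr_remote == 1 && va_valid && order == 0 )
        return false;

    return true;
}
```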

> @@ -73,6 +74,15 @@ void __init hypervisor_e820_fixup(struct e820map *e820)
>          ops.e820_fixup(e820);
>  }
>  
> +int hypervisor_flush_tlb(const cpumask_t *mask, const void *va,
> +                         unsigned int order)
> +{
> +    if ( ops.flush_tlb )
> +        return alternative_call(ops.flush_tlb, mask, va, order);
> +
> +    return -ENOSYS;
> +}

Please, no new -ENOSYS anywhere (except in new ports' top-level
hypercall handlers).
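
For illustration only, the fallback could return a different error
code, for example -EOPNOTSUPP; that particular value is an assumption
here, not something settled in this thread. The sketch below is a
self-contained mock-up: cpumask_t and struct hypervisor_ops are
simplified stand-ins, stub_flush_tlb() is a hypothetical hook, and a
plain indirect call takes the place of alternative_call().

```c
#include <errno.h>
#include <stddef.h>

/* Simplified stand-in for Xen's cpumask_t (illustration only). */
typedef struct { unsigned long bits; } cpumask_t;

/* Simplified stand-in for the hook table the patch extends. */
struct hypervisor_ops {
    int (*flush_tlb)(const cpumask_t *mask, const void *va,
                     unsigned int order);
};

static struct hypervisor_ops ops;

/* Hypothetical hook: a real one would issue the hypercall. */
static int stub_flush_tlb(const cpumask_t *mask, const void *va,
                          unsigned int order)
{
    (void)mask; (void)va; (void)order;
    return 0;
}

int hypervisor_flush_tlb(const cpumask_t *mask, const void *va,
                         unsigned int order)
{
    if ( ops.flush_tlb )
        return ops.flush_tlb(mask, va, order); /* alternative_call() in Xen */

    /* Assumed replacement for -ENOSYS: no hook has been installed. */
    return -EOPNOTSUPP;
}
```

The caller in flush_area_mask() only checks for a zero return, so any
non-zero error code would still make it fall back to the IPI path.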

> @@ -256,6 +257,16 @@ void flush_area_mask(const cpumask_t *mask, const void *va, unsigned int flags)
>      if ( (flags & ~FLUSH_ORDER_MASK) &&
>           !cpumask_subset(mask, cpumask_of(cpu)) )
>      {
> +        if ( cpu_has_hypervisor &&
> +             !(flags & ~(FLUSH_TLB | FLUSH_TLB_GLOBAL | FLUSH_VA_VALID |
> +                         FLUSH_ORDER_MASK)) &&
> +             !hypervisor_flush_tlb(mask, va, (flags - 1) & FLUSH_ORDER_MASK) )
> +        {
> +            if ( tlb_clk_enabled )
> +                tlb_clk_enabled = false;

Why does this need doing here? Couldn't Xen guest setup code
clear the flag?
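
A rough sketch of that alternative: the guest setup path clears the
flag once, so the flush path never has to touch it. This is a mock-up
under assumptions; xen_guest_setup(), its assisted_flush_available
parameter, and assisted_flush() are hypothetical names, and
tlb_clk_enabled here is a local stand-in for the real variable.

```c
#include <stdbool.h>

/* Local stand-in for Xen's tlb_clk_enabled (illustration only). */
static bool tlb_clk_enabled = true;

/*
 * Hypothetical setup hook, run once at boot when running as a Xen guest.
 * If assisted flushes will be used, the TLB clock is disabled up front.
 */
static void xen_guest_setup(bool assisted_flush_available)
{
    if ( assisted_flush_available )
        tlb_clk_enabled = false;
}

/*
 * Stand-in for the assisted flush path: it reports success and leaves
 * tlb_clk_enabled alone, since setup has already dealt with it.
 */
static int assisted_flush(void)
{
    return 0;
}
```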

Jan
