
Re: [Xen-devel] [PATCH v2 3/3] x86/hyperv: L0 assisted TLB flush



On Fri, Feb 14, 2020 at 04:42:47PM +0000, Michael Kelley wrote:
> From: Wei Liu <wei.liu.xen@xxxxxxxxx> On Behalf Of Wei Liu Sent: Friday, February 14, 2020 4:35 AM
> > 
> > Implement L0 assisted TLB flush for Xen on Hyper-V. It takes advantage
> > of several hypercalls:
> > 
> >  * HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST
> >  * HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX
> >  * HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE
> >  * HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX
> > 
> > Pick the most efficient hypercalls available.
> > 
> > Signed-off-by: Wei Liu <liuwe@xxxxxxxxxxxxx>
> > ---
> > v2:
> > 1. Address Roger and Jan's comments re types etc.
> > 2. Fix pointer arithmetic.
> > 3. Misc improvement to code.
> > ---
> >  xen/arch/x86/guest/hyperv/Makefile  |   1 +
> >  xen/arch/x86/guest/hyperv/private.h |   9 ++
> >  xen/arch/x86/guest/hyperv/tlb.c     | 172 +++++++++++++++++++++++++++-
> >  xen/arch/x86/guest/hyperv/util.c    |  74 ++++++++++++
> >  4 files changed, 255 insertions(+), 1 deletion(-)
> >  create mode 100644 xen/arch/x86/guest/hyperv/util.c
> > 
> > diff --git a/xen/arch/x86/guest/hyperv/Makefile b/xen/arch/x86/guest/hyperv/Makefile
> > index 18902c33e9..0e39410968 100644
> > --- a/xen/arch/x86/guest/hyperv/Makefile
> > +++ b/xen/arch/x86/guest/hyperv/Makefile
> > @@ -1,2 +1,3 @@
> >  obj-y += hyperv.o
> >  obj-y += tlb.o
> > +obj-y += util.o
> > diff --git a/xen/arch/x86/guest/hyperv/private.h b/xen/arch/x86/guest/hyperv/private.h
> > index 509bedaafa..79a77930a0 100644
> > --- a/xen/arch/x86/guest/hyperv/private.h
> > +++ b/xen/arch/x86/guest/hyperv/private.h
> > @@ -24,12 +24,21 @@
> > 
> >  #include <xen/cpumask.h>
> >  #include <xen/percpu.h>
> > +#include <xen/types.h>
> > 
> >  DECLARE_PER_CPU(void *, hv_input_page);
> >  DECLARE_PER_CPU(void *, hv_vp_assist);
> >  DECLARE_PER_CPU(unsigned int, hv_vp_index);
> > 
> > +static inline unsigned int hv_vp_index(unsigned int cpu)
> > +{
> > +    return per_cpu(hv_vp_index, cpu);
> > +}
> > +
> >  int hyperv_flush_tlb(const cpumask_t *mask, const void *va,
> >                       unsigned int flags);
> > 
> > +/* Returns number of banks, -ev if error */
> > +int cpumask_to_vpset(struct hv_vpset *vpset, const cpumask_t *mask);
> > +
> >  #endif /* __XEN_HYPERV_PRIVIATE_H__  */
> > diff --git a/xen/arch/x86/guest/hyperv/tlb.c b/xen/arch/x86/guest/hyperv/tlb.c
> > index 48f527229e..f68e14f151 100644
> > --- a/xen/arch/x86/guest/hyperv/tlb.c
> > +++ b/xen/arch/x86/guest/hyperv/tlb.c
> > @@ -19,15 +19,185 @@
> >   * Copyright (c) 2020 Microsoft.
> >   */
> > 
> > +#include <xen/cpu.h>
> >  #include <xen/cpumask.h>
> >  #include <xen/errno.h>
> > 
> > +#include <asm/guest/hyperv.h>
> > +#include <asm/guest/hyperv-hcall.h>
> > +#include <asm/guest/hyperv-tlfs.h>
> > +
> >  #include "private.h"
> > 
> > +/*
> > + * It is possible to encode up to 4096 pages using the lower 12 bits
> > + * in an element of gva_list
> > + */
> > +#define HV_TLB_FLUSH_UNIT (4096 * PAGE_SIZE)
> > +
> > +static unsigned int fill_gva_list(uint64_t *gva_list, const void *va,
> > +                                  unsigned int order)
> > +{
> > +    unsigned long start = (unsigned long)va;
> > +    unsigned long end = start + (PAGE_SIZE << order) - 1;
> > +    unsigned int n = 0;
> > +
> > +    do {
> > +        unsigned long remain = end - start;
> 
> The calculated value here isn't actually the remaining bytes in the
> range to flush -- it's one less than the remaining bytes in the range
> to flush because of the -1 in the calculation of 'end'.   That difference
> will mess up the comparison below against HV_TLB_FLUSH_UNIT
> in the case that there are exactly 4096 pages remaining to be
> flushed.  It should take the "=" case, but won't.  Also, the
> '-1' in 'remain - 1' in the else clause becomes unneeded, and
> the 'start = end' assignment then propagates the error.
> 
> In the parallel code in Linux, if you follow the call sequence to get to
> fill_gva_list(), the 'end' argument is really the address of the first byte
> of the first page that isn't in the flush range (i.e., one beyond the true
> 'end') and so is a bit misnamed.
> 
> I think the calculation of 'end' should drop the -1, and perhaps 'end'
> should be renamed.

Thanks for the detailed review. Let me fix this.

Wei.

> 
> Michael
> 
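
For reference, a rough sketch of what the loop could look like with an
exclusive upper bound, along the lines Michael suggests. This is not the
code from the patch or from any later revision: the name 'limit', the
alignment assert, and the local PAGE_* definitions are assumptions made
so the fragment stands alone, and the loop body is modelled on the
description of the Linux helper above rather than copied from it.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for the hypervisor's definitions (assumed 4 KiB pages). */
#define PAGE_SHIFT 12
#define PAGE_SIZE ((uint64_t)1 << PAGE_SHIFT)
#define PAGE_MASK (~(PAGE_SIZE - 1))

/* The low 12 bits of an entry encode up to 4095 additional pages. */
#define HV_TLB_FLUSH_UNIT (4096 * PAGE_SIZE)

/*
 * Fill gva_list for the range [va, va + (PAGE_SIZE << order)).
 * 'limit' (hypothetical name) is the first byte past the range, so
 * 'remain' is the true number of bytes left to flush.
 * Returns the number of entries written.
 */
static unsigned int fill_gva_list(uint64_t *gva_list, const void *va,
                                  unsigned int order)
{
    uint64_t cur = (uint64_t)(uintptr_t)va;
    uint64_t limit = cur + (PAGE_SIZE << order);
    unsigned int n = 0;

    /* Assumption for this sketch: the caller passes a page-aligned va. */
    assert((cur & ~PAGE_MASK) == 0);

    do {
        uint64_t remain = limit - cur;

        gva_list[n] = cur & PAGE_MASK;

        if ( remain >= HV_TLB_FLUSH_UNIT )
        {
            /* Full unit: this page plus 4095 more. */
            gva_list[n] |= ~PAGE_MASK;
            cur += HV_TLB_FLUSH_UNIT;
        }
        else
        {
            /* Partial unit: (pages remaining - 1) additional pages. */
            gva_list[n] |= (remain - 1) >> PAGE_SHIFT;
            cur = limit;
        }

        n++;
    } while ( cur < limit );

    return n;
}
```

With the bound exclusive, a range of exactly 4096 pages takes the ">="
branch and is covered by a single entry, and the 'remain - 1' in the
else branch is there only to convert a page count into the "additional
pages" encoding.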
