[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [RFC 16/16] xen/arm: Track page accessed between batch of Set/Way operations



On Mon, 8 Oct 2018, Julien Grall wrote:
> At the moment, the implementation of Set/Way operations will go through
> all the entries of the guest P2M and flush them. However, this is very
> expensive and may render unusable a guest OS using them.
> 
> For instance, Linux 32-bit will use Set/Way operations during secondary
> CPU bring-up. As the implementation is really expensive, it may be possible
> to hit the CPU bring-up timeout.
> 
> To limit the Set/Way impact, we track what pages has been of the guest
> has been accessed between batch of Set/Way operations. This is done
> using bit[0] (aka valid bit) of the P2M entry.

This is going to improve performance of ill-mannered guests at the cost
of hurting performance of well-mannered guests. Is it really a good
trade-off? Should this behavior at least be configurable with a Xen
command line?


> This patch adds a new per-arch helper is introduced to perform actions just
> before the guest is first unpaused. This will be used to invalidate the
> P2M to track access from the start of the guest.
> 
> Signed-off-by: Julien Grall <julien.grall@xxxxxxx>
>
> ---
> 
> Cc: Stefano Stabellini <sstabellini@xxxxxxxxxx>
> Cc: Julien Grall <julien.grall@xxxxxxx>
> Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> Cc: George Dunlap <George.Dunlap@xxxxxxxxxxxxx>
> Cc: Ian Jackson <ian.jackson@xxxxxxxxxxxxx>
> Cc: Jan Beulich <jbeulich@xxxxxxxx>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
> Cc: Tim Deegan <tim@xxxxxxx>
> Cc: Wei Liu <wei.liu2@xxxxxxxxxx>
> ---
>  xen/arch/arm/domain.c       | 14 ++++++++++++++
>  xen/arch/arm/domain_build.c |  7 +++++++
>  xen/arch/arm/p2m.c          | 32 +++++++++++++++++++++++++++++++-
>  xen/arch/x86/domain.c       |  4 ++++
>  xen/common/domain.c         |  5 ++++-
>  xen/include/asm-arm/p2m.h   |  2 ++
>  xen/include/xen/domain.h    |  2 ++
>  7 files changed, 64 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> index feebbf5a92..f439f4657a 100644
> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -738,6 +738,20 @@ int arch_domain_soft_reset(struct domain *d)
>      return -ENOSYS;
>  }
>  
> +void arch_domain_creation_finished(struct domain *d)
> +{
> +    /*
> +     * To avoid flushing the whole guest RAM on the first Set/Way, we
> +     * invalidate the P2M to track what has been accessed.
> +     *
> +     * This is only turned when IOMMU is not used or the page-table are
> +     * not shared because bit[0] (e.g valid bit) unset will result
> +     * IOMMU fault that could be not fixed-up.
> +     */
> +    if ( !iommu_use_hap_pt(d) )
> +        p2m_invalidate_root(p2m_get_hostp2m(d));
> +}
> +
>  static int is_guest_pv32_psr(uint32_t psr)
>  {
>      switch (psr & PSR_MODE_MASK)
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index f552154e93..de96516faa 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -2249,6 +2249,13 @@ int __init construct_dom0(struct domain *d)
>      v->is_initialised = 1;
>      clear_bit(_VPF_down, &v->pause_flags);
>  
> +    /*
> +     * XXX: We probably want a command line option to invalidate or not
> +     * the P2M. This is because invalidating the P2M will not work with
> +     * IOMMU, however if this is not done there will be an impact.

Why can't we check on iommu_use_hap_pt(d) like in
arch_domain_creation_finished?

In any case, I agree it is a good idea to introduce a command line
parameter to toggle the p2m_invalidate_root call at domain creation
on/off. There are cases where it should be off even if an IOMMU is
present.

Aside from these two questions, the rest of the patch looks correct.


> +     */
> +    p2m_invalidate_root(p2m_get_hostp2m(d));
>
>      return 0;
>  }
>  
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index a3d4c563b1..8e0c31d7ac 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -1079,6 +1079,22 @@ static void p2m_invalidate_table(struct p2m_domain 
> *p2m, mfn_t mfn)
>      p2m->need_flush = true;
>  }
>  
> +/*
> + * Invalidate all entries in the root page-tables. This is
> + * useful to get fault on entry and do an action.
> + */
> +void p2m_invalidate_root(struct p2m_domain *p2m)
> +{
> +    unsigned int i;
> +
> +    p2m_write_lock(p2m);
> +
> +    for ( i = 0; i < P2M_ROOT_LEVEL; i++ )
> +        p2m_invalidate_table(p2m, page_to_mfn(p2m->root + i));
> +
> +    p2m_write_unlock(p2m);
> +}
> +
>  bool p2m_resolve_translation_fault(struct domain *d, gfn_t gfn)
>  {
>      struct p2m_domain *p2m = p2m_get_hostp2m(d);
> @@ -1539,7 +1555,8 @@ int p2m_cache_flush_range(struct domain *d, gfn_t 
> start, gfn_t end)
>  
>      for ( ; gfn_x(start) < gfn_x(end); start = next_gfn )
>      {
> -        mfn_t mfn = p2m_get_entry(p2m, start, &t, NULL, &order, NULL);
> +        bool valid;
> +        mfn_t mfn = p2m_get_entry(p2m, start, &t, NULL, &order, &valid);
>  
>          next_gfn = gfn_next_boundary(start, order);
>  
> @@ -1547,6 +1564,13 @@ int p2m_cache_flush_range(struct domain *d, gfn_t 
> start, gfn_t end)
>          if ( mfn_eq(mfn, INVALID_MFN) || !p2m_is_any_ram(t) )
>              continue;
>  
> +        /*
> +         * Page with valid bit (bit [0]) unset does not need to be
> +         * cleaned
> +         */
> +        if ( !valid )
> +            continue;
> +
>          /* XXX: Implement preemption */
>          while ( gfn_x(start) < gfn_x(next_gfn) )
>          {
> @@ -1571,6 +1595,12 @@ static void p2m_flush_vm(struct vcpu *v)
>      /* XXX: Handle preemption */
>      p2m_cache_flush_range(v->domain, p2m->lowest_mapped_gfn,
>                            p2m->max_mapped_gfn);
> +
> +    /*
> +     * Invalidate the p2m to track which page was modified by the guest
> +     * between call of p2m_flush_vm().
> +     */
> +    p2m_invalidate_root(p2m);
>  }
>  
>  /*
> diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
> index 9371efc8c7..2b6d1c01a1 100644
> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -723,6 +723,10 @@ int arch_domain_soft_reset(struct domain *d)
>      return ret;
>  }
>  
> +void arch_domain_creation_finished(struct domain *d)
> +{
> +}
> +
>  /*
>   * These are the masks of CR4 bits (subject to hardware availability) which a
>   * PV guest may not legitimiately attempt to modify.
> diff --git a/xen/common/domain.c b/xen/common/domain.c
> index 65151e2ac4..b402c635f9 100644
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -1100,8 +1100,11 @@ int domain_unpause_by_systemcontroller(struct domain 
> *d)
>       * Creation is considered finished when the controller reference count
>       * first drops to 0.
>       */
> -    if ( new == 0 )
> +    if ( new == 0 && !d->creation_finished )
> +    {
>          d->creation_finished = true;
> +        arch_domain_creation_finished(d);
> +    }
>  
>      domain_unpause(d);
>  
> diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
> index c470f062db..2a4652e7f4 100644
> --- a/xen/include/asm-arm/p2m.h
> +++ b/xen/include/asm-arm/p2m.h
> @@ -225,6 +225,8 @@ int p2m_set_entry(struct p2m_domain *p2m,
>                    p2m_type_t t,
>                    p2m_access_t a);
>  
> +void p2m_invalidate_root(struct p2m_domain *p2m);
> +
>  /*
>   * Clean & invalidate caches corresponding to a region [start,end) of guest
>   * address space.
> diff --git a/xen/include/xen/domain.h b/xen/include/xen/domain.h
> index 5e393fd7f2..8d95ad4bf1 100644
> --- a/xen/include/xen/domain.h
> +++ b/xen/include/xen/domain.h
> @@ -70,6 +70,8 @@ void arch_domain_unpause(struct domain *d);
>  
>  int arch_domain_soft_reset(struct domain *d);
>  
> +void arch_domain_creation_finished(struct domain *d);
> +
>  void arch_p2m_set_access_required(struct domain *d, bool access_required);
>  
>  int arch_set_info_guest(struct vcpu *, vcpu_guest_context_u);
> -- 
> 2.11.0
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.