
Re: [PATCH 1/3] x86/mem_sharing: option to skip populating special pages during fork



On Thu, Mar 24, 2022 at 10:50 AM Roger Pau Monné <roger.pau@xxxxxxxxxx> wrote:
>
> On Tue, Mar 22, 2022 at 01:41:37PM -0400, Tamas K Lengyel wrote:
> > Add option to the fork memop to skip populating the fork with special pages.
> > These special pages are only necessary when setting up forks to be fully
> > functional with a toolstack. For short-lived forks where no toolstack is
> > active these pages are unnecessary.
>
> I'm not sure those are strictly related to having a toolstack. For
> example the vcpu_info has nothing to do with having a toolstack, and
> is only used by the guest in order to receive events or fetch time
> related data. So while a short-lived fork might not make use of those,
> that has nothing to do with having a toolstack or not.

Fair enough, the point is that the short-lived fork doesn't use these pages.

> >
> > Signed-off-by: Tamas K Lengyel <tamas.lengyel@xxxxxxxxx>
> > ---
> >  xen/arch/x86/include/asm/hvm/domain.h |  4 +++-
> >  xen/arch/x86/mm/mem_sharing.c         | 33 +++++++++++++++++----------
> >  xen/include/public/memory.h           |  4 ++--
> >  3 files changed, 26 insertions(+), 15 deletions(-)
> >
> > diff --git a/xen/arch/x86/include/asm/hvm/domain.h b/xen/arch/x86/include/asm/hvm/domain.h
> > index 698455444e..446cd06411 100644
> > --- a/xen/arch/x86/include/asm/hvm/domain.h
> > +++ b/xen/arch/x86/include/asm/hvm/domain.h
> > @@ -31,7 +31,9 @@
> >  #ifdef CONFIG_MEM_SHARING
> >  struct mem_sharing_domain
> >  {
> > -    bool enabled, block_interrupts;
> > +    bool enabled;
> > +    bool block_interrupts;
> > +    bool skip_special_pages;
> >
> >      /*
> >       * When releasing shared gfn's in a preemptible manner, recall where
> > diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
> > index 15e6a7ed81..84c04ddfa3 100644
> > --- a/xen/arch/x86/mm/mem_sharing.c
> > +++ b/xen/arch/x86/mm/mem_sharing.c
> > @@ -1643,7 +1643,8 @@ static int bring_up_vcpus(struct domain *cd, struct domain *d)
> >      return 0;
> >  }
> >
> > -static int copy_vcpu_settings(struct domain *cd, const struct domain *d)
> > +static int copy_vcpu_settings(struct domain *cd, const struct domain *d,
> > +                              bool skip_special_pages)
> >  {
> >      unsigned int i;
> >      struct p2m_domain *p2m = p2m_get_hostp2m(cd);
> > @@ -1660,7 +1661,7 @@ static int copy_vcpu_settings(struct domain *cd, const struct domain *d)
> >
> >          /* Copy & map in the vcpu_info page if the guest uses one */
> >          vcpu_info_mfn = d_vcpu->vcpu_info_mfn;
> > -        if ( !mfn_eq(vcpu_info_mfn, INVALID_MFN) )
> > +        if ( !skip_special_pages && !mfn_eq(vcpu_info_mfn, INVALID_MFN) )
> >          {
> >              mfn_t new_vcpu_info_mfn = cd_vcpu->vcpu_info_mfn;
> >
> > @@ -1807,17 +1808,18 @@ static int copy_special_pages(struct domain *cd, struct domain *d)
> >      return 0;
> >  }
> >
> > -static int copy_settings(struct domain *cd, struct domain *d)
> > +static int copy_settings(struct domain *cd, struct domain *d,
> > +                         bool skip_special_pages)
> >  {
> >      int rc;
> >
> > -    if ( (rc = copy_vcpu_settings(cd, d)) )
> > +    if ( (rc = copy_vcpu_settings(cd, d, skip_special_pages)) )
> >          return rc;
> >
> >      if ( (rc = hvm_copy_context_and_params(cd, d)) )
> >          return rc;
> >
> > -    if ( (rc = copy_special_pages(cd, d)) )
> > +    if ( !skip_special_pages && (rc = copy_special_pages(cd, d)) )
> >          return rc;
> >
> >      copy_tsc(cd, d);
> > @@ -1826,9 +1828,11 @@ static int copy_settings(struct domain *cd, struct domain *d)
> >      return rc;
> >  }
> >
> > -static int fork(struct domain *cd, struct domain *d)
> > +static int fork(struct domain *cd, struct domain *d, uint16_t flags)
> >  {
> >      int rc = -EBUSY;
> > +    bool block_interrupts = flags & XENMEM_FORK_BLOCK_INTERRUPTS;
> > +    bool skip_special_pages = flags & XENMEM_FORK_SKIP_SPECIAL_PAGES;
> >
> >      if ( !cd->controller_pause_count )
> >          return rc;
> > @@ -1856,7 +1860,13 @@ static int fork(struct domain *cd, struct domain *d)
> >      if ( (rc = bring_up_vcpus(cd, d)) )
> >          goto done;
> >
> > -    rc = copy_settings(cd, d);
> > +    if ( !(rc = copy_settings(cd, d, skip_special_pages)) )
>
> Can you set
> cd->arch.hvm.mem_sharing.{block_interrupts,skip_special_pages} earlier
> so that you don't need to pass the options around to copy_settings and
> copy_vcpu_settings?

That would be possible, yes, but then we would have to clear them in case
the forking failed at any point. Setting them only at the end, once the
fork has finished, ensures that those fields are only ever valid if the VM
is a fork. Both are valid approaches and I prefer this one over the other.
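
To illustrate the tradeoff, setting them up front would look roughly like
this (just a sketch using the names from this patch, with the fork steps
and error paths condensed):

    static int fork(struct domain *cd, struct domain *d, uint16_t flags)
    {
        int rc = -EBUSY;

        /* Set the flags before any of the fork work is done... */
        cd->arch.hvm.mem_sharing.block_interrupts =
            flags & XENMEM_FORK_BLOCK_INTERRUPTS;
        cd->arch.hvm.mem_sharing.skip_special_pages =
            flags & XENMEM_FORK_SKIP_SPECIAL_PAGES;

        /* ... bring_up_vcpus(), copy_settings(), etc. ... */

     done:
        if ( rc )
        {
            /* ...and remember to clear them on every failure path. */
            cd->arch.hvm.mem_sharing.block_interrupts = false;
            cd->arch.hvm.mem_sharing.skip_special_pages = false;
        }

        return rc;
    }

With the version in the patch the fields are only written after
copy_settings() has succeeded, so they can never be observed set on a
domain that isn't a completed fork.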

>
> > +    {
> > +        cd->arch.hvm.mem_sharing.block_interrupts = block_interrupts;
> > +        cd->arch.hvm.mem_sharing.skip_special_pages = skip_special_pages;
> > +        /* skip mapping the vAPIC page on unpause if skipping special pages */
> > +        cd->creation_finished = skip_special_pages;
>
> I think this is dangerous. While it might be true at the moment that
> the arch_domain_creation_finished only maps the vAPIC page, there's no
> guarantee it couldn't do other stuff in the future that could be
> required for the VM to be started.

I understand this domain_creation_finished route could be expanded in
the future to include other stuff, but IMHO we can evaluate what to do
about that when and if it becomes necessary. This is all experimental
to begin with.

> Does it add much overhead to map the vAPIC page?

I don't have numbers but it does add overhead. When we do a fork reset
we loop through all pages in the physmap to determine what needs to be
removed. So having an extra page means that loop is always larger than
it actually needs to be. Considering we do the reset thousands of
times per second per core, you can imagine it adding up over time.
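
Roughly speaking the reset has to do something like the following (just a
paraphrase from memory, the exact macro and list names in mem_sharing.c may
differ):

    /*
     * Sketch of the fork reset loop: every page the fork has populated
     * since the last reset is walked and handed back.
     */
    page_list_for_each_safe ( page, tmp, &cd->page_list )
    {
        /* ... skip pages that don't need to be dropped ... */
        /* ... remove the page from the fork's p2m and free it ... */
    }

so every special page populated at fork time is one more entry this loop
has to visit on every single reset.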

Tamas



 

