
Re: [PATCH 04/12] libxenguest: avoid allocating unused deferred-pages bitmap


  • To: Jan Beulich <jbeulich@xxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Fri, 25 Jun 2021 19:08:54 +0100
  • Cc: Wei Liu <wl@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Juergen Gross <jgross@xxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Ian Jackson <iwj@xxxxxxxxxxxxxx>
  • Delivery-date: Fri, 25 Jun 2021 18:09:16 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 25/06/2021 14:19, Jan Beulich wrote:
> Like for the dirty bitmap, it is unnecessary to allocate the deferred-
> pages bitmap when all that's ever going to happen is a single all-dirty
> run.
>
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> ---
> The clearing of the bitmap at the end of suspend_and_send_dirty() also
> looks unnecessary - am I overlooking anything?

Yes. Remus and COLO.  You don't want to accumulate successfully-sent
deferred pages over checkpoints, otherwise you'll eventually be sending
the entire VM every checkpoint.
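
For context on why that clear is load-bearing, a sketch of the
checkpoint loop (structure illustrative only, not verbatim libxenguest
code; checkpoint_ack() is a made-up stand-in for the Remus/COLO
back-channel):

    for ( ;; )
    {
        /* suspend_and_send_dirty() sends the dirty set plus anything
         * previously deferred, then clears deferred_pages.  Drop that
         * clear and every pfn ever deferred stays set, getting re-sent
         * on all subsequent checkpoints. */
        rc = suspend_and_send_dirty(ctx);
        if ( rc )
            goto err;

        rc = checkpoint_ack(ctx);   /* hypothetical back-channel helper */
        if ( rc )
            goto err;
    }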


Answering out of patch order...
> @@ -791,24 +797,31 @@ static int setup(struct xc_sr_context *c
>  {
>      xc_interface *xch = ctx->xch;
>      int rc;
> -    DECLARE_HYPERCALL_BUFFER_SHADOW(unsigned long, dirty_bitmap,
> -                                    &ctx->save.dirty_bitmap_hbuf);
>  
>      rc = ctx->save.ops.setup(ctx);
>      if ( rc )
>          goto err;
>  
> -    dirty_bitmap = ctx->save.live || ctx->stream_type != XC_STREAM_PLAIN
> -        ? xc_hypercall_buffer_alloc_pages(
> -              xch, dirty_bitmap, NRPAGES(bitmap_size(ctx->save.p2m_size)))
> -        : (void *)-1L;
> +    if ( ctx->save.live || ctx->stream_type != XC_STREAM_PLAIN )
> +    {
> +        DECLARE_HYPERCALL_BUFFER_SHADOW(unsigned long, dirty_bitmap,
> +                                        &ctx->save.dirty_bitmap_hbuf);
> +
> +        dirty_bitmap =
> +            xc_hypercall_buffer_alloc_pages(
> +                xch, dirty_bitmap, NRPAGES(bitmap_size(ctx->save.p2m_size)));
> +        ctx->save.deferred_pages = bitmap_alloc(ctx->save.p2m_size);
> +
> +        if ( !dirty_bitmap || !ctx->save.deferred_pages )
> +            goto enomem;
> +    }

So this is better than the previous patch.  At least we've got a clean
NULL pointer now.

I could in principle get on board with the optimisation, except it's
not safe (see below).

> --- a/tools/libs/guest/xg_sr_save.c
> +++ b/tools/libs/guest/xg_sr_save.c
> @@ -130,7 +130,7 @@ static int write_batch(struct xc_sr_cont
>                                                        ctx->save.batch_pfns[i]);
>  
>          /* Likely a ballooned page. */
> -        if ( mfns[i] == INVALID_MFN )
> +        if ( mfns[i] == INVALID_MFN && ctx->save.deferred_pages )
>          {
>              set_bit(ctx->save.batch_pfns[i], ctx->save.deferred_pages);
>              ++ctx->save.nr_deferred_pages;
> @@ -196,8 +196,12 @@ static int write_batch(struct xc_sr_cont
>              {
>                  if ( rc == -1 && errno == EAGAIN )
>                  {
> -                    set_bit(ctx->save.batch_pfns[i], ctx->save.deferred_pages);
> -                    ++ctx->save.nr_deferred_pages;
> +                    if ( ctx->save.deferred_pages )
> +                    {
> +                        set_bit(ctx->save.batch_pfns[i],
> +                                ctx->save.deferred_pages);
> +                        ++ctx->save.nr_deferred_pages;
> +                    }

These two blocks are the only two which modify deferred_pages.

It occurs to me that this means deferred_pages is PV-only, because of
the stub implementations of x86_hvm_pfn_to_gfn() and
x86_hvm_normalise_page().  Furthermore, this is likely to be true for
any HVM-like domain even on other architectures.
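
From memory, those stubs are roughly (paraphrased, not verbatim
xg_sr_save_x86_hvm.c):

    static xen_pfn_t x86_hvm_pfn_to_gfn(const struct xc_sr_context *ctx,
                                        xen_pfn_t pfn)
    {
        /* Identity map - never returns INVALID_MFN, so the first
         * set_bit() site above is unreachable for HVM. */
        return pfn;
    }

    static int x86_hvm_normalise_page(struct xc_sr_context *ctx,
                                      xen_pfn_t type, void **page)
    {
        /* No-op - never fails with EAGAIN, so the second set_bit()
         * site above is unreachable for HVM too. */
        return 0;
    }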

If these instead were hard errors when !deferred_pages, that would at
least get the logic into an acceptable state.
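
i.e. something along these lines in write_batch() (sketch only):

    /* Likely a ballooned page. */
    if ( mfns[i] == INVALID_MFN )
    {
        if ( !ctx->save.deferred_pages )
        {
            /* Deferring is impossible - fail hard rather than lose data. */
            ERROR("Cannot defer pfn %#lx without a deferred_pages bitmap",
                  (unsigned long)ctx->save.batch_pfns[i]);
            goto err;
        }
        set_bit(ctx->save.batch_pfns[i], ctx->save.deferred_pages);
        ++ctx->save.nr_deferred_pages;
    }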

However, the first hunk demonstrates that deferred_pages gets used even
in the non-live case.  In particular, it is sensitive to errors in the
guest's handling of its own P2M.  Also, I can't spot anything obvious
which will correctly fail the migration if deferred pages survive the
final iteration.
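
Roughly the kind of check I'd expect at the end of the final iteration
(sketch only; not present in the code as reviewed):

    /* Anything still deferred after the final pass is data we have
     * failed to send, so the migration must not succeed silently. */
    if ( ctx->save.nr_deferred_pages )
    {
        ERROR("%lu pfns still deferred after the final iteration",
              (unsigned long)ctx->save.nr_deferred_pages);
        return -1;
    }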

~Andrew
