
LAST_CHECKPOINT and save/restore/migrate compatibility (was Re: [Xen-devel] [PATCH 3 of 4] libxc: provide notification of final checkpoint to restore end)


  • To: Ian Campbell <ian.campbell@xxxxxxxxxx>
  • From: George Dunlap <George.Dunlap@xxxxxxxxxxxxx>
  • Date: Fri, 1 Apr 2011 12:02:28 +0100
  • Cc: Shriram Rajagopalan <rshriram@xxxxxxxxx>, Brendan Cully <brendan@xxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxx
  • Delivery-date: Fri, 01 Apr 2011 04:04:15 -0700
  • Domainkey-signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:date:x-google-sender-auth:message-id:subject :from:to:cc:content-type:content-transfer-encoding; b=GUeUX6oAWa2Y3WAnNqILlmsAuAdk9xviYd1unTrRybwuYOPkZZEnpleyIIXRIQXy2p wpqK9p/H8ECkffY+l/+XTYn86RNnwnFE1NSFHSC/Vd2ZSPn5mn6Cf9/6RQ7TOQZU2IVx S2IdYekol9+ZsshPF1fRo63gc+OLsAo/X4+is=
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

On Mon, Sep 6, 2010 at 11:03 AM, Ian Campbell <ian.campbell@xxxxxxxxxx> wrote:
> # HG changeset patch
> # User Ian Campbell <ian.campbell@xxxxxxxxxx>
> # Date 1283766891 -3600
> # Node ID bdf8ce09160d715451e1204babe5f80886ea6183
> # Parent  5f96f36feebdb87eaadbbcab0399f32eda86f735
> libxc: provide notification of final checkpoint to restore end
>
> When the restore code sees this notification it will restore the
> currently in-progress checkpoint when it completes.
>
> This allows the restore end to finish up without waiting for a
> spurious timeout on the receive fd and thereby avoids unnecessary
> error logging in the case of a successful migration or restore.
>
> In the normal migration or restore case the first checkpoint is always
> the last. For a rolling checkpoint (such as Remus) the notification is
> currently unused but could be used in the future for example to
> provide a controlled failover for reasons other than error

Unfortunately, this introduces a backwards-compatibility problem,
since older versions of the tools never send LAST_CHECKPOINT.

Would it make sense to invert the logic on this -- i.e., default to
only one checkpoint, and have the sender send "MORE_CHECKPOINTS" when
it expects the current checkpoint *not* to be the last (either by
expecting the checkpoint() function to do it, or by sending it from
xc_domain_save)?
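
Roughly, I'm picturing something like this on the save side.  (The
MORE_CHECKPOINTS name, value and placement below are just placeholders
for illustration, not anything that exists in the tree today.)

    /* Sketch only: a made-up ID, which would live next to the other
     * XC_SAVE_ID_* definitions in xg_save_restore.h. */
    #define XC_SAVE_ID_MORE_CHECKPOINTS   -9  /* placeholder value */

    /* In xc_domain_save(): only a checkpointed (Remus-style) save
     * announces that further checkpoints will follow; a plain
     * save/migrate sends nothing extra, so an older receiver that
     * has never seen this chunk keeps working unchanged. */
    if ( callbacks->checkpoint )
    {
        i = XC_SAVE_ID_MORE_CHECKPOINTS;
        if ( wrexact(io_fd, &i, sizeof(int)) )
        {
            PERROR("Error when writing more-checkpoints chunk");
            goto out;
        }
    }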

Now that 4.1 is out, we have to consider version compatibility.  But
we should be able to make it so that there would only be a problem if:
1) You're running Remus, and
2) One of your toolstacks is 4.1.0 and one is not.

That seems like it shouldn't be too bad.
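
The restore side would then default to treating the stream as a single
checkpoint, which is what a non-Remus sender produces anyway.  Again
just a sketch -- the more_checkpoints field is made up, mirroring the
patch's last_checkpoint flag with the sense inverted:

    /* In pagebuf_get_one(): note the announcement and keep reading. */
    case XC_SAVE_ID_MORE_CHECKPOINTS:
        ctx->more_checkpoints = 1;
        return pagebuf_get_one(xch, ctx, buf, fd, dom);

    /* In xc_domain_restore(), once a consistent image is available:
     * only shift into nonblocking mode and keep buffering if more
     * checkpoints were announced; otherwise finish straight away,
     * which is also what happens for a stream from a sender that
     * never sends the chunk at all. */
    if ( !ctx->more_checkpoints )
        goto finish;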

Thoughts?

 -George

>
> Signed-off-by: Ian Campbell <ian.campbell@xxxxxxxxxx>
> Acked-by: Brendan Cully <brendan@xxxxxxxxx>
>
> diff -r 5f96f36feebd -r bdf8ce09160d tools/libxc/xc_domain_restore.c
> --- a/tools/libxc/xc_domain_restore.c   Mon Sep 06 10:54:51 2010 +0100
> +++ b/tools/libxc/xc_domain_restore.c   Mon Sep 06 10:54:51 2010 +0100
> @@ -42,6 +42,7 @@ struct restore_ctx {
>     xen_pfn_t *p2m; /* A table mapping each PFN to its new MFN. */
>     xen_pfn_t *p2m_batch; /* A table of P2M mappings in the current region. */
>     int completed; /* Set when a consistent image is available */
> +    int last_checkpoint; /* Set when we should commit to the current checkpoint when it completes. */
>     struct domain_info_context dinfo;
>  };
>
> @@ -765,6 +766,11 @@ static int pagebuf_get_one(xc_interface
>         // DPRINTF("console pfn location: %llx\n", buf->console_pfn);
>         return pagebuf_get_one(xch, ctx, buf, fd, dom);
>
> +    case XC_SAVE_ID_LAST_CHECKPOINT:
> +        ctx->last_checkpoint = 1;
> +        // DPRINTF("last checkpoint indication received");
> +        return pagebuf_get_one(xch, ctx, buf, fd, dom);
> +
>     default:
>         if ( (count > MAX_BATCH_SIZE) || (count < 0) ) {
>             ERROR("Max batch size exceeded (%d). Giving up.", count);
> @@ -1296,10 +1302,23 @@ int xc_domain_restore(xc_interface *xch,
>             goto out;
>         }
>         ctx->completed = 1;
> -        /* shift into nonblocking mode for the remainder */
> -        if ( (flags = fcntl(io_fd, F_GETFL,0)) < 0 )
> -            flags = 0;
> -        fcntl(io_fd, F_SETFL, flags | O_NONBLOCK);
> +
> +        /*
> +         * If more checkpoints are expected then shift into
> +         * nonblocking mode for the remainder.
> +         */
> +        if ( !ctx->last_checkpoint )
> +        {
> +            if ( (flags = fcntl(io_fd, F_GETFL,0)) < 0 )
> +                flags = 0;
> +            fcntl(io_fd, F_SETFL, flags | O_NONBLOCK);
> +        }
> +    }
> +
> +    if ( ctx->last_checkpoint )
> +    {
> +        // DPRINTF("Last checkpoint, finishing\n");
> +        goto finish;
>     }
>
>     // DPRINTF("Buffered checkpoint\n");
> diff -r 5f96f36feebd -r bdf8ce09160d tools/libxc/xc_domain_save.c
> --- a/tools/libxc/xc_domain_save.c      Mon Sep 06 10:54:51 2010 +0100
> +++ b/tools/libxc/xc_domain_save.c      Mon Sep 06 10:54:51 2010 +0100
> @@ -1616,6 +1616,20 @@ int xc_domain_save(xc_interface *xch, in
>         }
>     }
>
> +    if ( !callbacks->checkpoint )
> +    {
> +        /*
> +         * If this is not a checkpointed save then this must be the first and
> +         * last checkpoint.
> +         */
> +        i = XC_SAVE_ID_LAST_CHECKPOINT;
> +        if ( wrexact(io_fd, &i, sizeof(int)) )
> +        {
> +            PERROR("Error when writing last checkpoint chunk");
> +            goto out;
> +        }
> +    }
> +
>     /* Zero terminate */
>     i = 0;
>     if ( wrexact(io_fd, &i, sizeof(int)) )
> diff -r 5f96f36feebd -r bdf8ce09160d tools/libxc/xg_save_restore.h
> --- a/tools/libxc/xg_save_restore.h     Mon Sep 06 10:54:51 2010 +0100
> +++ b/tools/libxc/xg_save_restore.h     Mon Sep 06 10:54:51 2010 +0100
> @@ -131,6 +131,7 @@
>  #define XC_SAVE_ID_TMEM_EXTRA         -6
>  #define XC_SAVE_ID_TSC_INFO           -7
>  #define XC_SAVE_ID_HVM_CONSOLE_PFN    -8 /* (HVM-only) */
> +#define XC_SAVE_ID_LAST_CHECKPOINT    -9 /* Commit to restoring after completion of current iteration. */
>
>  /*
>  ** We process save/restore/migrate in batches of pages; the below
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel
>

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

