
[Xen-devel] Re: [PATCH] libxl: do slow resume after failed migration attempt



On Wed, 2011-02-16 at 11:47 +0000, Ian Campbell wrote:
> # HG changeset patch
> # User Ian Campbell <ian.campbell@xxxxxxxxxx>
> # Date 1297856874 0
> # Node ID 1728ed4bbec9e82ca13c2639c8e4ef8b4dc231b6
> # Parent  aa466613328f5de78fdfc968473cb06e948c1f5d
> libxl: do slow resume after failed migration attempt
> 
> both of the current callers for libxl_domain_resume are calling after
> a migration has failed, one is failure to suspend on the sender and
> the other is failure to start on the destination, both leading to a
> resume attempt on the sender.
> 
> However in the first case, failure to suspend, there is no guarantee
> that the guest has made it as far as the suspend hypercall and
> therefore the fast resume method, which frobs the hypercall return to
> indicate a cancelled suspend, cannot safely be used since it will
> corrupt %eax/%rax.
> 
> For the second case, failure to start on destination, I don't think it
> really matters if the resume is fast or slow.
> 
> Therefore always use the slow/uncooperative version of xc_domain_resume from
> libxl_domain_resume.
> 
> This makes a PV domain which failed to suspend (e.g. because the core
> Linux PM infrastructure within the guest didn't allow it) recover
> gracefully.

A PVHVM domain never suffered from this because libxl_domain_resume
bails out early on a libxl__domain_is_hvm check. I'm not 100% sure
whether that is correct, but I didn't change it. My test with a PVHVM
guest which acknowledges the suspend but then doesn't go on to do
anything seems to work.
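
For anyone who wants the PV failure mode from the changelog spelled out,
here is a toy model in C. It is only a sketch and does not use libxc: the
struct, the function and the constant are hypothetical names made up for
illustration, and the value 1 for "suspend cancelled" is an assumption.
The model captures just one thing: the fast path rewrites the return
register unconditionally, so a guest which never reached the suspend
hypercall has live data in %rax clobbered, while the slow path leaves the
register alone.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value the fast path writes to signal "suspend cancelled". */
#define TOY_SUSPEND_CANCELLED 1u

/* Hypothetical stand-in for a PV guest vCPU; not a real Xen structure. */
struct toy_vcpu {
    uint64_t rax;           /* current contents of the guest's %rax */
    bool in_suspend_hcall;  /* true once the guest issued the suspend hypercall */
};

/*
 * Model of resuming a guest and return the resulting %rax.
 * fast != 0: overwrite the return register, regardless of whether the
 *            guest is actually waiting inside the suspend hypercall.
 * fast == 0: leave the register untouched (the real slow path does more
 *            work than this, which the model deliberately ignores).
 */
static uint64_t toy_resume(struct toy_vcpu v, int fast)
{
    if (fast)
        v.rax = TOY_SUSPEND_CANCELLED;
    return v.rax;
}
```

With a guest that has not reached the hypercall and holds, say, 0x1234 in
%rax, toy_resume(v, 1) hands back the cancellation value instead of the
guest's data, whereas toy_resume(v, 0) hands back 0x1234 unchanged.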

Ian.

> 
> Signed-off-by: Ian Campbell <ian.campbell@xxxxxxxxxx>
> 
> diff -r aa466613328f -r 1728ed4bbec9 tools/libxl/libxl.c
> --- a/tools/libxl/libxl.c     Tue Feb 15 13:40:50 2011 +0000
> +++ b/tools/libxl/libxl.c     Wed Feb 16 11:47:54 2011 +0000
> @@ -226,7 +226,7 @@ int libxl_domain_resume(libxl_ctx *ctx, 
>          rc = ERROR_NI;
>          goto out;
>      }
> -    if (xc_domain_resume(ctx->xch, domid, 1)) {
> +    if (xc_domain_resume(ctx->xch, domid, 0)) {
>          LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, 
>                          "xc_domain_resume failed for domain %u", 
>                          domid);



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

