WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
 
   
 

xen-devel


To: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: [Xen-devel] Re: [PATCH] libxl: do slow resume after failed migration attempt
From: Ian Campbell <Ian.Campbell@xxxxxxxxxxxxx>
Date: Wed, 16 Feb 2011 11:49:17 +0000
Delivery-date: Wed, 16 Feb 2011 03:49:54 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <1728ed4bbec9e82ca13c.1297856876@xxxxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Organization: Citrix Systems, Inc.
References: <1728ed4bbec9e82ca13c.1297856876@xxxxxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
On Wed, 2011-02-16 at 11:47 +0000, Ian Campbell wrote:
> # HG changeset patch
> # User Ian Campbell <ian.campbell@xxxxxxxxxx>
> # Date 1297856874 0
> # Node ID 1728ed4bbec9e82ca13c2639c8e4ef8b4dc231b6
> # Parent  aa466613328f5de78fdfc968473cb06e948c1f5d
> libxl: do slow resume after failed migration attempt
> 
> Both of the current callers of libxl_domain_resume call it after a
> migration has failed: one after a failure to suspend on the sender,
> the other after a failure to start on the destination. Both lead to
> a resume attempt on the sender.
> 
> However in the first case, failure to suspend, there is no guarantee
> that the guest has made it as far as the suspend hypercall and
> therefore the fast resume method, which frobs the hypercall return to
> indicate a cancelled suspend, cannot safely be used since it will
> corrupt %eax/%rax.
> 
> For the second case, failure to start on destination, I don't think it
> really matters if the resume is fast or slow.
> 
> Therefore always use the slow/uncooperative version of xc_domain_resume from
> libxl_domain_resume.
> 
> This makes a PV domain which failed to suspend (e.g. because the core
> Linux PM infrastructure within the guest didn't allow it) recover
> gracefully.

A PVHVM domain never suffered from this because libxl_domain_resume
bails out early on a libxl__domain_is_hvm check. I'm not 100% sure
that behaviour is correct, but I didn't change it. My test with a PVHVM
guest which acknowledges the suspend but then does nothing further seems
to work.

Ian.

> 
> Signed-off-by: Ian Campbell <ian.campbell@xxxxxxxxxx>
> 
> diff -r aa466613328f -r 1728ed4bbec9 tools/libxl/libxl.c
> --- a/tools/libxl/libxl.c     Tue Feb 15 13:40:50 2011 +0000
> +++ b/tools/libxl/libxl.c     Wed Feb 16 11:47:54 2011 +0000
> @@ -226,7 +226,7 @@ int libxl_domain_resume(libxl_ctx *ctx, 
>          rc = ERROR_NI;
>          goto out;
>      }
> -    if (xc_domain_resume(ctx->xch, domid, 1)) {
> +    if (xc_domain_resume(ctx->xch, domid, 0)) {
>          LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, 
>                          "xc_domain_resume failed for domain %u", 
>                          domid);
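
To illustrate the reasoning in the changeset description, here is a toy model of why the fast path is unsafe when the guest never reached the suspend hypercall. It is only a sketch: struct toy_vcpu, toy_resume and toy_resume_is_safe are hypothetical names, not libxc or libxl code, and the value 1 written into rax as the "suspend cancelled" return code is an assumption of the model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the guest vCPU state; NOT a real Xen structure. */
struct toy_vcpu {
    uint64_t rax;              /* register that carries a hypercall return */
    bool in_suspend_hypercall; /* true only if the guest is blocked in the
                                  suspend hypercall */
};

/* Mirrors the last argument of xc_domain_resume in the diff: 1 = fast,
 * 0 = slow. */
enum toy_resume_mode { TOY_RESUME_SLOW = 0, TOY_RESUME_FAST = 1 };

/* Resume the toy guest. The fast path rewrites rax unconditionally, as if
 * a suspend hypercall were returning "cancelled" (assumed value: 1).
 * The slow path leaves the registers untouched. */
static void toy_resume(struct toy_vcpu *v, enum toy_resume_mode mode)
{
    assert(v != NULL);
    if (mode == TOY_RESUME_FAST)
        v->rax = 1;
}

/* Returns true if resuming in the given mode leaves the guest in a state
 * it can cope with. A guest blocked in the suspend hypercall expects rax
 * to be overwritten; any other guest expects rax to be preserved. */
static bool toy_resume_is_safe(const struct toy_vcpu *before,
                               enum toy_resume_mode mode)
{
    struct toy_vcpu after = *before;

    toy_resume(&after, mode);
    if (before->in_suspend_hypercall)
        return true;
    return after.rax == before->rax;
}
```

In this model, a guest with in_suspend_hypercall set to false and a live value in rax is only safe with TOY_RESUME_SLOW, which corresponds to the failed-suspend case the patch addresses. A guest that is blocked in the hypercall is safe in either mode, matching the remark that fast versus slow does not really matter for the failure-to-start-on-destination case.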


