WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
Xen 
 
Home Products Support Community News
 
   
 

xen-devel

Re: [Xen-devel] pv guests die after failed migration

To: Ian Campbell <Ian.Campbell@xxxxxxxxxx>
Subject: Re: [Xen-devel] pv guests die after failed migration
From: Andreas Olsowski <andreas.olsowski@xxxxxxxxxxx>
Date: Fri, 23 Sep 2011 11:15:48 +0200
Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Fri, 23 Sep 2011 02:17:22 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <1316764045.23371.100.camel@xxxxxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <4E786015.80603@xxxxxxxxxxx> <1316546879.5182.26.camel@xxxxxxxxxxxxxxxxxxxx> <4E7C37BD.2000706@xxxxxxxxxxx> <1316764045.23371.100.camel@xxxxxxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.21) Gecko/20110831 Icedove/3.1.13
On 09/23/2011 09:47 AM, Ian Campbell wrote:
This seems to be taking the non-cancelled resume path, does this patch
help at all:

diff -r d7b14b76f1eb tools/libxl/libxl.c
--- a/tools/libxl/libxl.c       Thu Sep 22 14:26:08 2011 +0100
+++ b/tools/libxl/libxl.c       Fri Sep 23 08:45:28 2011 +0100
@@ -246,7 +246,7 @@ int libxl_domain_resume(libxl_ctx *ctx,
          rc = ERROR_NI;
          goto out;
      }
-    if (xc_domain_resume(ctx->xch, domid, 0)) {
+    if (xc_domain_resume(ctx->xch, domid, 1)) {
          LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR,
                          "xc_domain_resume failed for domain %u",
                          domid);

I don't think that's a solution but if this patch works then it may
indicate a problem with xc_domain_resume_any.


For the curent xen-4.1-testing.hg the patch had to be modified to a different position:

--- a/tools/libxl/libxl.c  Thu Sep 22 14:26:08 2011 +0100
+++ b/tools/libxl/libxl.c  Fri Sep 23 08:45:28 2011 +0100
@@ -229,7 +229,7 @@
         rc = ERROR_NI;
         goto out;
     }
-    if (xc_domain_resume(ctx->xch, domid, 0)) {
+    if (xc_domain_resume(ctx->xch, domid, 1)) {
         LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR,
                         "xc_domain_resume failed for domain %u",
                         domid);


I did a clean/make/install after that, compilation worked fine.

I then tested the migration towards an unsuitable target again and it did what you thought it could do. The guest resumes correctly at sender

###############
root@xenturio1:~# xl -vvv migrate thishopefullywontfail xenturio2
Saving to migration stream new xl format (info 0x0/0x0/407)
migration target: Ready to receive domain.
Loading new save file incoming migration stream (new xl fmt info 0x0/0x0/407)
 Savefile contains xl domain config
xc: detail: Had 0 unexplained entries in p2m table
xc: Saving memory: iter 0 (last sent 0 skipped 0): 133120/133120  100%
xc: detail: delta 9529ms, dom0 93%, target 1%, sent 449Mb/s, dirtied 1Mb/s 502 pages xc: Saving memory: iter 1 (last sent 130592 skipped 480): 133120/133120 100% xc: detail: delta 37ms, dom0 91%, target 2%, sent 444Mb/s, dirtied 30Mb/s 34 pages xc: Saving memory: iter 2 (last sent 502 skipped 0): 133120/133120 100%
xc: detail: Start last iteration
libxl: debug: libxl_dom.c:384:libxl__domain_suspend_common_callback issuing PV suspend request via XenBus control node libxl: debug: libxl_dom.c:389:libxl__domain_suspend_common_callback wait for the guest to acknowledge suspend request libxl: debug: libxl_dom.c:434:libxl__domain_suspend_common_callback guest acknowledged suspend request libxl: debug: libxl_dom.c:438:libxl__domain_suspend_common_callback wait for the guest to suspend libxl: debug: libxl_dom.c:450:libxl__domain_suspend_common_callback guest has suspended
xc: detail: SUSPEND shinfo 0007fafc
xc: detail: delta 204ms, dom0 3%, target 0%, sent 4Mb/s, dirtied 25Mb/s 156 pages
xc: Saving memory: iter 3 (last sent 30 skipped 4): 133120/133120  100%
xc: detail: delta 3ms, dom0 0%, target 0%, sent 1703Mb/s, dirtied 1703Mb/s 156 pages
xc: detail: Total pages sent= 131280 (0.99x)
xc: detail: (of which 0 were fixups)
xc: detail: All memory is saved
xc: detail: Save exit rc=0
libxl: error: libxl.c:900:validate_virtual_disk failed to stat /dev/xen-data/thishopefullywontfail-root: No such file or directory
cannot add disk 0 to domain: -6
migration target: Domain creation failed (code -3).
libxl: error: libxl_utils.c:408:libxl_read_exactly file/stream truncated reading ready message from migration receiver stream libxl: info: libxl_exec.c:72:libxl_report_child_exitstatus migration target process [16608] exited with error status 3
Migration failed, resuming at sender.


root@xenturio1:~# xl console thishopefullywontfail
PM: freeze of devices complete after 0.197 msecs
PM: late freeze of devices complete after 0.067 msecs
PM: early thaw of devices complete after 0.074 msecs
PM: thaw of devices complete after 0.077 msecs

root@thishopefullywontfail:~#
#####################

So that works


There is no mention of the migration failing in the guest log though, maybe when a final patch is made it should log that failing migration?



with best regards



andreas

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel