To: <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: [Xen-devel] Live migration fails when available memory exactly equal to required memory on target system
From: "Graham, Simon" <Simon.Graham@xxxxxxxxxxx>
Date: Mon, 17 Jul 2006 15:17:19 -0400
Delivery-date: Mon, 17 Jul 2006 12:20:04 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: Acap1Z+TcHSMBHAUSC+IUHm0x1ExXg==
Thread-topic: Live migration fails when available memory exactly equal to required memory on target system

In diagnosing live migration failures (with 3.0.testing), I have noticed
that a common failure is a lack of resources on the target system _and_
that this only seems to happen when the available resources at the time
of the migration are exactly what is required for the VM being migrated.
For example, here's a xend.log extract from a failed case:

[2006-07-17 14:38:56 xend] DEBUG (balloon:128) Balloon: free 265; need 265; done.
[2006-07-17 14:38:56 xend] DEBUG (XendCheckpoint:148) [xc_restore]: /usr/lib/xen/bin/xc_restore 10 4 112 67584 1 2
[2006-07-17 14:38:57 xend] ERROR (XendCheckpoint:242) xc_linux_restore start: max_pfn = 10800
[2006-07-17 14:38:57 xend] ERROR (XendCheckpoint:242) Failed allocation for dom 112: 67584 pages order 0 addr_bits 0
[2006-07-17 14:38:57 xend] ERROR (XendCheckpoint:242) Failed to increase reservation by 42000 KB: 12
[2006-07-17 14:38:57 xend] ERROR (XendCheckpoint:242) Restore exit with rc=1

The nr_pfns parameter to xc_restore (67584 pages) shows that we need 264MB;
balloon.py adds a 1MB slop to that to come up with the 265 figure.
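
For reference, here's a quick back-of-the-envelope check of those numbers
(assuming 4KB pages). The trailing "12" on the failed reservation looks like
errno 12 (ENOMEM), and the "42000 KB" in the log only makes sense to me if it
is printed in hex -- that's just my guess, but the arithmetic does line up:

    PAGE_KB = 4
    pages = 67584                       # nr_pfns passed to xc_restore
    needed_kb = pages * PAGE_KB         # 270336 KB
    needed_mb = needed_kb // 1024       # 264 MB -- what the log calls "need"
    target_mb = needed_mb + 1           # + balloon.py's 1MB slop = 265 MB
    assert hex(needed_kb) == '0x42000'  # cf. "42000 KB" in the log, if hex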

Immediately following this failed attempt, I tried again:

[2006-07-17 14:38:58 xend] DEBUG (balloon:134) Balloon: free 264; need 265; retries: 10.
[2006-07-17 14:38:58 xend] DEBUG (balloon:143) Balloon: setting dom0 target to 1235.
[2006-07-17 14:38:58 xend.XendDomainInfo] DEBUG (XendDomainInfo:945) Setting memory target of domain Domain-0 (0) to 1235 MiB.
[2006-07-17 14:38:58 xend] DEBUG (balloon:128) Balloon: free 265; need 265; done.
[2006-07-17 14:38:58 xend] DEBUG (XendCheckpoint:148) [xc_restore]: /usr/lib/xen/bin/xc_restore 10 4 113 67584 1 2
[2006-07-17 14:38:59 xend] ERROR (XendCheckpoint:242) xc_linux_restore start: max_pfn = 10800
[2006-07-17 14:38:59 xend] ERROR (XendCheckpoint:242) Increased domain reservation by 42000 KB

This time we can see that there was only 264MB free, so we had to kick the
balloon driver to free up 1MB; once that was done (and we again had exactly
265MB free), we were able to increase the reservation for the target DomU to
the requested amount...
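
In case it helps anyone follow the sequence above, the balloon lines in the
log correspond roughly to logic like the following. This is just my sketch of
what balloon.py appears to be doing, not the actual code, and the helper
names are made up:

    import time

    # Illustrative stand-ins for xend's physinfo / dom0-target plumbing.
    def get_free_memory_mb():
        return 264                   # e.g. the value seen in the log

    def get_dom0_target_mb():
        return 1236

    def set_dom0_memory_target(mb):
        pass                         # xend would push the new dom0 target here

    def free_memory_for_domain(need_mb, retries=10):
        # Shrink dom0 until at least need_mb MiB is free, retrying a few times.
        for _ in range(retries):
            free_mb = get_free_memory_mb()
            if free_mb >= need_mb:
                return True          # "Balloon: free X; need X; done."
            # Not enough free memory: lower dom0's target by the shortfall.
            set_dom0_memory_target(get_dom0_target_mb() - (need_mb - free_mb))
            time.sleep(1)            # give the balloon driver time to act
        return False

With need_mb=265 and 264MB free, that ends up lowering the dom0 target by
1MB, which matches the "setting dom0 target to 1235" line in the second
extract.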

The above is fairly reproducible, but I'm not sure where to go next to figure
out where the issue really is (or, indeed, whether there really is an issue --
maybe this is just one of those inherently racy things). However, I find it
odd that it only seems to happen when the amount initially free is exactly
equal to the amount required; I have plenty of other cases where there is far
more or far less memory available, all of which seem to work just fine.

Any suggestions?
/simgr


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
