WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
xen-users

[Xen-users] Re: Live migration failed

To: xen-users@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-users] Re: Live migration failed
From: Irwan Hadi <iblist18@xxxxxxxxx>
Date: Tue, 22 Dec 2009 19:14:52 -0700
In-reply-to: <d3a3de0f0912221758n7f53995p24435360f357838a@xxxxxxxxxxxxxx>
References: <d3a3de0f0912221758n7f53995p24435360f357838a@xxxxxxxxxxxxxx>
Sender: xen-users-bounces@xxxxxxxxxxxxxxxxxxx
Actually, after further research, it looks like this may be a known
issue that occurs when the originating dom0 has more memory than the
receiving dom0.
In my case, the vmhost1 dom0 has much more memory, and I was in the
process of standardizing the dom0 memory setting in grub.conf
when I hit this bug.
I suppose that until this bug is fixed, I will have to use xm save/restore so
that the domU won't crash...

Something weird I found, though, is that after the domU has crashed and
rebooted, live migration of it then works fine.
Has anyone else had the same issue?
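
As a rough sketch of the xm save/restore fallback mentioned above, the Python helper below only builds the two commands and does not run them. `xm save` and `xm restore` are the regular xm subcommands; the checkpoint file name, the use of ssh to reach the target dom0, and the assumption that both hosts see the same path (for example on the shared NFS volume) are illustrative and not taken from the original report. Unlike live migration, the domU is not running between the save and the restore.

```python
import shlex


def save_restore_commands(domain, checkpoint_path, target_host):
    """Build (but do not run) the commands for a save-then-restore move.

    Assumptions (not from the original post): the target dom0 is reachable
    over ssh, and checkpoint_path is visible at the same location on both hosts.
    """
    # Run on the originating host: suspend the domU and write its state to a file.
    save_cmd = ["xm", "save", domain, checkpoint_path]
    # Run against the receiving host: recreate the domU from that file.
    restore_cmd = ["ssh", target_host, "xm", "restore", checkpoint_path]
    return [save_cmd, restore_cmd]


if __name__ == "__main__":
    # "/nfsvol1/domu1.chk" is a hypothetical checkpoint path.
    for cmd in save_restore_commands("domu1", "/nfsvol1/domu1.chk", "vmhost2"):
        print(" ".join(shlex.quote(part) for part in cmd))
```

Keeping the commands as lists makes it easy to review them first and hand them to `subprocess.run` later if that fits the environment.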

https://bugzilla.redhat.com/show_bug.cgi?id=511135
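
Since the suspected trigger is a dom0 memory mismatch between the two hosts, one way to compare them is to pull the setting out of each host's grub.conf. The sketch below assumes the dom0 memory is set with the Xen `dom0_mem=` boot option on the hypervisor line; the sample grub.conf contents are hypothetical, and the values are compared as raw strings without any unit conversion.

```python
import re

# Matches a dom0_mem=<value> token anywhere on a line.
DOM0_MEM = re.compile(r"\bdom0_mem=(\S+)")


def dom0_mem_settings(grub_conf_text):
    """Return the dom0_mem values found on non-comment lines, in file order."""
    values = []
    for line in grub_conf_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # skip commented-out boot entries
        values.extend(DOM0_MEM.findall(stripped))
    return values


if __name__ == "__main__":
    # Hypothetical grub.conf fragments for the two hosts.
    vmhost1 = "title Xen\n\tkernel /xen.gz-3.4.0 dom0_mem=8192M\n"
    vmhost2 = "title Xen\n\tkernel /xen.gz-3.4.0 dom0_mem=2048M\n"
    a, b = dom0_mem_settings(vmhost1), dom0_mem_settings(vmhost2)
    print("match" if a == b else "mismatch: %s vs %s" % (a, b))
```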


On Tue, Dec 22, 2009 at 6:58 PM, Irwan Hadi <iblist18@xxxxxxxxx> wrote:
> I'm trying to do live migration between two Xen hosts. Both are
> running CentOS 5.4, and both are running Xen 3.4.0 from gitco.
> The backend storage is NFS served by a Network Appliance filer.
>
> The issue is that the live migration sometimes fails, and the domain
> being migrated crashes and reboots. The error that I get is as
> follows.
> Does anyone know what is causing it?
>
>
> ============================================================
> # xm migrate --live domu1  vmhost2
> Error: /usr/lib64/xen/bin/xc_save 16 72 0 0 1 failed
> Usage: xm migrate <Domain> <Host>
>
> Migrate a domain to another machine.
>
> Options:
>
> -h, --help           Print this help.
> -l, --live           Use live migration.
> -p=portnum, --port=portnum
>                     Use specified port for migration.
> -n=nodenum, --node=nodenum
>                     Use specified NUMA node on target.
> -s, --ssl            Use ssl connection for migration.
>
> #
>
> ============================================================
> at vmhost1 (the originating VM host)
> [2009-12-22 18:41:39 6417] DEBUG (balloon:172) Balloon: 4936 KiB free;
> 0 to scrub; need 18432; retries: 20.
> [2009-12-22 18:41:39 6417] DEBUG (balloon:187) Balloon: setting dom0
> target to 8671 MiB.
> [2009-12-22 18:41:39 6417] DEBUG (XendDomainInfo:1302) Setting memory
> target of domain Domain-0 (0) to 8671 MiB.
> [2009-12-22 18:41:40 6417] DEBUG (balloon:166) Balloon: 19400 KiB
> free; need 18432; done.
> [2009-12-22 18:41:40 6417] DEBUG (XendCheckpoint:110) [xc_save]:
> /usr/lib64/xen/bin/xc_save 16 72 0 0 1
> [2009-12-22 18:41:40 6417] INFO (XendCheckpoint:417) xc_save: failed
> to get the suspend evtchn port
> [2009-12-22 18:41:40 6417] INFO (XendCheckpoint:417)
> [2009-12-22 18:41:40 6417] INFO (XendCheckpoint:417) Had 0 unexplained
> entries in p2m table
> [2009-12-22 18:41:55 6417] INFO (XendCheckpoint:417) Saving memory pages: iter 1  95%^M 1: sent 510088, skipped 3959, delta 15263ms, dom0 58%, target 0%, sent 1095Mb/s, dirtied 13Mb/s 6431 pages
> [2009-12-22 18:41:55 6417] INFO (XendCheckpoint:417) Saving memory pages: iter 2  98%^M 2: sent 6370, skipped 26, delta 197ms, dom0 48%, target 0%, sent 1059Mb/s, dirtied 11Mb/s 71 pages
> [2009-12-22 18:41:55 6417] INFO (XendCheckpoint:417) Saving memory pages: iter 3   0%^M 3: sent 71, skipped 0, delta 7ms, dom0 100%, target 0%, sent 332Mb/s, dirtied 0Mb/s 0 pages
> [2009-12-22 18:41:55 6417] INFO (XendCheckpoint:417) Saving memory pages: iter 4   0%^M 4: sent 0, skipped 0, Start last iteration
> [2009-12-22 18:41:55 6417] DEBUG (XendCheckpoint:388) suspend
> [2009-12-22 18:41:55 6417] DEBUG (XendCheckpoint:113) In
> saveInputHandler suspend
> [2009-12-22 18:41:55 6417] DEBUG (XendCheckpoint:115) Suspending 72 ...
> [2009-12-22 18:41:55 6417] DEBUG (XendDomainInfo:511)
> XendDomainInfo.shutdown(suspend)
> [2009-12-22 18:41:55 6417] DEBUG (XendDomainInfo:1708)
> XendDomainInfo.handleShutdownWatch
> [2009-12-22 18:41:55 6417] DEBUG (XendDomainInfo:1708)
> XendDomainInfo.handleShutdownWatch
> [2009-12-22 18:41:55 6417] WARNING (XendDomainInfo:1877) Domain has
> crashed: name=migrating-domu1 id=72.
> [2009-12-22 18:41:55 6417] DEBUG (XendDomainInfo:2723)
> XendDomainInfo.destroy: domid=72
> [2009-12-22 18:41:55 6417] INFO (XendCheckpoint:121) Domain 72 suspended.
> [2009-12-22 18:41:55 6417] DEBUG (XendCheckpoint:130) Written done
> [2009-12-22 18:41:55 6417] INFO (XendCheckpoint:417) ERROR Internal
> error: Domain not in suspended state
> [2009-12-22 18:41:55 6417] INFO (XendCheckpoint:417) ERROR Internal
> error: Domain appears not to have suspended
> [2009-12-22 18:41:56 6417] INFO (XendCheckpoint:417) Save exit rc=1
> [2009-12-22 18:41:56 6417] ERROR (XendCheckpoint:164) Save failed on
> domain domu1 (72) - resuming.
> Traceback (most recent call last):
>  File "/usr/lib64/python2.4/site-packages/xen/xend/XendCheckpoint.py",
> line 132, in save
>    forkHelper(cmd, fd, saveInputHandler, False)
>  File "/usr/lib64/python2.4/site-packages/xen/xend/XendCheckpoint.py",
> line 405, in forkHelper
>    raise XendError("%s failed" % string.join(cmd))
> XendError: /usr/lib64/xen/bin/xc_save 16 72 0 0 1 failed
> [2009-12-22 18:41:56 6417] DEBUG (XendDomainInfo:2779)
> XendDomainInfo.resumeDomain(72)
> [2009-12-22 18:41:56 6417] DEBUG (XendDomainInfo:2198) Destroying device model
> [2009-12-22 18:41:56 6417] DEBUG (XendDomainInfo:2205) Releasing devices
> [2009-12-22 18:41:56 6417] DEBUG (XendDomainInfo:2218) Removing vif/0
> [2009-12-22 18:41:56 6417] DEBUG (XendDomainInfo:1133)
> XendDomainInfo.destroyDevice: deviceClass = vif, device = vif/0
> [2009-12-22 18:41:56 6417] DEBUG (XendDomainInfo:2218) Removing vbd/51712
> [2009-12-22 18:41:56 6417] DEBUG (XendDomainInfo:1133)
> XendDomainInfo.destroyDevice: deviceClass = tap, device = vbd/51712
> [2009-12-22 18:41:56 6417] DEBUG (XendDomainInfo:2218) Removing console/0
> [2009-12-22 18:41:56 6417] DEBUG (XendDomainInfo:1133)
> XendDomainInfo.destroyDevice: deviceClass = console, device =
> console/0
> [2009-12-22 18:41:56 6417] DEBUG (XendDomainInfo:2203) No device model
> [2009-12-22 18:41:56 6417] DEBUG (XendDomainInfo:2205) Releasing devices
>
>
> ============================================================
> at vmhost2 (the receiving VM host)
>
> [2009-12-22 18:41:40 6582] DEBUG (XendDomainInfo:2295)
> XendDomainInfo.constructDomain
> [2009-12-22 18:41:40 6582] DEBUG (balloon:166) Balloon: 97457736 KiB
> free; need 4096; done.
> [2009-12-22 18:41:40 6582] DEBUG (XendDomain:452) Adding Domain: 1
> [2009-12-22 18:41:40 6582] DEBUG (XendDomainInfo:3051) Storing VM details: {'on_xend_stop': 'ignore', 'shadow_memory': '0', 'uuid': '8ca7c0ac-e8ce-eaf5-fe38-70374c6cc1af', 'on_reboot': 'restart', 'start_time': '1261015586.75', 'on_poweroff': 'destroy', 'bootloader_args': '', 'on_xend_start': 'ignore', 'on_crash': 'restart', 'xend/restart_count': '0', 'vcpus': '2', 'vcpu_avail': '3', 'bootloader': '/usr/bin/pygrub', 'image': "(linux (kernel ) (notes (FEATURES 'writable_page_tables|writable_descriptor_tables|auto_translated_physmap|pae_pgdir_above_4gb|supervisor_mode_kernel') (VIRT_BASE 18446744071562067968) (GUEST_VERSION 2.6) (PADDR_OFFSET 18446744071562067968) (GUEST_OS linux) (HYPERCALL_PAGE 18446744071564189696) (LOADER generic) (ENTRY 18446744071564165120) (XEN_VERSION xen-3.0)))", 'name': 'domu1'}
> [2009-12-22 18:41:40 6582] INFO (XendDomainInfo:2159) createDevice: console : {'protocol': 'vt100', 'location': '2', 'uuid': '9e2b597a-e688-787e-bacf-50a04d624fc1'}
> [2009-12-22 18:41:40 6582] DEBUG (DevController:95) DevController: writing {'state': '1', 'backend-id': '0', 'backend': '/local/domain/0/backend/console/1/0'} to /local/domain/1/device/console/0.
> [2009-12-22 18:41:40 6582] DEBUG (DevController:97) DevController: writing {'domain': 'domu1', 'frontend': '/local/domain/1/device/console/0', 'uuid': '9e2b597a-e688-787e-bacf-50a04d624fc1', 'frontend-id': '1', 'state': '1', 'location': '2', 'online': '1', 'protocol': 'vt100'} to /local/domain/0/backend/console/1/0.
> [2009-12-22 18:41:40 6582] INFO (XendDomainInfo:2159) createDevice: tap : {'protocol': 'x86_64-abi', 'uuid': '42d54a8b-b537-6cac-966f-8c437a6717c4', 'bootable': '1', 'dev': 'xvda:disk', 'uname': 'tap:aio:/nfsvol1/domu1', 'mode': 'w', 'backend': '0'}
> [2009-12-22 18:41:40 6582] DEBUG (DevController:95) DevController: writing {'virtual-device': '51712', 'protocol': 'x86_64-abi', 'device-type': 'disk', 'backend-id': '0', 'state': '1', 'backend': '/local/domain/0/backend/tap/1/51712'} to /local/domain/1/device/vbd/51712.
> [2009-12-22 18:41:40 6582] DEBUG (DevController:97) DevController: writing {'domain': 'domu1', 'frontend': '/local/domain/1/device/vbd/51712', 'uuid': '42d54a8b-b537-6cac-966f-8c437a6717c4', 'bootable': '1', 'dev': 'xvda', 'state': '1', 'params': 'aio:/nfsvol1/domu1', 'mode': 'w', 'online': '1', 'frontend-id': '1', 'type': 'tap'} to /local/domain/0/backend/tap/1/51712.
> [2009-12-22 18:41:40 6582] INFO (XendDomainInfo:2159) createDevice: vif : {'bridge': 'xenbr212', 'mac': '00:16:3e:04:0d:39', 'script': '/etc/xen/scripts/vif-bridge', 'uuid': '747b1a8a-3e77-682b-d70a-b8673581b6c8', 'backend': '0'}
> [2009-12-22 18:41:40 6582] DEBUG (DevController:95) DevController: writing {'backend-id': '0', 'mac': '00:16:3e:04:0d:39', 'handle': '0', 'state': '1', 'backend': '/local/domain/0/backend/vif/1/0'} to /local/domain/1/device/vif/0.
> [2009-12-22 18:41:40 6582] DEBUG (DevController:97) DevController: writing {'bridge': 'xenbr212', 'domain': 'domu1', 'handle': '0', 'uuid': '747b1a8a-3e77-682b-d70a-b8673581b6c8', 'script': '/etc/xen/scripts/vif-bridge', 'mac': '00:16:3e:04:0d:39', 'frontend-id': '1', 'state': '1', 'online': '1', 'frontend': '/local/domain/1/device/vif/0'} to /local/domain/0/backend/vif/1/0.
> [2009-12-22 18:41:40 6582] DEBUG (DevController:95) DevController: writing {'backend-id': '0', 'mac': '00:16:3e:04:0d:39', 'handle': '0', 'state': '1', 'backend': '/local/domain/0/backend/vif/1/0'} to /local/domain/1/device/vif/0.
> [2009-12-22 18:41:40 6582] DEBUG (DevController:97) DevController: writing {'bridge': 'xenbr212', 'domain': 'domu1', 'handle': '0', 'uuid': '747b1a8a-3e77-682b-d70a-b8673581b6c8', 'script': '/etc/xen/scripts/vif-bridge', 'mac': '00:16:3e:04:0d:39', 'frontend-id': '1', 'state': '1', 'online': '1', 'frontend': '/local/domain/1/device/vif/0'} to /local/domain/0/backend/vif/1/0.
> [2009-12-22 18:41:40 6582] DEBUG (XendDomainInfo:1621) Storing domain details: {'image/entry': '18446744071564165120', 'console/port': '2', 'image/loader': 'generic', 'vm': '/vm/8ca7c0ac-e8ce-eaf5-fe38-70374c6cc1af', 'control/platform-feature-multiprocessor-suspend': '1', 'image/guest-os': 'linux', 'cpu/1/availability': 'online', 'image/features/writable-descriptor-tables': '1', 'image/virt-base': '18446744071562067968', 'memory/target': '2097152', 'image/guest-version': '2.6', 'image/features/supervisor-mode-kernel': '1', 'console/limit': '1048576', 'image/paddr-offset': '18446744071562067968', 'image/hypercall-page': '18446744071564189696', 'cpu/0/availability': 'online', 'image/features/pae-pgdir-above-4gb': '1', 'image/features/writable-page-tables': '1', 'console/type': 'xenconsoled', 'image/features/auto-translated-physmap': '1', 'name': 'domu1', 'domid': '1', 'image/xen-version': 'xen-3.0', 'store/port': '1'}
> [2009-12-22 18:41:40 6582] DEBUG (XendCheckpoint:261)
> restore:shadow=0x0, _static_max=0x100000000, _static_min=0x0,
> [2009-12-22 18:41:40 6582] DEBUG (balloon:166) Balloon: 97457616 KiB
> free; need 2097152; done.
> [2009-12-22 18:41:40 6582] DEBUG (XendCheckpoint:278) [xc_restore]:
> /usr/lib64/xen/bin/xc_restore 16 1 1 2 0 0 0
> [2009-12-22 18:41:40 6582] INFO (XendCheckpoint:417) xc_domain_restore
> start: p2m_size = 7d800
> [2009-12-22 18:41:40 6582] INFO (XendCheckpoint:417) Reloading memory
> pages:   0%
> [2009-12-22 18:41:56 6582] INFO (XendCheckpoint:417) ERROR Internal
> error: Error when reading batch size
> [2009-12-22 18:41:57 6582] INFO (XendCheckpoint:417) Restore exit with rc=1
> [2009-12-22 18:41:57 6582] DEBUG (XendDomainInfo:2723)
> XendDomainInfo.destroy: domid=1
> [2009-12-22 18:41:57 6582] ERROR (XendDomainInfo:2737)
> XendDomainInfo.destroy: domain destruction failed.
> Traceback (most recent call last):
>  File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py",
> line 2730, in destroy
>    xc.domain_pause(self.domid)
> Error: (3, 'No such process')
> [2009-12-22 18:41:57 6582] DEBUG (XendDomainInfo:2203) No device model
> [2009-12-22 18:41:57 6582] DEBUG (XendDomainInfo:2205) Releasing devices
> [2009-12-22 18:41:57 6582] DEBUG (XendDomainInfo:1133)
> XendDomainInfo.destroyDevice: deviceClass = console, device =
> console/0
> [2009-12-22 18:41:57 6582] ERROR (XendDomain:1149) Restore failed
> Traceback (most recent call last):
>  File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomain.py",
> line 1147, in domain_restore_fd
>    return XendCheckpoint.restore(self, fd, paused=paused,
> relocating=relocating)
>  File "/usr/lib64/python2.4/site-packages/xen/xend/XendCheckpoint.py",
> line 282, in restore
>    forkHelper(cmd, fd, handler.handler, True)
>  File "/usr/lib64/python2.4/site-packages/xen/xend/XendCheckpoint.py",
> line 405, in forkHelper
>    raise XendError("%s failed" % string.join(cmd))
> XendError: /usr/lib64/xen/bin/xc_restore 16 1 1 2 0 0 0 failed
> '], ['VDI']]]])
>
> ============================================================
>
>
> Thanks
>
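
The xend.log excerpts quoted above are long, and the lines that matter (the "Domain has crashed" warning and the xc_save/xc_restore errors) are easy to miss among the DEBUG entries. A minimal sketch of a filter, assuming the `[timestamp pid] LEVEL (module:line) message` layout seen in these excerpts and an unquoted copy of the log (no leading `> `); the function names are made up for illustration.

```python
import re

# One xend.log entry header: [date time pid] LEVEL (module:line) message
ENTRY = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (?P<pid>\d+)\] "
    r"(?P<level>[A-Z]+) \((?P<src>[^)]+)\) ?(?P<msg>.*)$"
)


def parse_xend_log(text):
    """Split log text into entries; non-header lines extend the previous entry."""
    entries = []
    for line in text.splitlines():
        match = ENTRY.match(line)
        if match:
            entries.append(match.groupdict())
        elif entries:
            # Wrapped message text or a traceback line.
            entries[-1]["msg"] += "\n" + line
    return entries


def problems(entries):
    """Keep WARNING/ERROR entries, plus any entry whose message contains 'ERROR'."""
    return [
        e for e in entries
        if e["level"] in ("WARNING", "ERROR") or "ERROR" in e["msg"]
    ]


if __name__ == "__main__":
    sample = (
        "[2009-12-22 18:41:55 6417] DEBUG (XendCheckpoint:388) suspend\n"
        "[2009-12-22 18:41:55 6417] WARNING (XendDomainInfo:1877) Domain has\n"
        "crashed: name=migrating-domu1 id=72.\n"
        "[2009-12-22 18:41:55 6417] INFO (XendCheckpoint:417) ERROR Internal\n"
        "error: Domain not in suspended state\n"
    )
    for entry in problems(parse_xend_log(sample)):
        print(entry["ts"], entry["level"], entry["msg"].replace("\n", " "))
```

The second condition in `problems` is there because the helper output from xc_save and xc_restore is logged at INFO level even when the text itself starts with "ERROR".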

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users
