
[Xen-devel] migration of pv guest fails from small to large host


  • To: xen-devel@xxxxxxxxxxxxxxxxxxx
  • From: Olaf Hering <olaf@xxxxxxxxx>
  • Date: Fri, 1 Jul 2011 12:41:48 +0200
  • Delivery-date: Fri, 01 Jul 2011 03:42:59 -0700
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

This issue was initially reported on differently sized HP ProLiant
systems running SLES11SP1 in dom0 and domU.

Migration of pv guests fails: the guest crashes on the target host once it is
unpaused after transit. This happens when the guest is started on a small
system and then migrated to a large system.
If the guest is started on a large system, then migrated to a small system and
back to the large system, the migration succeeds.


The symptoms on the target host differ with the systems I have access to,
which are listed below. It is not possible to take a core dump.
The pv guest has one vcpu and 256MB, one network interface and a disk.


I currently have no idea what to look for. The xenctx patch for dumping
pagetables showed no differences between the src and dst guest after transit
to the target host (I still have to verify this on my hosts).


involved hardware:

bolen: ProLiant DL580 G7, 32GB, CPU E7540 @ 2.00GHz
falla: ProLiant DL360 G6, 8GB, CPU E5540 @ 2.53GHz 
drnek: ProLiant DL170h G6, 6GB, CPU E5504 @ 2.00GHz
gubaidulina: Intel SDV S3E37, 192GB, CPU 000 @ 2.40GHz (unknown cpu 0x206f1)

(Other target hosts from different vendors with large amounts of memory were
reported to fail as well.)
I am still trying to test a non-HP system as source host.

involved software:
host: sles11sp1, xen 4.0. Also xen-unstable 4.2 hg rev23640
pv guest: sles11sp1


migration with this command on bolen, falla, drnek:
"xm migrate sles11sp_para_1 gubaidulina" fails on gubaidulina:

[2011-06-30 21:21:32 21858] WARNING (XendDomainInfo:2061) Domain has crashed: name=sles11sp1_para_1 id=1.
[2011-06-30 21:21:32 21858] ERROR (XendDomainInfo:2318) core dump failed: id = 1 name = sles11sp1_para_1: (1, 'Internal error', "Couldn't map p2m_frame_list_list (errno 1) (1 = Operation not permitted)")
[2011-06-30 21:21:32 21858] DEBUG (XendDomainInfo:3084) XendDomainInfo.destroy: domid=1
[2011-06-30 21:21:32 21858] DEBUG (XendDomainInfo:2403) Destroying device model
[2011-06-30 21:21:32 21858] INFO (image:702) sles11sp1_para_1 device model terminated

xm dmesg shows no errors.

notes from a "bisect" with limiting Xen memory:
gubaidulina booted with mem=64G, migration from bolen succeeds.
gubaidulina booted with mem=96G, migration from bolen fails.
gubaidulina booted with mem=80G, migration from bolen fails.
gubaidulina booted with mem=72G, migration from bolen fails.
gubaidulina booted with mem=68G, migration from bolen fails.
gubaidulina booted with mem=65G, migration from bolen succeeds.
now testing more after migration:
gubaidulina booted with mem=66G, migration from bolen fails, no coredump 
message, no coredump
gubaidulina booted with mem=66G, second migration from bolen succeeds. xm 
shutdown crashes guest, no coredump
gubaidulina booted with mem=65G, migration from bolen succeeds. xm shutdown 
succeeds
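
To make the notes above easier to read at a glance, the results can be
tabulated in a few lines of Python. This is only a rough sketch: the RESULTS
list and the boundary() helper are illustrative names, not part of any Xen
tooling, and the values are simply transcribed from the notes above.

```python
# (mem= value in GB, did the migration from bolen succeed?)
# Transcribed by hand from the notes above; order matches the test order.
RESULTS = [
    (64, True), (96, False), (80, False), (72, False), (68, False),
    (65, True), (66, False), (66, True), (65, True),
]


def boundary(results):
    """Return (largest size where every attempt succeeded,
    smallest size where at least one attempt failed)."""
    failed = {mem for mem, ok in results if not ok}
    # A size only counts as "good" if it never failed.
    passed = {mem for mem, ok in results if ok} - failed
    return max(passed), min(failed)


if __name__ == "__main__":
    good, bad = boundary(RESULTS)
    print("largest always-good mem=%dG, smallest failing mem=%dG" % (good, bad))
```

With the data above, the boundary falls between mem=65G and mem=66G, and
mem=66G is the only size that both failed and succeeded.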

Olaf



 

