Hi list,
I seem to have hit a bug that has been reported a few times on this
list, but there is no entry for it in Bugzilla and no one seems to have
reported a resolution.
I have a three-node RHEL cluster running some paravirtualised virtual
machines, each using a CLVM logical volume as its block storage.
There are no cluster file systems involved, and the block device
for each virtual machine is accessible on all three dom0 servers.
All dom0s and domUs are x86_64 RHEL 5.2 (I also tried CentOS 5.2).
Live migration works perfectly when there's only one virtual machine
involved. However, if two virtual machines are running on one server
and I try to migrate one away to another server, xend starts to
migrate the state (copies all the memory, etc.) and then I get this
error on the domU console:
WARNING: g.e. still in use!
WARNING: leaking g.e. and page still in use!
WARNING: g.e. still in use!
WARNING: leaking g.e. and page still in use!
netif_release_rx_bufs: 0 xfer, 62 noxfer, 194 unused
WARNING: g.e. still in use!
WARNING: leaking g.e. and page still in use!
Apologies for the long email, but I'll also include below the xend.log
output from the source dom0 server. I've seen this before on the list
and it always relates to network-based shared storage, whether that's
iSCSI, DRBD or GNBD (my case). As far as I can tell, the migration
works fine and the VM's state transfers completely but then has a
problem trying to relinquish device 51712 (which is the xvda disk).
The 'exception looking up device number for xvda' message also makes
me suspicious.
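
For reference, here is a minimal sketch of why device 51712 corresponds to xvda. It assumes the classic Linux device number encoding (major * 256 + minor), major number 202 for xvd devices, and 16 minors per disk. The function name is made up for illustration and is not part of xend.

```python
import re

XVD_MAJOR = 202        # Linux major number for Xen virtual block devices
MINORS_PER_DISK = 16   # minor 0 is the whole disk, 1-15 are partitions


def xvd_device_number(name):
    """Return the legacy (major << 8 | minor) number for e.g. 'xvda'.

    Hypothetical helper: only handles single-letter names xvda..xvdp.
    """
    match = re.match(r"^xvd([a-p])(\d*)$", name)
    if match is None:
        raise ValueError("not a simple xvd device name: %r" % name)
    disk_index = ord(match.group(1)) - ord("a")
    partition = int(match.group(2) or 0)
    if partition >= MINORS_PER_DISK:
        raise ValueError("partition number out of range: %r" % name)
    minor = disk_index * MINORS_PER_DISK + partition
    return (XVD_MAJOR << 8) | minor


print(xvd_device_number("xvda"))  # 51712, the number seen in xend.log
```

Under the same assumptions, xvdb would be 51728 and xvda1 would be 51713.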
Any help is much appreciated!
Regards,
Tom
xend.log output follows:
[2008-08-24 00:27:43 xend 5252] DEBUG (balloon:127) Balloon: 26652 KiB free; need 25600; done.
[2008-08-24 00:27:43 xend 5252] DEBUG (XendCheckpoint:89) [xc_save]: /usr/lib64/xen/bin/xc_save 22 9 0 0 1
[2008-08-24 00:27:43 xend 5252] INFO (XendCheckpoint:351) ERROR Internal error: Couldn't enable shadow mode
[2008-08-24 00:27:43 xend 5252] INFO (XendCheckpoint:351) Save exit rc=1
[2008-08-24 00:27:43 xend 5252] ERROR (XendCheckpoint:133) Save failed on domain nodea (9).
Traceback (most recent call last):
  File "/usr/lib64/python2.4/site-packages/xen/xend/XendCheckpoint.py", line 110, in save
    forkHelper(cmd, fd, saveInputHandler, False)
  File "/usr/lib64/python2.4/site-packages/xen/xend/XendCheckpoint.py", line 339, in forkHelper
    raise XendError("%s failed" % string.join(cmd))
XendError: /usr/lib64/xen/bin/xc_save 22 9 0 0 1 failed
[2008-08-24 00:27:43 xend.XendDomainInfo 5252] DEBUG (XendDomainInfo:1601) XendDomainInfo.resumeDomain(9)
[2008-08-24 00:27:43 xend.XendDomainInfo 5252] INFO (XendDomainInfo:1722) Dev 51712 still active, looping...
[2008-08-24 00:27:44 xend.XendDomainInfo 5252] INFO (XendDomainInfo:1722) Dev 51712 still active, looping...
[2008-08-24 00:27:44 xend.XendDomainInfo 5252] INFO (XendDomainInfo:1722) Dev 51712 still active, looping...
[2008-08-24 00:27:44 xend.XendDomainInfo 5252] INFO (XendDomainInfo:1722) Dev 51712 still active, looping...
[2008-08-24 00:27:44 xend.XendDomainInfo 5252] INFO (XendDomainInfo:1722) Dev 51712 still active, looping...
[2008-08-24 00:27:44 xend.XendDomainInfo 5252] INFO (XendDomainInfo:1722) Dev 51712 still active, looping...
[2008-08-24 00:27:44 xend.XendDomainInfo 5252] INFO (XendDomainInfo:1722) Dev 51712 still active, looping...
[2008-08-24 00:27:44 xend.XendDomainInfo 5252] INFO (XendDomainInfo:1722) Dev 51712 still active, looping...
[2008-08-24 00:27:44 xend.XendDomainInfo 5252] DEBUG (XendDomainInfo:1614) XendDomainInfo.resumeDomain: devices released
[2008-08-24 00:27:44 xend.XendDomainInfo 5252] DEBUG (XendDomainInfo:791) Storing domain details: {'console/ring-ref': '2057005', 'console/port': '2', 'name': 'migrating-nodea', 'console/limit': '1048576', 'vm': '/vm/b845f914-33a3-e1cf-551e-01b6d346b92b', 'domid': '9', 'cpu/0/availability': 'online', 'memory/target': '6144000', 'store/ring-ref': '2049294', 'store/port': '1'}
[2008-08-24 00:27:44 xend 5252] DEBUG (DevController:110) DevController: writing {'backend-id': '0', 'mac': '00:16:3e:6c:ae:9f', 'handle': '0', 'state': '1', 'backend': '/local/domain/0/backend/vif/9/0'} to /local/domain/9/device/vif/0.
[2008-08-24 00:27:44 xend 5252] DEBUG (DevController:112) DevController: writing {'bridge': 'br102', 'domain': 'migrating-nodea', 'handle': '0', 'script': '/etc/xen/scripts/vif-bridge', 'state': '1', 'frontend': '/local/domain/9/device/vif/0', 'mac': '00:16:3e:6c:ae:9f', 'online': '1', 'frontend-id': '9'} to /local/domain/0/backend/vif/9/0.
[2008-08-24 00:27:44 xend 5252] DEBUG (blkif:24) exception looking up device number for xvda: [Errno 2] No such file or directory: '/dev/xvda'
[2008-08-24 00:27:44 xend 5252] DEBUG (DevController:110) DevController: writing {'backend-id': '0', 'virtual-device': '51712', 'device-type': 'disk', 'state': '1', 'backend': '/local/domain/0/backend/vbd/9/51712'} to /local/domain/9/device/vbd/51712.
[2008-08-24 00:27:44 xend 5252] DEBUG (DevController:112) DevController: writing {'domain': 'migrating-nodea', 'frontend': '/local/domain/9/device/vbd/51712', 'format': 'raw', 'dev': 'xvda', 'state': '1', 'params': '/dev/int_vg/os_nodea', 'mode': 'w', 'online': '1', 'frontend-id': '9', 'type': 'phy'} to /local/domain/0/backend/vbd/9/51712.
[2008-08-24 00:27:44 xend.XendDomainInfo 5252] DEBUG (XendDomainInfo:1626) XendDomainInfo.resumeDomain: devices created
[2008-08-24 00:27:44 xend.XendDomainInfo 5252] ERROR (XendDomainInfo:1631) XendDomainInfo.resume: xc.domain_resume failed on domain 9.
Traceback (most recent call last):
  File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 1628, in resumeDomain
    xc.domain_resume(self.domid, fast)
Error: (1, 'Internal error', "Couldn't map start_info")
[2008-08-24 00:27:44 xend 5252] DEBUG (XendCheckpoint:136) XendCheckpoint.save: resumeDomain
[2008-08-24 00:27:44 xend.XendDomainInfo 5252] INFO (XendDomainInfo:1722) Dev 51712 still active, looping...
[2008-08-24 00:27:44 xend.XendDomainInfo 5252] INFO (XendDomainInfo:1722) Dev 51712 still active, looping...
[2008-08-24 00:27:44 xend.XendDomainInfo 5252] INFO (XendDomainInfo:1722) Dev 51712 still active, looping...
[2008-08-24 00:27:45 xend.XendDomainInfo 5252] INFO (XendDomainInfo:1722) Dev 51712 still active, looping...
-------many repeats-------
[2008-08-24 00:28:14 xend.XendDomainInfo 5252] INFO (XendDomainInfo:1728) Dev still active but hit max loop timeout
_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users