Hello,
I have been struggling to move our infrastructure over to Xen VMs. We
were initially using Ubuntu packages for both dom0 and our domUs, but
experienced extreme instability, so we moved to CentOS, which has been
much more reliable for dom0. Since we already had a bunch of Ubuntu VMs,
we left them on the Ubuntu 2.6.24-19-xen kernel, but this has turned out
to be a mistake: we get frequent kernel oopses during heavy disk I/O.
The only change we made to the original kernel config was adding
NFS-root support. All of our domUs mount their root file systems over
NFS.
My problem is that I upgraded the domU kernels to the latest kernel.org
stable release (2.6.26.5) and managed to get it working after some
initial trouble (TCP checksum offloading was breaking NFS). However,
domUs running the new kernel will no longer live migrate. When I execute
the live migrate command:
# xm migrate --live testvm 192.168.1.20
Migration hangs forever. The VM is renamed to "migrate-testvm" and keeps
running normally on the source machine, while on the destination machine
it appears as "testvm" with state "-br---" and 0 CPU time. I left
tcpdump running on the destination machine and captured an 84MB pcap
file, which looked pretty normal up until all traffic just completely
stopped. If I change only the "kernel=" line in the config script to
point to the Ubuntu kernel, migration works again.
Here's my VM configuration:
-------------------
name = 'testvm'
kernel = '/xen_vm/global/kernels/vmlinuz-2.6.26.5'
ramdisk = '/xen_vm/global/kernels/initrd.img-xen-latest'
memory = '256'
disk = ['tap:aio:/xen_vm/global/swaps/testvm.img,xvda1,w']
vif = [
'mac=00:16:3e:5b:8d:5d,bridge=xenbr0',
'mac=00:16:3e:99:9b:e7,bridge=xenbr1'
]
on_poweroff = 'destroy'
on_reboot = 'restart'
on_crash = 'restart'
extra = '2 console=hvc0 root=/dev/nfs ip=:192.168.1.12::::eth1:'
nfs_server = '192.168.1.12'
nfs_root = '/xen_vm/testvm'
-------------------
xend.log on source:
-------------------
[2008-09-18 15:51:11 xend 3751] DEBUG (balloon:127) Balloon: 786956 KiB
free; need 2048; done.
[2008-09-18 15:51:11 xend 3751] DEBUG (XendCheckpoint:89) [xc_save]:
/usr/lib/xen/bin/xc_save 33 38 0 0 1
-------------------
xend.log on destination:
-------------------
...
[2008-09-18 15:51:11 xend.XendDomainInfo 3331] DEBUG
(XendDomainInfo:1350) XendDomainInfo.construct: None
[2008-09-18 15:51:11 xend 3331] DEBUG (balloon:127) Balloon: 262832 KiB
free; need 2048; done.
...
[2008-09-18 15:51:11 xend 3331] DEBUG (blkif:24) exception looking up
device number for xvda1: [Errno 2] No such file or directory: '/dev/xvda1'
[2008-09-18 15:51:11 xend 3331] DEBUG (DevController:110) DevController:
writing {'backend-id': '0', 'virtual-device': '51713', 'device-type':
'disk', 'state': '1', 'backend': '/local/domain/0/backend/tap/10/51713'}
to /local/domain/10/device/vbd/51713.
...
[2008-09-18 15:51:12 xend 3331] DEBUG (XendCheckpoint:198)
restore:shadow=0x0, _static_max=0x100, _static_min=0x100,
[2008-09-18 15:51:12 xend 3331] DEBUG (balloon:127) Balloon: 262832 KiB
free; need 262144; done.
[2008-09-18 15:51:12 xend 3331] DEBUG (XendCheckpoint:215) [xc_restore]:
/usr/lib/xen/bin/xc_restore 24 10 1 2 0 0 0
-------------------
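
When comparing the two logs, it can help to fold the wrapped lines back
into whole entries and pull out the xc_save / xc_restore invocations with
their timestamps, so it is easier to see where each side stops logging.
The following is only a minimal sketch and not part of xend: the function
names parse_xend_log and find_helper_calls are made up for illustration,
and the entry format is assumed from the excerpts above.

```python
import re

# Header of a xend.log entry, assumed from the excerpts in this post, e.g.
# [2008-09-18 15:51:11 xend 3751] DEBUG (balloon:127) Balloon: ...
ENTRY_RE = re.compile(
    r"^\[(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<logger>\S+) (?P<pid>\d+)\]\s*(?P<rest>.*)$"
)

# Excerpt from the source-side log, wrapped the same way as in the mail.
SAMPLE = """\
[2008-09-18 15:51:11 xend 3751] DEBUG (balloon:127) Balloon: 786956 KiB
free; need 2048; done.
[2008-09-18 15:51:11 xend 3751] DEBUG (XendCheckpoint:89) [xc_save]:
/usr/lib/xen/bin/xc_save 33 38 0 0 1
"""


def parse_xend_log(text):
    """Split log text into entries, folding wrapped lines into the entry above."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line == "...":
            continue  # skip blanks and the "..." elision markers
        match = ENTRY_RE.match(line)
        if match:
            entries.append({
                "time": match.group("time"),
                "logger": match.group("logger"),
                "pid": int(match.group("pid")),
                "message": match.group("rest"),
            })
        elif entries:
            # Continuation of a wrapped line: append to the previous entry.
            entries[-1]["message"] = (entries[-1]["message"] + " " + line).strip()
    return entries


def find_helper_calls(entries):
    """Return (time, command line) for each xc_save / xc_restore invocation."""
    calls = []
    for entry in entries:
        match = re.search(r"\[(xc_save|xc_restore)\]:\s*(.+)$", entry["message"])
        if match:
            calls.append((entry["time"], match.group(2)))
    return calls


if __name__ == "__main__":
    for when, command in find_helper_calls(parse_xend_log(SAMPLE)):
        print(when, command)
```

Running the same two functions over the full xend.log from each host
would list the helper invocations side by side; the last entry returned
by parse_xend_log shows where that host's log goes quiet.
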
Xen version: xen-3.0-x86_32p
dom0: 2.6.18-92.1.10.el5xen
Anybody know what would cause this, or have any suggestions for tracking
down the problem? I did find a post from someone who was seeing
identical behavior who claimed he fixed it by enabling CPU Hotplug
support, but I already have that enabled in the kernel.
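
One way to double-check that an option really is set in the config the
kernel was built from is a small script like the one below. This is only
a sketch under stated assumptions: the sample config text is hypothetical
(it is not my actual config), the helper names are made up, and the
option list is just an illustration (CONFIG_HOTPLUG_CPU is the option
from that post; CONFIG_XEN is the basic Xen guest option), not a
definitive list of what live migration needs.

```python
import re

# Illustrative list only; not an authoritative set of migration requirements.
WANTED = ["CONFIG_XEN", "CONFIG_HOTPLUG_CPU"]

# Hypothetical .config fragment, used here so the example is self-contained.
SAMPLE_CONFIG = """\
CONFIG_XEN=y
# CONFIG_HOTPLUG_CPU is not set
CONFIG_NFS_FS=y
CONFIG_ROOT_NFS=y
"""


def enabled_options(config_text):
    """Map each CONFIG_* name that is assigned a value to that value."""
    options = {}
    for line in config_text.splitlines():
        # "# CONFIG_FOO is not set" lines start with "#" and do not match.
        match = re.match(r"^(CONFIG_\w+)=(.*)$", line.strip())
        if match:
            options[match.group(1)] = match.group(2)
    return options


def missing_options(config_text, wanted):
    """Return the wanted options that are neither built in (y) nor modular (m)."""
    options = enabled_options(config_text)
    return [name for name in wanted if options.get(name) not in ("y", "m")]


if __name__ == "__main__":
    for name in missing_options(SAMPLE_CONFIG, WANTED):
        print("not enabled:", name)
```

To use it on a real build, read the text of the .config from the kernel
source tree (wherever that lives on your system) and pass it in place of
SAMPLE_CONFIG.
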
Thanks,
Trevor
_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users