Greetings:
I'm experiencing some really strange behavior with an OpenSuse 10.3 guest
running in Xen. Every 48-72 hours, the machine starts running at a very
high load average and dumping tons of messages into the message log, finally
becoming completely inaccessible. Once the guest is unusable, "xm top" on
the host shows 399% CPU utilization and constant NET and VBD activity, but
the host cannot even "xm shutdown" the guest - I have to "xm destroy" it to
make it stop.
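For clarity, here is the sequence I end up running on the host each time
(domain name as in the config further down):

xm top                # guest pegged at 399% CPU, constant NET/VBD activity
xm shutdown guestc    # does nothing; the guest never actually shuts down
xm destroy guestc     # the only thing that stops it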
The host machine is a Dell PowerEdge 2950 III server running OpenSuse 11.1,
64-bit, kernel 2.6.27.45-0.1-xen, with Xen package xen-3.3.1_18546_24-0.4.13.
It has 20GB of RAM, a quad-core 2GHz Intel CPU, and a Dell Perc5 RAID. It
runs other guest machines with no problem.
The guest machine is running OpenSuse 10.3, kernel 2.6.22.19-0.4-xenpae, in
32 bit mode, with Xen package xen-3.1.0_15042-51.3.
The guest machine is a clone of a running physical machine that I'm trying
to virtualize. I created the disk image, attached it, and so forth, on the
Xen host, then rsynced the 10.3 physical machine's filesystems onto the
guest's disk on the 11.1 host (a rough sketch of the procedure follows
below). I removed and reinstalled the Xen kernel package as suggested on
the net, and, against even my own predictions, got the guest to boot. And
it works great... for a few days or so.
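Roughly, from memory, the clone went like this (the image size, mount
point, package version, and rsync flags here are placeholders and
approximations, not the exact commands I ran):

# on the 11.1 host: create a sparse disk image for the guest
dd if=/dev/zero of=/a/disks/guestc/disk0 bs=1M count=0 seek=<size-in-MB>
# ... attached it, partitioned, made filesystems, mounted the root at /mnt/guestc ...
# then copied the running 10.3 machine's filesystems across:
rsync -aHx --numeric-ids physical-host:/ /mnt/guestc/
# and in a chroot on the new disk, removed and reinstalled the xenpae kernel:
rpm -e kernel-xenpae ; rpm -ihv kernel-xenpae-<version>.rpm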
But then the guest starts to go crazy. Rapidly repeating messages like
these start to appear in /var/log/messages:
Nov 20 15:35:55 guestc kernel: b_state=0x00000029, b_size=4096
Nov 20 15:35:55 guestc kernel: device blocksize: 4096
Nov 20 15:35:55 guestc kernel: __find_get_block_slow() failed. block=210137505, b_blocknr=20676879
Occasionally these messages show up garbled, like this:
Nov 20 15:35:55 guestc kernel: __find_get_block_slow() failed.
block=21_f__f__f__
f___f__f_f_e_f_f____f_f_f_f_____f___f_f_____f__f__f_f___f__f__f_f__f__f____f__f_
f_f___f_f__f_____f__f__f__f__f_f_____f_f_f____f______f__f__f__f____f__f____f__f_
f__f___f__f___f__f__f__f_f_f__f__f____f__f____f__f___f___f__f_f___f__f__f_f_f__f
_f___f___f__f__f__f_f___f___f__f__f___f__f_e_f__f_f__f__f__f______f__f______f__f
__f__f_f___f_f___f_f_____f__f_f__f___f__f_f____f_f__f__f_f___f__f___f__f__f_f___
f__f_____f__f__f__f___state_f__f___f_f___f______f_fe___f___f_____f___f____f_____
f__f__f_f__f__f___f__f__f_____f______f__f____f_f___f_f_f____f___f__f___f____f__f
__f____f__f_____f___f_f_____f__f_____f__f__f_f_f________f___f___f_f__f__f__f__f_
f_f_____f_f_f__e_f__f___f__f__f__f_f_f___f___f___f__f__f__state=0x000000__f__f_s
tate=0x00000029, b_size=4096
And then, of course, I can't get into the guest at all, via network or
"xm console". "xm shutdown" does nothing, and I must "xm destroy" the guest.
After re-creating the guest, everything runs fine again, until another few
days have passed.
Today I was actually in the guest when this happened. An rsync was running,
and that process was pegged: the guest showed a load average of 5.0
internally, while "xm top" showed a usage of 199% (2 of the 4 CPUs?).
I couldn't kill the rsync process, and the messages above were flooding
the syslog. The guest could not shut all the way down even with "init 0",
and, eventually, I had to destroy it again.
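From inside the guest, the picture was roughly this (the PID shown is a
placeholder):

uptime                # load average 5.0 and climbing
kill -9 <rsync-pid>   # no effect; the process would not die
init 0                # shutdown hangs partway and never completes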
Here is the machine config:
name="guestc"
uuid="91919191-3676-3f68-bada-993e5adb1088"
memory=8192
maxmem=8192
vcpus=4
on_poweroff="destroy"
on_reboot="restart"
on_crash="destroy"
localtime=0
keymap="en-us"
builder="linux"
bootloader="/usr/lib/xen/boot/domUloader.py"
bootargs="--entry=xvda2:/boot/vmlinuz-xenpae,/boot/initrd-xenpae"
extra=" "
disk=[ 'file:/a/disks/guestc/disk0,xvda,w', 'phy:sdc1,sdc1,w', ]
vif=[ 'mac=00:16:3e:52:f9:96,bridge=br0', ]
vfb=['type=vnc,vncunused=1']
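For completeness, I start (and, after each crash, re-create) the guest with
something like the following; the config file path is from memory:

xm create /etc/xen/vm/guestc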
Now, I get that I'm doing some unorthodox things here: cloning a physical
machine into a virtual machine, running 10.3 as a guest under an 11.1 host,
and running a 32-bit guest on a 64-bit host. But the thing DOES run, and I
feel like I'm SO CLOSE to making this work, so I'm really hopeful that
someone can recognize these symptoms and help me find a solution, rather
than just pointing out the obvious edge-case aspects of this situation.
Any ideas or guidance would be greatly appreciated!
Thank you!
Glen
Glen Barney