xen-users
RE: [Xen-users] xen save error
BTW, back when I was using printk's in the user kernel to determine if it was
getting the message to suspend or not, I found it really odd that I could
remove the line in the kernel that responds to the receipt of the suspend
command (keeping the one that says "I have suspended") and it actually works
better - usually going 12 to 20 migrations between failures. I was somewhat
surprised that removing the response that the message was received would work.
I would have figured something would have been waiting on the receipt of that
message. This makes me wonder if there is some timing issue going on where the
kernel is told to suspend and then the domain stops getting CPU time before it
is able to complete suspending.
-- Ray
-----Original Message-----
From: Cole, Ray
Sent: Thursday, September 15, 2005 1:29 PM
To: 'Ian Pratt'; bryant.johan@xxxxxxxxx
Cc: xen-users@xxxxxxxxxxxxxxxxxxx; ian.pratt@xxxxxxxxxxxx
Subject: RE: [Xen-users] xen save error
Sure.
I've got 2 machines where I've installed 2.0-testing from
xen-2.0-testing-install.tar.gz. I downloaded it this morning to make sure I
had the latest. Ran install.sh.
I made sure grub is pointing to xen-2.0.gz, which is in turn a symbolic link to
xen-2.0-testing.gz. I did a depmod for 2.6.12-xen0 and xenU and created
initrd's for both. Also modified grub to use the 2.6.12-xen0 kernel with it
and rebooted. uname -a confirmed I'm using 2.6.12-xen0. Domain 0 was
originally a RedHat AS 4.0 installation ('minimal installation' selected).
I then copied a Fedora Core 4 installation image to an NFS mount location.
Also created a swap file (have tried with and without) on the NFS link. I
created the .cfg file for the domain - nothing special about it. /Domains/t is
where the NFS mount is made. The .cfg is later in this email.
The FC4 image has had /lib/tls renamed to tls.disabled, although I still get a
warning when booting the user domain that I've got /lib/tls. I don't know,
maybe the initrd has it.
Anyway... from there I started xend/xfrd. I start up the FC4 domain using
2.6.12-xenU. This domain uses autofs extensively (all /home entries are
automounted) and NIS. I log in to the user domain using a remote xterm (ssh
into the domain, start xterm). I then start 'top' so I can see that the domain
is still alive.
I then do:
xm migrate --live rayfed4 {new_machine}
back and forth between the two machines that have identical Xen 2.0 Testing
installations. I can generally go back and forth about 4 or 5 times before one
of the migrate commands tells me it had an error (can't suspend). I had at one
time put printk's into the user kernel (after downloading the 2.0 testing
source, of course..) and confirmed that the kernel receives the message to
suspend, but the suspend work the kernel schedules never gets executed. I wait
about 10 seconds between migration attempts.
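
The back-and-forth test described above could be scripted roughly as follows. This is only a sketch under assumptions, not the procedure that was actually used: the host names hostA and hostB are placeholders, the helper names are made up for illustration, and it assumes that xm exits with a non-zero status when a migration fails and that each command can be issued on the machine currently hosting the domain via ssh. The only thing taken from the message is the `xm migrate --live <domain> <host>` form and the roughly 10 second pause.

```python
import subprocess
import time


def migration_steps(domain, hosts, rounds):
    """Yield (source_host, argv) pairs that bounce the domain between hosts.

    The domain is assumed to start on hosts[0]; each step must be issued on
    the host that currently runs the domain.
    """
    for i in range(rounds):
        source = hosts[i % len(hosts)]
        target = hosts[(i + 1) % len(hosts)]
        yield source, ["xm", "migrate", "--live", domain, target]


def run_via_ssh(source, argv):
    """Hypothetical runner: execute argv on the source host over ssh.

    Assumes password-less ssh to each domain 0; returns the exit status.
    """
    return subprocess.call(["ssh", source] + argv)


def ping_pong(domain, hosts, rounds, run_on=run_via_ssh, pause=10,
              sleep=time.sleep):
    """Migrate repeatedly; return how many migrations succeeded before the
    first failure (or rounds if none failed)."""
    for done, (source, argv) in enumerate(
            migration_steps(domain, hosts, rounds)):
        if run_on(source, argv) != 0:
            return done
        sleep(pause)  # wait between attempts, as in the manual test
    return rounds
```

Calling `ping_pong("rayfed4", ["hostA", "hostB"], 50)` would then report the number of successful migrations before the first error, which makes the failure rate easier to compare between kernel builds.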
Below is my .cfg:
# -*- mode: python; -*-
#============================================================================
# Python configuration setup for 'xm create'.
# This script sets the parameters used when a domain is created using
# 'xm create'.
# You use a separate script for each domain you want to create, or
# you can set the parameters for the domain on the xm command line.
#============================================================================
#----------------------------------------------------------------------------
# Kernel image file.
kernel = "/boot/vmlinuz-2.6.12-xenU"
# Optional ramdisk.
ramdisk = "/boot/initrd-2.6.12-xenU.img"
# The domain build function. Default is 'linux'.
#builder='linux'
# Initial memory allocation (in megabytes) for the new domain.
memory = 192
# A name for your domain. All domains must have different names.
name = "rayfed4"
# Which CPU to start domain on?
#cpu = -1 # leave to Xen to pick
#----------------------------------------------------------------------------
# Define network interfaces.
# Number of network interfaces. Default is 1.
#nics=1
# Optionally define mac and/or bridge for the network interfaces.
# Random MACs are assigned if not given.
#vif = [ 'mac=aa:00:00:00:00:11, bridge=xen-br0' ]
vif = [ 'mac=52:54:00:12:34:56' ]
#----------------------------------------------------------------------------
# Define the disk devices you want the domain to have access to, and
# what you want them accessible as.
# Each disk entry is of the form phy:UNAME,DEV,MODE
# where UNAME is the device, DEV is the device name the domain will see,
# and MODE is r for read-only, w for read-write.
#disk = [ 'file:/dev/md2,md2,w' ]
#disk = [ 'file:/dev/md3,sda1,w', 'file:/dev/md4,sda2,w' ]
disk = [ 'file:/Domains/t/Fed4.img,sda1,w',
'file:/Domains/t/Fed4Swap.img,sda2,w' ]
# Set root device.
root = "/dev/sda1 ro"
#nfs_root = '/full/path/to/root/directory'
# Sets runlevel 3.
extra = "3"
#----------------------------------------------------------------------------
# Set according to whether you want the domain restarted when it exits.
# The default is 'onreboot', which restarts the domain when it shuts down
# with exit code reboot.
# Other values are 'always', and 'never'.
#restart = 'onreboot'
#============================================================================
-----Original Message-----
From: Ian Pratt [mailto:m+Ian.Pratt@xxxxxxxxxxxx]
Sent: Thursday, September 15, 2005 1:04 PM
To: Cole, Ray; bryant.johan@xxxxxxxxx
Cc: xen-users@xxxxxxxxxxxxxxxxxxx; ian.pratt@xxxxxxxxxxxx
Subject: RE: [Xen-users] xen save error
> I can get xen-2.0-testing to fail on live migrations with
> virtually no load about 10% of the time with live migration
> :-) Seems it becomes unable to suspend the user domain
> kernel - kernel gets the message, but never gets a chance to
> process it. I'm not saying 2.0-testing won't resolve the
> problem John is seeing, but I'm not sure I would quite make
> the statement that it has been 'battle tested' :-)
Can you say more about your configuration? I haven't heard of migrate
problems on 2.0-testing. Almost all the development effort is focussed
on 3.0, but if it's a reproducible problem someone might take a look.
Migration on 2.0-testing has been tested pretty thoroughly, so it must
be something to do with your configuration or other xm operations you've
done on the domain since you started it.
Ian
_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users