xen-users

Re: [Xen-users] Cannot start domains after FC6->F7 upgrade

To: "Daniel P. Berrange" <berrange@xxxxxxxxxx>
Subject: Re: [Xen-users] Cannot start domains after FC6->F7 upgrade
From: Gerry Reno <greno@xxxxxxxxxxx>
Date: Thu, 28 Jun 2007 21:57:34 -0400
Cc: Mark Williamson <mark.williamson@xxxxxxxxxxxx>, xen-users@xxxxxxxxxxxxxxxxxxx, Nico Kadel-Garcia <nkadel@xxxxxxxxx>
Delivery-date: Thu, 28 Jun 2007 18:55:59 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
In-reply-to: <20070629012650.GB13857@xxxxxxxxxx>
List-help: <mailto:xen-users-request@lists.xensource.com?subject=help>
List-id: Xen user discussion <xen-users.lists.xensource.com>
List-post: <mailto:xen-users@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-users>, <mailto:xen-users-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-users>, <mailto:xen-users-request@lists.xensource.com?subject=unsubscribe>
References: <46841233.2050000@xxxxxxxxxxx> <46843656.1070200@xxxxxxxxx> <468438B3.1060809@xxxxxxxxxxx> <200706290130.02958.mark.williamson@xxxxxxxxxxxx> <20070629012650.GB13857@xxxxxxxxxx>
Sender: xen-users-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Thunderbird 1.5.0.12 (X11/20070530)
Daniel P. Berrange wrote:
On Fri, Jun 29, 2007 at 01:30:01AM +0100, Mark Williamson wrote:
  
Did you reboot with the new Xen kernel? Or did you pull your kernel or
assemble it from elsewhere?
        
Yes.  Standard xen kernel.  But had to use the FC6 xen kernel because
the F7 xen kernel crashes on our hardware.

      
My local mirror does not have FC7 yet, so I haven't been able to play
with it.
        
I've opened a Fedora bug:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=246169
      
Chiming in here; I don't have a RH bugzilla account.

The matching rules for kernel / tools / xen versions are a bit complex.  The 
userspace tools and Xen itself must *always* be matched.

Since around 3.0.4 (I think?) it's been possible to mix and match dom0 kernel 
versions.  i.e. any dom0 kernel since 3.0.4 should run on any Xen from 3.0.4.  
Before that release the dom0 and Xen interfaces were tied together and you 
couldn't really mix and match /at all/.  Although you couldn't strictly mix 
and match, you could sometimes get dom0 to boot on Xen, but things didn't 
work properly - this may be what you're seeing, I guess.
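
The matching rule described above can be sketched as a small check. This is only an illustration under stated assumptions: the function and constant names are hypothetical, the 3.0.4 cut-off is the approximate figure given above ("around 3.0.4 (I think?)") rather than a confirmed boundary, and the version strings in the usage lines are made-up examples, not values read from any system.

```python
def parse_version(text):
    """Turn a dotted version string such as '3.0.4' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


# Assumption: 3.0.4 is the first release whose dom0 interface can be mixed.
# The message above only says "around 3.0.4 (I think?)".
MIX_AND_MATCH_SINCE = parse_version("3.0.4")


def is_supported_combination(xen, tools, dom0_interface):
    """Apply the rule: tools must equal Xen; dom0 may differ only from 3.0.4 on.

    xen            -- version of the hypervisor
    tools          -- version of the userspace tools
    dom0_interface -- Xen version the dom0 kernel was built against
    """
    xen_v = parse_version(xen)
    tools_v = parse_version(tools)
    dom0_v = parse_version(dom0_interface)

    # Userspace tools and Xen itself must always be matched.
    if tools_v != xen_v:
        return False
    # An exactly matched dom0 kernel is fine on any release.
    if dom0_v == xen_v:
        return True
    # Otherwise both sides must be at or after the assumed cut-off.
    return xen_v >= MIX_AND_MATCH_SINCE and dom0_v >= MIX_AND_MATCH_SINCE


# Illustrative version numbers only.
print(is_supported_combination("3.1.0", "3.1.0", "3.0.4"))  # newer Xen, older dom0
print(is_supported_combination("3.0.3", "3.1.0", "3.0.3"))  # tools newer than Xen
```

The second call models tools from one release running against a hypervisor from another, which the rule rejects regardless of the dom0 kernel.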
    

Since we distribute the hypervisor as part of kernel-xen, this is kinda
academic. For Fedora we guarantee that we will never update the Xen 
hypervisor version within the course of a major release, so as a general
rule users don't have to worry about incompatibilities between 'xen' and
'kernel-xen' if applying updates to the distro.  The problem in this case 
was that there was a mix of packages from two different distros - F7 and 
FC6 between xen & kernel-xen RPMs which is not guaranteed to work because
of lack of a stable ABI in Xen. 
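
One rough way to spot the kind of mix described above is to group package names by the distribution tag in their release string. The sketch below assumes the packages carry a tag of the form ".fcN" (as in ".fc6" or ".fc7"); the helper names and the two package strings are hypothetical examples, and in practice the list would come from something like the output of `rpm -q xen kernel-xen`.

```python
import re

# Assumption: the release field contains a dist tag such as ".fc6" or ".fc7".
DIST_TAG = re.compile(r"\.(fc\d+)")


def dist_tag(package):
    """Return the dist tag found in a package name-version-release string, or None."""
    match = DIST_TAG.search(package)
    return match.group(1) if match else None


def mixed_dist_tags(packages):
    """Group packages by dist tag; return the groups only if more than one tag is seen."""
    groups = {}
    for package in packages:
        groups.setdefault(dist_tag(package), []).append(package)
    return groups if len(groups) > 1 else {}


# Hypothetical package strings, for illustration only.
installed = ["xen-3.1.0-2.fc7", "kernel-xen-2.6.20-1.2962.fc6"]
for tag, names in mixed_dist_tags(installed).items():
    print(tag, names)
```

An empty result means every package carried the same tag; a non-empty one lists which packages belong to which release.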

This whole issue could be avoided if just a tiny bit more care were taken
in Xen hypervisor development to maintain a back-compat ABI.  You don't 
expect your GLibC to break when you update to a new kernel - but that's 
exactly the situation we're in with Xen where updating a HV breaks your
userspace :-(

Dan.
  
Yes, the situation for us is that I upgraded our servers by downloading the fedora-release* RPMs, installing them, and then doing a 'yum -y upgrade'.  And guess what?  It worked great, except that I was not aware that the new libata drivers did not properly support our old HighPoint ATA controllers.  So at first boot: instant crash.  I played around with this for a while and then started opening bugs on the F7 kernels.

I had already had some problems getting the 2.6.20 series kernels to boot on the HighPoint controllers, but that was with the old IDE drivers: some parameters in the kernel had changed, which changed the tolerance with regard to our bus timing, so we were getting an 'unknown bus timing' error there.  I worked with Sergei and Chuck and we were able to solve that issue on FC6 with a BIOS tweak on our hardware.  Actually, overclocking the bus did the trick.

But with F7 (which I was expecting to just fix all of this), things got even worse.  The new libata drivers failed badly with regard to the HighPoint controllers, so the only option was to boot F7 using the old FC6 kernels.  This worked, or so I thought, until I began restoring the system to operational status by bringing up all the Xen guests.  Kaboom!  No way could I get any of the domains to start.

So now we are really stuck.  I'm assuming that Alan and Sergei are probably trying to get libata fixed, but Alan didn't seem too optimistic that this would happen soon.  I really do not want to try a bare-metal restore of the server back to FC6, so I'm trying to figure out what other options might be in the picture.  I'm thinking along the lines of seeing whether VMware could run the Xen images.  I already tried some things using QEMU, but the networking is way too slow, something like 5x to 10x slower than Xen.  So maybe I need to go the other way and look at OpenVZ, and just toss some processes into separate VEs until things get straightened out with libata.

Gerry
