
Re: [Xen-devel] xenbus and the message of doom



On Fri, 2011-12-16 at 11:33 +0000, Olaf Hering wrote:
> On Thu, Dec 15, Konrad Rzeszutek Wilk wrote:
> 
> > On Thu, Dec 15, 2011 at 08:20:23PM +0100, Stefan Bader wrote:
> > > I was investigating a bug report[1] about newer kernels (>3.1) not booting as
> > > HVM guests on Amazon EC2. For some reason git bisect did give me some pain, but
> > > it led me at least close and with some crash dump data I think I figured out the
> > > problem.
> > 
> > Stefan, thanks for finding this.
> > 
> > Olaf, what are your thoughts? Should I prep a patch to revert the patch
> > below and then we can work on 3.3 and rethink this in 3.3? The clock is
> > ticking for 3.2 and there is not much runway to fix stuff.
> 
> Sometimes guest changes expose bugs in the host. It's my understanding
> that hosts should be kept up to date so that they can serve both old and
> new guests well.

In an ideal world, yes, but we need to balance this against breaking stuff
which is still widely used. It seems like in this case we may have
gotten the balance wrong because people are reporting bugs.

What's wrong with only doing this reset if we know we are kexec'd? If
that can't be automatically detected, then we could e.g. use an explicit
reset_watches command line option. You could even make a tenuous
argument for hanging this off reset_devices?
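
To make the suggestion concrete, here is a rough userspace sketch of the
decision logic only, not kernel code and not an actual patch. It assumes
kexec detection is supplied by the caller as a boolean (how to detect it is
the open question), treats reset_watches as the hypothetical option named
above, and makes the reset_devices fallback opt-in. The function names are
made up for illustration.

```python
def cmdline_flags(cmdline):
    """Return the option names on a kernel command line (text before any '=')."""
    return {token.split("=", 1)[0] for token in cmdline.split()}


def should_reset_watches(cmdline, kexec_detected=False, honor_reset_devices=False):
    """Decide whether the guest should ask xenstored to reset its watches.

    kexec_detected: assumed to come from some detection mechanism that this
    sketch does not implement.
    honor_reset_devices: opt-in fallback to the existing reset_devices option.
    """
    flags = cmdline_flags(cmdline)
    if kexec_detected:
        return True
    # Hypothetical explicit opt-in proposed in this thread.
    if "reset_watches" in flags:
        return True
    return honor_reset_devices and "reset_devices" in flags
```

With this shape a normal boot never sends the reset, so hosts that mishandle
it are only affected when the admin (or the kexec path) asks for it.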

[...]
> Perhaps we should figure out what exactly EC2 is using as host and why
> it only breaks with upstream kernels.

And in the meantime we leave upstream (and any distros which pick up a
new enough kernel) broken on EC2? I think at this stage in the rc cycle
we'd be better off reverting and trying again for 3.3.

> So far I haven't received reports
> for SLES11 guests. SP1 got an update recently, so their HVM guests would
> have seen the hang as well. The not yet released SP2 has also been
> sending XS_RESET_WATCHES for quite some time.

This thread contains 3 reports of actual regressions. Stefan explicitly
reported an EC2 bug and a reproduction on 3.4.3 and Alessandro by
implication has reported breakage in his environment.

Ian.


> 
> 
> Olaf
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel


