This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


Re: [Xen-devel] Re: blocking Xen 3.X production use: soft lockup bugs

To: Ian Pratt <m+Ian.Pratt@xxxxxxxxxxxx>
Subject: Re: [Xen-devel] Re: blocking Xen 3.X production use: soft lockup bugs
From: Steve Traugott <stevegt@xxxxxxxxxxxxx>
Date: Wed, 2 Aug 2006 17:42:19 -0700
Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Wed, 02 Aug 2006 17:42:51 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
In-reply-to: <A95E2296287EAD4EB592B5DEEFCE0E9D572307@xxxxxxxxxxxxxxxxxxxxxxxxxxx>; from m+Ian.Pratt@xxxxxxxxxxxx on Thu, Aug 03, 2006 at 12:36:35AM +0100
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <A95E2296287EAD4EB592B5DEEFCE0E9D572307@xxxxxxxxxxxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mutt/1.2.5i
On Thu, Aug 03, 2006 at 12:36:35AM +0100, Ian Pratt wrote:
> Using 'xm list', is the guest burning CPU? 

I was watching for that, haven't spotted any significant CPU usage
yet; seems to be hung rather than spinning.

> What about dom0?

That I haven't been watching for.  ;-)  Will do.
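
A minimal sketch of how the check could be scripted for both the guest and dom0: sample 'xm list' twice and compare the cumulative Time(s) column. It assumes the usual Name / ID / Mem(MiB) / VCPUs / State / Time(s) column layout; the function names and the sample figures are made up for illustration, not taken from the affected hosts.

```python
def parse_xm_list(text):
    """Map domain name -> cumulative CPU seconds from `xm list` output.

    Assumes the first line is the header row and that Time(s) is the
    last whitespace-separated column on every domain line.
    """
    times = {}
    for line in text.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 6:
            continue  # not a domain line in the assumed layout
        times[fields[0]] = float(fields[-1])
    return times


def cpu_delta(before, after):
    """CPU seconds consumed between two samples, per domain."""
    return {name: after[name] - before[name]
            for name in after if name in before}


# Hypothetical samples, taken some seconds apart (illustrative values only).
SAMPLE_T0 = """\
Name                              ID Mem(MiB) VCPUs State  Time(s)
Domain-0                           0      256     1 r-----   1200.0
guest1                             3      128     1 -b----     85.5
"""

SAMPLE_T1 = """\
Name                              ID Mem(MiB) VCPUs State  Time(s)
Domain-0                           0      256     1 r-----   1212.5
guest1                             3      128     1 -b----     85.5
"""

deltas = cpu_delta(parse_xm_list(SAMPLE_T0), parse_xm_list(SAMPLE_T1))
for name, used in sorted(deltas.items()):
    print(f"{name}: {used:.1f}s CPU between samples")
```

A delta near zero for the guest is consistent with it being hung rather than spinning; a large delta for Domain-0 would suggest dom0 is the busy one. On a real host the two text samples would come from running 'xm list' twice with a pause in between.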

> The soft lockup messages appear to be benign in that the domain seems to
> be continuing quite happily after printing them -- it's quite possible
> that the system was sufficiently busy that the domain VCPU just didn't
> get scheduled for a while, triggering the warning message. Are you sure
> they're actually related to the more serious problem you're
> experiencing? 

I can't prove that the network-related soft lockups I'm seeing on the
x330's are the same soft lockups related to filesystem damage we saw
on the Netengines -- we stopped using Netengines for Xen 3 when we hit
that (they run Xen 2 fine).  Now that I know what to look for, I'll go
back and re-create the Xen 3 environment on the Netengines so I can
reproduce the problem there.

> Have you tried using -unstable and hence xen's new scheduler? This is
> less likely to provoke soft lockup false alarms.

Haven't tried unstable yet, since this is for the production
infrastructure for my family's business; am in the process of
rebuilding with testing changeset 9762 though.  (is that really tip?
hg log says Jun 29th for that changeset, even after a pull...)

Thanks again,


Stephen G. Traugott  (KG6HDQ)
UNIX/Linux Infrastructure Architect, TerraLuna LLC
http://www.stevegt.com -- http://Infrastructures.Org
