xen-devel

[Xen-devel] Re: PROBLEM: 3.0-rc kernels unbootable since -rc3

To: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
Subject: [Xen-devel] Re: PROBLEM: 3.0-rc kernels unbootable since -rc3
From: "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx>
Date: Tue, 12 Jul 2011 08:22:59 -0700
Cc: julie Sullivan <kernelmail.jms@xxxxxxxxx>, chengxu@xxxxxxxxxxxxxxxxxx, xen-devel@xxxxxxxxxxxxxxxxxxx, kulkarni.ravi4@xxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx
Delivery-date: Tue, 12 Jul 2011 15:08:31 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <20110712151550.GA3397@xxxxxxxxxxxxxxxxxx>
References: <20110710231449.GQ6014@xxxxxxxxxxxxxxxxxx> <20110711162450.GA22913@xxxxxxxxxxxx> <20110711171337.GK2245@xxxxxxxxxxxxxxxxxx> <20110711193021.GA2996@xxxxxxxxxxxx> <20110711201508.GN2245@xxxxxxxxxxxxxxxxxx> <20110711210954.GA15745@xxxxxxxxxxxx> <20110712105506.GB2253@xxxxxxxxxxxxxxxxxx> <20110712141228.GA7831@xxxxxxxxxxxx> <20110712144936.GD2326@xxxxxxxxxxxxxxxxxx> <20110712151550.GA3397@xxxxxxxxxxxxxxxxxx>
Reply-to: paulmck@xxxxxxxxxxxxxxxxxx
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mutt/1.5.20 (2009-06-14)
On Tue, Jul 12, 2011 at 08:15:50AM -0700, Paul E. McKenney wrote:
> On Tue, Jul 12, 2011 at 07:49:36AM -0700, Paul E. McKenney wrote:
> > On Tue, Jul 12, 2011 at 10:12:28AM -0400, Konrad Rzeszutek Wilk wrote:
> > > > >   [<c042d0f5>] task_waking_fair+0x14  <--
> > > > 
> > > > Hmmm...  This is a 32-bit system, isn't it?
> > > 
> > > Yes. I ran this little loop:
> > > 
> > > #!/bin/bash
> > > 
> > > ID=`xl list | grep Fedora | awk '  { print $2}'`
> > > 
> > > rm -f cpu*.log
> > > while true; do
> > >   xl pause $ID
> > >   /usr/lib64/xen/bin/xenctx -s /mnt/tmp/FC15-32/System.map-3.0.0-rc6-julie-tested-dirty -a $ID 0 >> cpu0.log
> > >   /usr/lib64/xen/bin/xenctx -s /mnt/tmp/FC15-32/System.map-3.0.0-rc6-julie-tested-dirty -a $ID 1 >> cpu1.log
> > >   /usr/lib64/xen/bin/xenctx -s /mnt/tmp/FC15-32/System.map-3.0.0-rc6-julie-tested-dirty -a $ID 2 >> cpu2.log
> > >   /usr/lib64/xen/bin/xenctx -s /mnt/tmp/FC15-32/System.map-3.0.0-rc6-julie-tested-dirty -a $ID 3 >> cpu3.log
> > >   xl unpause $ID
> > > done
> > > 
> > > to get an idea of what the CPU is doing before it hits task_waking_fair,
> > > and there isn't anything damning. Here are the logs:
> > > 
> > > http://darnok.org/xen/cpu1.log
> > 
> > OK, a fair amount of variety, then lots and lots of task_waking_fair(),
> > so I still feel good about asking you for the following.
> 
> But...  But...  But...
> 
> Just how accurate are these stack traces?  For example, do you have
> frame pointers enabled?  If not, could you please enable them?
> 
> The reason that I ask is that the wakeme_after_rcu() looks like it is
> being invoked from softirq, which would be grossly illegal and could
> cause any manner of misbehavior.  Did someone put a synchronize_rcu()
> into an RCU callback or something?  Or did I do something really really
> braindead inside the RCU implementation?
> 
> (I am looking into this last question, but would appreciate any and all
> help with the other questions!)
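
For anyone wanting to answer the frame-pointer question for their own
build, here is a minimal sketch. It assumes you have the config file the
kernel was built from; the function name has_frame_pointers is made up
for illustration, and only the CONFIG_FRAME_POINTER option itself comes
from the kernel.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the given kernel config file
# enables frame pointers (CONFIG_FRAME_POINTER=y on a line by itself).
has_frame_pointers() {
    local config=$1
    # -q: no output, exit status only; -x: match the whole line.
    grep -q -x 'CONFIG_FRAME_POINTER=y' "$config"
}
```

Something like "has_frame_pointers /boot/config-$(uname -r) && echo yes"
would then report the result, assuming the distro installs the config
under that path. For a guest, it is the guest kernel's config that
matters, not the dom0's.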

OK, I was confusing Julie's, Ravi's, and Konrad's situations.
The wakeme_after_rcu() is in fact OK to call from softirq -- if and
only if the scheduler is actually running.  This is what happens if
you do a synchronize_rcu() given your CONFIG_TREE_RCU setup -- an RCU
callback is posted that, when invoked, awakens the task that invoked
synchronize_rcu().

And, based on http://darnok.org/xen/log-rcu-stall, Konrad's system
appears to be well past the point where the scheduler is initialized.

So I am coming back around to the loop in task_waking_fair().

That said, the patch I sent out earlier might help, for example if early
invocation of RCU callbacks is somehow messing up the scheduler's
initialization.
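
To put a number on "lots and lots of task_waking_fair()" in the per-CPU
logs, something along these lines might do. This is only a sketch: the
function name count_symbol is made up, and it assumes each sampled frame
shows up in the log as "symbol+0xOFFSET", as in the trace line quoted
above. I have not checked that against every xenctx output format.

```shell
#!/bin/bash
# Hypothetical helper: for each log file, print the file name and the
# number of lines that mention the given symbol.
# Assumption: frames appear as "symbol+0xOFFSET" in the log.
count_symbol() {
    local sym=$1
    shift
    local f
    for f in "$@"; do
        # grep -c prints the count of matching lines (0 if none);
        # -F treats the pattern as a fixed string, not a regex.
        printf '%s %s\n' "$f" "$(grep -c -F -- "${sym}+0x" "$f")"
    done
}
```

Running "count_symbol task_waking_fair cpu*.log" in the directory
holding the logs would then give one count per CPU, which should make
it easier to see whether one CPU is stuck there or all of them are.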

                                                        Thanx, Paul
