xen-devel

RE: [Xen-devel] cpuidle causing Dom0 soft lockups

To: Jan Beulich <JBeulich@xxxxxxxxxx>
Subject: RE: [Xen-devel] cpuidle causing Dom0 soft lockups
From: "Yu, Ke" <ke.yu@xxxxxxxxx>
Date: Wed, 3 Feb 2010 22:46:54 +0800
Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, Keir Fraser <keir.fraser@xxxxxxxxxxxxx>
In-reply-to: <4B695ADB020000780002D70F@xxxxxxxxxxxxxxxxxx>
References: <4B58402E020000780002B3FE@xxxxxxxxxxxxxxxxxx> <C77DE51B.6F89%keir.fraser@xxxxxxxxxxxxx> <4B67E85E020000780002D1A0@xxxxxxxxxxxxxxxxxx> <8B81FACE836F9248894A7844CC0BA814250B6A12F0@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx> <4B695ADB020000780002D70F@xxxxxxxxxxxxxxxxxx>

Kevin has explained the details in response to your comments, so I will just
pick a few points for further explanation.

>
>I would not think that dealing with the xtime_lock scalability issue in
>timer_interrupt() should be *that* difficult. In particular it should be
>possible to assign an on-duty CPU (permanent or on a round-robin
>basis) that deals with updating jiffies/wallclock, and all other CPUs
>just update their local clocks. I had thought about this before, but
>never found a strong need to experiment with that.
>
>Jan

This is good. Eliminating a global lock is always good practice for
scalability, especially as there will be more and more CPUs in the future.
I would expect this to be the best solution to the soft lockup issue.

And if the global xtime_lock can be eliminated, the cpuidle patch may not
be needed anymore.
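
To illustrate the direction (just a rough sketch with made-up helper names,
not actual Linux code), the on-duty CPU could be the only one taking
xtime_lock:

    /* Sketch only: one on-duty CPU updates jiffies/wallclock under
     * xtime_lock; all other CPUs touch only their per-CPU clock.
     * update_local_clock() is a hypothetical helper. */
    static int on_duty_cpu;  /* fixed, or rotated round-robin */

    static irqreturn_t timer_interrupt(int irq, void *dev_id)
    {
        int cpu = smp_processor_id();

        update_local_clock(cpu);         /* per-CPU state, lock-free */

        if (cpu == on_duty_cpu) {
            write_seqlock(&xtime_lock);  /* only this CPU contends   */
            do_timer(1);                 /* jiffies/wallclock update */
            write_sequnlock(&xtime_lock);
        }
        return IRQ_HANDLED;
    }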

>>Could you please try the attached patch. This patch tries to avoid
>>entering deep C states when there is a vCPU that has local irqs disabled
>>and is polling an event channel. When tested on my 64-CPU box, this issue
>>is gone with this patch.
>
>We could try it, but I'm not convinced of the approach. Why is the
>urgent determination dependent upon event delivery being disabled
>on the respective vCPU? If at all, it should imo be polling *or* event
>delivery disabled, not *and*.

The rationale of this patch is: disabling vCPU local irqs usually means the
vCPU has an urgent task to finish ASAP and does not want to be interrupted.

As a first-step patch, I am being a bit conservative by combining them with
*and*. Once it is verified to work, I can extend this hint to *or*, as long
as the *or* does not include unwanted cases that hurt power saving
significantly.
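
For reference, the hint boils down to something like this (a sketch only;
the field and helper names are approximations, not the actual patch):

    /* A vCPU counts as "urgent" only when it has event delivery
     * (local irqs) disabled AND is polling an event channel. */
    static bool_t vcpu_is_urgent(const struct vcpu *v)
    {
        return vcpu_info(v, evtchn_upcall_mask) &&         /* irqs off */
               test_bit(v->vcpu_id, v->domain->poll_mask); /* polling  */
    }

    /* cpuidle side: if any vCPU on this pCPU is urgent, stay in a
     * shallow C state (e.g. C1) instead of entering deep C states. */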

>
>Also, iterating over all vCPU-s in that function doesn't seem very
>scalable. It would seem more reasonable for the scheduler to track
>how many "urgent" vCPU-s a pCPU currently has.

Yes, we can do this optimization. The current patch is just for quick
verification purposes.
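
Roughly, the optimized version could look like this (hypothetical names,
just to sketch the idea):

    /* The scheduler keeps a per-pCPU count of urgent vCPUs, updated
     * wherever a vCPU's urgency can change, so cpuidle does an O(1)
     * read instead of iterating over all vCPUs. */
    DEFINE_PER_CPU(atomic_t, urgent_vcpu_count);

    static void vcpu_urgency_update(struct vcpu *v, bool_t now_urgent)
    {
        atomic_t *cnt = &per_cpu(urgent_vcpu_count, v->processor);

        if ( now_urgent )
            atomic_inc(cnt);
        else
            atomic_dec(cnt);
    }

    static bool_t pcpu_has_urgent_vcpu(unsigned int cpu)
    {
        return atomic_read(&per_cpu(urgent_vcpu_count, cpu)) != 0;
    }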

Regards
Ke
