WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 

xen-devel

Re: [Xen-devel] cpuidle causing Dom0 soft lockups

To: "Tian, Kevin" <kevin.tian@xxxxxxxxx>
Subject: Re: [Xen-devel] cpuidle causing Dom0 soft lockups
From: Juergen Gross <juergen.gross@xxxxxxxxxxxxxx>
Date: Thu, 04 Feb 2010 07:31:02 +0100
Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, Jan Beulich <JBeulich@xxxxxxxxxx>, "Yu, Ke" <ke.yu@xxxxxxxxx>, Keir Fraser <keir.fraser@xxxxxxxxxxxxx>
Delivery-date: Wed, 03 Feb 2010 22:32:33 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <73BDC2BA3DA0BD47BAAEE12383D407EF35C2F587@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Organization: Fujitsu Technology Solutions
References: <4B58402E020000780002B3FE@xxxxxxxxxxxxxxxxxx> <C77DE51B.6F89%keir.fraser@xxxxxxxxxxxxx> <4B67E85E020000780002D1A0@xxxxxxxxxxxxxxxxxx> <8B81FACE836F9248894A7844CC0BA814250B6A12F0@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx> <4B695ADB020000780002D70F@xxxxxxxxxxxxxxxxxx> <73BDC2BA3DA0BD47BAAEE12383D407EF35C2F436@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx> <4B6969AB.60605@xxxxxxxxxxxxxx> <73BDC2BA3DA0BD47BAAEE12383D407EF35C2F587@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mozilla-Thunderbird 2.0.0.22 (X11/20090707)
Tian, Kevin wrote:
>> From: Juergen Gross [mailto:juergen.gross@xxxxxxxxxxxxxx] 
>> Sent: 3 February 2010 20:19
>>
>> Tian, Kevin wrote:
>>>> From: Jan Beulich
>>>> Sent: 3 February 2010 18:16
>>>>
>>>>>>> "Yu, Ke" <ke.yu@xxxxxxxxx> 02.02.10 18:07 >>>
>>>>>> Just fyi, we now also have seen an issue on a 24-CPU system that went
>>>>>> away with cpuidle=0 (and static analysis of the hang hinted in that
>>>>>> direction). All I can judge so far is that this likely has something to do
>>>>>> with our kernel's intensive use of the poll hypercall (i.e. we see vCPU-s
>>>>>> not waking up from the call despite there being pending unmasked or
>>>>>> polled for events).
>>>>> We just identified the cause of this issue, and is trying to
>>>>> find appropriate way to fix it.
>>>>
>>>> Hmm, while I agree that the scenario you describe can be a problem, I
>>>> don't think it can explain the behavior on the 24-CPU system pointed
>>>> out above, nor the one Juergen Gross pointed out yesterday.
>>> Is 24-CPU system observed with same likelihood as 64-CPU system to
>>> hang at boot time, or less frequent? Ke just did some theoretical analysis
>>> by assuming some values. There could be other factors added to latency
>>> and each system may have different characteristics too. We can't
>>> draw conclusion whether smaller system will face same issue, by simply
>>> changing CPU number in Ke's formula. :-) Possibly you can provide cpuidle
>>> information on your 24-core system for further comparison.
>> My 4-core system hangs _always_. For minutes. If I press any key on the
>> console it will resume booting with soft lockup messages (all cpus were
>> in xen_safe_halt).
>> Sometimes another hang occurs, sometimes the system will come up without
>> further hangs.
>>
>> Juergen
>>
> 
> interesting. Then did you also observe hang disappeared by disabling
> cpuidle? Your case really looks like some missed event scenario, in
> which key press just kicks cpu alive...

Yes, cpuidle=0 made the problem disappear.

Juergen

-- 
Juergen Gross                 Principal Developer Operating Systems
TSP ES&S SWE OS6                       Telephone: +49 (0) 89 3222 2967
Fujitsu Technology Solutions              e-mail: juergen.gross@xxxxxxxxxxxxxx
Domagkstr. 28                           Internet: ts.fujitsu.com
D-80807 Muenchen                 Company details: ts.fujitsu.com/imprint.html
