xen-devel
Re: [Xen-devel] cpuidle causing Dom0 soft lockups
To: "Tian, Kevin" <kevin.tian@xxxxxxxxx>
Subject: Re: [Xen-devel] cpuidle causing Dom0 soft lockups
From: Juergen Gross <juergen.gross@xxxxxxxxxxxxxx>
Date: Thu, 04 Feb 2010 07:31:02 +0100
Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, Jan Beulich <JBeulich@xxxxxxxxxx>, "Yu, Ke" <ke.yu@xxxxxxxxx>, Keir Fraser <keir.fraser@xxxxxxxxxxxxx>
Tian, Kevin wrote:
>> From: Juergen Gross [mailto:juergen.gross@xxxxxxxxxxxxxx]
>> Sent: February 3, 2010 20:19
>>
>> Tian, Kevin wrote:
>>>> From: Jan Beulich
>>>> Sent: February 3, 2010 18:16
>>>>
>>>>>>> "Yu, Ke" <ke.yu@xxxxxxxxx> 02.02.10 18:07 >>>
>>>>>> Just fyi, we now also have seen an issue on a 24-CPU system that went
>>>>>> away with cpuidle=0 (and static analysis of the hang hinted in that
>>>>>> direction). All I can judge so far is that this likely has something to do
>>>>>> with our kernel's intensive use of the poll hypercall (i.e. we see vCPU-s
>>>>>> not waking up from the call despite there being pending unmasked or
>>>>>> polled for events).
>>>>> We just identified the cause of this issue, and is trying to find appropriate way to fix it.
>>>>
>>>> Hmm, while I agree that the scenario you describe can be a problem, I
>>>> don't think it can explain the behavior on the 24-CPU system pointed
>>>> out above, nor the one Juergen Gross pointed out yesterday.
>>> Is 24-CPU system observed with same likelihood as 64-CPU system to
>>> hang at boot time, or less frequent? Ke just did some theoretical analysis
>>> by assuming some values. There could be other factors added to latency
>>> and each system may have different characteristics too. We can't
>>> draw conclusion whether smaller system will face same issue, by simply
>>> changing CPU number in Ke's formula. :-) Possibly you can provide cpuidle
>>> information on your 24-core system for further comparison.
>> My 4-core system hangs _always_. For minutes. If I press any key on the
>> console it will resume booting with soft lockup messages (all cpus were
>> in xen_safe_halt).
>> Sometimes another hang occurs, sometimes the system will come up without
>> further hangs.
>>
>> Juergen
>>
>
> interesting. Then did you also observe hang disappeared by disabling
> cpuidle? Your case really looks like some missed event scenario, in
> which key press just kicks cpu alive...
Yes, cpuidle=0 made the problem disappear.
Juergen
--
Juergen Gross Principal Developer Operating Systems
TSP ES&S SWE OS6 Telephone: +49 (0) 89 3222 2967
Fujitsu Technology Solutions e-mail: juergen.gross@xxxxxxxxxxxxxx
Domagkstr. 28 Internet: ts.fujitsu.com
D-80807 Muenchen Company details: ts.fujitsu.com/imprint.html