Re: [Xen-devel] cpuidle causing Dom0 soft lockups


  • To: "Tian, Kevin" <kevin.tian@xxxxxxxxx>
  • From: Juergen Gross <juergen.gross@xxxxxxxxxxxxxx>
  • Date: Thu, 04 Feb 2010 07:31:02 +0100
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, Jan Beulich <JBeulich@xxxxxxxxxx>, "Yu, Ke" <ke.yu@xxxxxxxxx>, Keir Fraser <keir.fraser@xxxxxxxxxxxxx>
  • Delivery-date: Wed, 03 Feb 2010 22:32:33 -0800
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

Tian, Kevin wrote:
>> From: Juergen Gross [mailto:juergen.gross@xxxxxxxxxxxxxx] 
>> Sent: February 3, 2010 20:19
>>
>> Tian, Kevin wrote:
>>>> From: Jan Beulich
>>>> Sent: February 3, 2010 18:16
>>>>
>>>>>>> "Yu, Ke" <ke.yu@xxxxxxxxx> 02.02.10 18:07 >>>
>>>>>> Just fyi, we now also have seen an issue on a 24-CPU system that went
>>>>>> away with cpuidle=0 (and static analysis of the hang hinted in that
>>>>>> direction). All I can judge so far is that this likely has something to do
>>>>>> with our kernel's intensive use of the poll hypercall (i.e. we see vCPU-s
>>>>>> not waking up from the call despite there being pending unmasked or
>>>>>> polled for events).
>>>>> We just identified the cause of this issue, and are trying to find an
>>>>> appropriate way to fix it.
>>>>
>>>> Hmm, while I agree that the scenario you describe can be a problem, I
>>>> don't think it can explain the behavior on the 24-CPU system pointed
>>>> out above, nor the one Juergen Gross pointed out yesterday.
>>> Is 24-CPU system observed with same likelihood as 64-CPU system to
>>> hang at boot time, or less frequent? Ke just did some theoretical analysis
>>> by assuming some values. There could be other factors added to latency
>>> and each system may have different characteristics too. We can't
>>> draw conclusion whether smaller system will face same issue, by simply
>>> changing CPU number in Ke's formula. :-) Possibly you can provide cpuidle
>>> information on your 24-core system for further comparison.
>> My 4-core system hangs _always_. For minutes. If I press any key on the
>> console it will resume booting with soft lockup messages (all cpus were
>> in xen_safe_halt).
>> Sometimes another hang occurs, sometimes the system will come up without
>> further hangs.
>>
>> Juergen
>>
> 
> interesting. Then did you also observe hang disappeared by disabling
> cpuidle? Your case really looks like some missed event scenario, in
> which key press just kicks cpu alive...

Yes, cpuidle=0 made the problem disappear.

Juergen

-- 
Juergen Gross                 Principal Developer Operating Systems
TSP ES&S SWE OS6                       Telephone: +49 (0) 89 3222 2967
Fujitsu Technology Solutions              e-mail: juergen.gross@xxxxxxxxxxxxxx
Domagkstr. 28                           Internet: ts.fujitsu.com
D-80807 Muenchen                 Company details: ts.fujitsu.com/imprint.html

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel