
[Xen-devel] RE: latest xen-unstable fails to boot on Dell D630 (likely hpet/Cstate problem)


  • To: Dan Magenheimer <dan.magenheimer@xxxxxxxxxx>, "Yu, Ke" <ke.yu@xxxxxxxxx>, "Xen-Devel (E-mail)" <xen-devel@xxxxxxxxxxxxxxxxxxx>
  • From: "Zhang, Xiantao" <xiantao.zhang@xxxxxxxxx>
  • Date: Wed, 9 Dec 2009 12:39:39 +0800
  • Accept-language: en-US
  • Delivery-date: Tue, 08 Dec 2009 20:41:39 -0800
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>
  • Thread-index: Acp4eB9a4CWEH8IWQ4u/BYjmKhFotgAB4JPQ
  • Thread-topic: latest xen-unstable fails to boot on Dell D630 (likely hpet/Cstate problem)

Dan Magenheimer wrote:
> FYI, 20073+20093+20149 boots properly and xend starts
> WITH max_cstate=2, but dom0 FAILs to boot unless
> max_cstate=2 is added as a Xen boot parameter. 

Could you attach the failure log? In addition, does this system have IOAPIC
support? I think the HPET doesn't use MSI, right?
Xiantao


> So I still think something changed at 20073 that
> causes Merom+RHEL5dom0 to fail to boot due to not
> recovering from deep C-state (after dom0 runs
> /sbin/hwclock ... Ke Yu knows how to reproduce
> the problem).
> 
> Thanks,
> Dan
> 
>> -----Original Message-----
>> From: Zhang, Xiantao [mailto:xiantao.zhang@xxxxxxxxx]
>> Sent: Tuesday, December 08, 2009 6:44 PM
>> To: Dan Magenheimer; Yu, Ke; Xen-Devel (E-mail)
>> Subject: RE: latest xen-unstable fails to boot on Dell D630 (likely
>> hpet/Cstate problem) 
>> 
>> 
>> Dan,
>> Don't test Cset #20073 on its own, since it needs
>> two minor fixes checked in as Cset #20093 and #20149.
>> Besides this, Keir also had a typo in Cset #20076, fixed by
>> Cset #20092. In addition, one serious issue was
>> introduced in Cset #20084 and fixed in Cset #20140.  I
>> remember Pod also had issues which could crash the hypervisor
>> before Cset #20100. Thus, it is too hard to identify this
>> issue through bisection before Cset #20149, since these issues
>> were introduced and fixed in an interleaved way.  Certainly, if you
>> want to test Cset #20073, you at least have to apply
>> Cset #20093 and #20149 on top of it.  :)
>> Xiantao
>> 
>> 
>> Dan Magenheimer wrote:
>>>> But I'll give bisecting a try.
>>> 
>>> Looks like the problem has been around for awhile.  It appears
>>> the problem starts at c/s 20073.  Xiantao cc'ed since 20073 was his
>>> patch. 
>>> 
>>> 20070 boots OK without max_cstate=2
>>> 
>>> 20072 boots most of the way without max_cstate=2 but crashes
>>>       before a login prompt (when xend is starting I think)
>>> 
>>> 20073 FAILS to boot without max_cstate=2 but crashes
>>>       before a login prompt
>>> 
>>> 20082 FAILS to boot without max_cstate=2 but crashes
>>>       before a login prompt with max_cstate=2
>>> 
>>> 20143 FAILS to boot without max_cstate=2 but boots OK
>>>       with max_cstate=2
>>> 
>>> Note that I have NOT bisected tools, just the hypervisor
>>> so the crashes are likely due to a newer xend failing on
>>> an older hypervisor (which is irrelevant to this problem).
>>> 
>>>> -----Original Message-----
>>>> From: Dan Magenheimer
>>>> Sent: Tuesday, December 08, 2009 10:42 AM
>>>> To: Yu, Ke; Xen-Devel (E-mail)
>>>> Subject: RE: latest xen-unstable fails to boot on Dell D630
>>>> (likely hpet/Cstate problem) 
>>>> 
>>>> 
>>>>> case, if convenient, could you help do some bisecting to see
>>>>> which cset caused this bug?
>>>> 
>>>> I can do this, but because it is often no longer easy to
>>>> bisect Xen because of interdependencies with other
>>>> components, I was hoping that Keir or you or someone might
>>>> have some idea of what changeset might have caused the regression.
>>>> But I'll give bisecting a try.
>>>> 
>>>>> max_cstate=2), when dom0 hangs, is Xen still alive? E.g. can
>>>>> Xen respond to three Ctrl-'A' on the serial console?
>>>> 
>>>> Unfortunately, I can't seem to get a Xen console working on
>>>> the Merom machine, and the problem can't be reproduced on
>>>> my other machine where the Xen console is working (because
>>>> Conroe doesn't support deep C).
>>>> 
>>>>> -----Original Message-----
>>>>> From: Yu, Ke [mailto:ke.yu@xxxxxxxxx]
>>>>> Sent: Tuesday, December 08, 2009 12:08 AM
>>>>> To: Dan Magenheimer; Xen-Devel (E-mail)
>>>>> Subject: RE: latest xen-unstable fails to boot on Dell D630
>>>>> (likely hpet/Cstate problem) 
>>>>> 
>>>>> 
>>>>>> -----Original Message-----
>>>>>> In this thread, I observed that I was unable to
>>>>>> provoke deep C state (C3) on my Dell D630, which has
>>>>>> an Intel Merom (dual-core laptop) processor.  At that
>>>>>> time, when I tried enabling hpetbroadcast, dom0 boot failed.
>>>>>> 
>>>>>> http://lists.xensource.com/archives/html/xen-devel/2009-10/msg01027.html
>>>>>> 
>>>>>> As it turned out, all RHEL5-based (maybe RHEL4- also) dom0
>>>>>> default installations run /sbin/hwclock, which IIRC takes
>>>>>> the RTC away from Xen and gives it to dom0.  Since the
>>>>>> Xen hpet emulation does not do RTC emulation, bad things
>>>>>> then happen when a deep Cstate is entered (dom0 apparently
>>>>>> never wakes up).  I think Ke Yu has also reproduced this problem.
>>>>>> 
>>>>>> Sometime in the last few weeks, some patch in xen-unstable
>>>>>> apparently changed some defaults and xen-unstable will
>>>>>> no longer boot with this processor/dom0, with or without
>>>>>> hpetbroadcast on the Xen command line.  However, specifying
>>>>>> max_cstate=2 on the Xen command line allows a successful
>>>>>> dom0 boot, so I suspect the problem is the same (or at
>>>>>> least very similar).
>>>>>> 
>>>>>> I did a quick scan for hpet changes and found c/s 20497,
>>>>>> but backing it out made no difference.
>>>>>> 
>>>>>> I have a workaround for now, but since it is likely that
>>>>>> many customers (including all of Oracle's OVS customers)
>>>>>> use a RHEL5-based dom0 boot sequence, and Merom processors
>>>>>> work fine otherwise, it would be nice to get this identified
>>>>>> and fixed before 4.0.
>>>>> 
>>>>> Let's first figure out which component the issue resides in.
>>>>> 
>>>>> First, in the default boot (i.e. without specifying
>>>>> max_cstate=2), when dom0 hangs, is Xen still alive? E.g. can
>>>>> Xen respond to three Ctrl-'A' on the serial console?
>>>>> 
>>>>> If only dom0 hangs, it is probably an RTC malfunction that makes
>>>>> dom0 time incorrect and causes dom0 to fail to boot. In that case,
>>>>> RTC emulation in the hypervisor should fix this issue.
>>>>> 
>>>>> If Xen also hangs, it should be another bug, i.e. hpet
>>>>> broadcast does not wake up the CPU in deep C states. In this
>>>>> case, if convenient, could you help do some bisecting to see
>>>>> which cset caused this bug?
>>>>> 
>>>>> Best Regards
>>>>> Ke

