xen-devel

[Xen-devel] RE: latest xen-unstable fails to boot on Dell D630 (likely h

To: "Zhang, Xiantao" <xiantao.zhang@xxxxxxxxx>, "Yu, Ke" <ke.yu@xxxxxxxxx>, "Xen-Devel (E-mail)" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: [Xen-devel] RE: latest xen-unstable fails to boot on Dell D630 (likely hpet/Cstate problem)
From: Dan Magenheimer <dan.magenheimer@xxxxxxxxxx>
Date: Tue, 8 Dec 2009 18:31:56 -0800 (PST)
Cc:
Delivery-date: Tue, 08 Dec 2009 18:34:45 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <706158FABBBA044BAD4FE898A02E4BC201CF7CA011@xxxxxxxxxxxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
FYI, 20073+20093+20149 boots properly and xend starts
WITH max_cstate=2, but dom0 FAILs to boot unless
max_cstate=2 is added as a Xen boot parameter.

So I still think something changed at 20073 that
causes Merom+RHEL5dom0 to fail to boot due to not
recovering from deep C-state (after dom0 runs
/sbin/hwclock ... Ke Yu knows how to reproduce
the problem).
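
For anyone repeating this bisection, here is a rough sketch of the
fix dependencies Xiantao lists in the quoted message below. It only
encodes the changeset pairs named in this thread. The function name
patches_needed is made up for illustration, and the sketch assumes
changeset numbers are ordered along a linear history. The PoD issue
before 20100 is not modeled because no specific fix changeset was
named for it.

```python
# Changesets named in this thread that should not be tested alone,
# mapped to the follow-up fixes they need (per Xiantao's message).
REQUIRED_FIXES = {
    20073: [20093, 20149],
    20076: [20092],
    20084: [20140],
}


def patches_needed(base):
    """Return the fix changesets that must be applied on top of `base`.

    Assumes a linear history: every changeset numbered <= base is
    already in the tree, and every higher-numbered one is not.
    """
    needed = set()
    for broken, fixes in REQUIRED_FIXES.items():
        if broken <= base:
            # Only fixes that are newer than the tree under test are missing.
            needed.update(fix for fix in fixes if fix > base)
    return sorted(needed)


if __name__ == "__main__":
    for cset in (20070, 20073, 20082, 20143):
        print(cset, "needs", patches_needed(cset))
```

Under these assumptions, a tree at 20073 needs 20093 and 20149 on
top, which is the combination tested above.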

Thanks,
Dan

> -----Original Message-----
> From: Zhang, Xiantao [mailto:xiantao.zhang@xxxxxxxxx]
> Sent: Tuesday, December 08, 2009 6:44 PM
> To: Dan Magenheimer; Yu, Ke; Xen-Devel (E-mail)
> Subject: RE: latest xen-unstable fails to boot on Dell D630 (likely
> hpet/Cstate problem)
> 
> 
> Dan, 
> Don't test Cset #20073 on its own, since it needs
> two minor fixes that were checked in as Cset #20093 and #20149.
> Besides this, Keir also had a typo in Cset #20076 that was fixed by
> Cset #20092. In addition, one serious issue was
> introduced in Cset #20084 and fixed in Cset #20140.  I
> remember PoD also had issues that could crash the hypervisor
> before Cset #20100. Thus, it is too hard to identify this
> issue through bisection before Cset #20149, since these issues
> were introduced and fixed in an interleaved order.   Certainly, if you want
> to test Cset #20073, you at least have to apply
> Cset #20093 and #20149 on top of it.  :)
> Xiantao
> 
> 
> Dan Magenheimer wrote:
> >> But I'll give bisecting a try.
> > 
> > Looks like the problem has been around for awhile.  It appears
> > the problem starts at c/s 20073.  Xiantao cc'ed since 20073
> > was his patch.
> > 
> > 20070 boots OK without max_cstate=2
> > 
> > 20072 boots most of the way without max_cstate=2 but crashes
> >       before a login prompt (when xend is starting I think)
> > 
> > 20073 FAILS to boot without max_cstate=2 but crashes
> >       before a login prompt
> > 
> > 20082 FAILS to boot without max_cstate=2 but crashes
> >       before a login prompt with max_cstate=2
> > 
> > 20143 FAILS to boot without max_cstate=2 but boots OK
> >       with max_cstate=2
> > 
> > Note that I have NOT bisected tools, just the hypervisor
> > so the crashes are likely due to a newer xend failing on
> > an older hypervisor (which is irrelevant to this problem).
> > 
> >> -----Original Message-----
> >> From: Dan Magenheimer
> >> Sent: Tuesday, December 08, 2009 10:42 AM
> >> To: Yu, Ke; Xen-Devel (E-mail)
> >> Subject: RE: latest xen-unstable fails to boot on Dell D630 (likely
> >> hpet/Cstate problem) 
> >> 
> >> 
> >>> case, if convenient, could you help by bisecting to see
> >>> which cset causes this bug?
> >> 
> >> I can do this, but because it is often no longer easy to
> >> bisect Xen because of interdependencies with other
> >> components, I was hoping that Keir or you or someone might
> >> have some idea of what changeset might have caused the regression.
> >> But I'll give bisecting a try.
> >> 
> >>> max_cstate=2), when dom0 hangs, is Xen still alive, e.g. can
> >>> Xen respond to three Ctrl-'A' on the serial console?
> >> 
> >> Unfortunately, I can't seem to get a Xen console working on
> >> the Merom machine, and the problem can't be reproduced on
> >> my other machine where the Xen console is working (because
> >> Conroe doesn't support deep C).
> >> 
> >>> -----Original Message-----
> >>> From: Yu, Ke [mailto:ke.yu@xxxxxxxxx]
> >>> Sent: Tuesday, December 08, 2009 12:08 AM
> >>> To: Dan Magenheimer; Xen-Devel (E-mail)
> >>> Subject: RE: latest xen-unstable fails to boot on Dell D630 (likely
> >>> hpet/Cstate problem)
> >>> 
> >>> 
> >>>> -----Original Message-----
> >>>> In this thread, I observed that I was unable to
> >>>> provoke deep C state (C3) on my Dell D630, which has
> >>>> an Intel Merom (dual-core laptop) processor.  At that
> >>>> time, when I tried enabling hpetbroadcast, dom0 boot failed.
> >>>> 
> >>>> http://lists.xensource.com/archives/html/xen-devel/2009-10/msg01027.html
> >>>> 
> >>>> As it turned out, all RHEL5-based (maybe RHEL4- also) dom0
> >>>> default installations run /sbin/hwclock, which IIRC takes
> >>>> the RTC away from Xen and gives it to dom0.  Since the
> >>>> Xen hpet emulation does not do RTC emulation, bad things
> >>>> then happen when a deep Cstate is entered (dom0 apparently
> >>>> never wakes up).  I think Ke Yu has also reproduced this problem.
> >>>> 
> >>>> Sometime in the last few weeks, some patch in xen-unstable
> >>>> apparently changed some defaults and xen-unstable will
> >>>> no longer boot with this processor/dom0, with or without
> >>>> hpetbroadcast on the Xen command line.  However, specifying
> >>>> max_cstate=2 on the Xen command line allows a successful
> >>>> dom0 boot, so I suspect the problem is the same (or at
> >>>> least very similar).
> >>>> 
> >>>> I did a quick scan for hpet changes and found c/s 20497,
> >>>> but backing it out made no difference.
> >>>> 
> >>>> I have a workaround for now, but since it is likely that
> >>>> many customers (including all of Oracle's OVS customers)
> >>>> use a RHEL5-based dom0 boot sequence, and Merom processors
> >>>> work fine otherwise, it would be nice to get this identified
> >>>> and fixed before 4.0.
> >>> 
> >>> Let's first figure out which component the issue resides in.
> >>> 
> >>> First, in the default boot (i.e. without specifying
> >>> max_cstate=2), when dom0 hangs, is Xen still alive, e.g. can
> >>> Xen respond to three Ctrl-'A' on the serial console?
> >>> 
> >>> If only dom0 hangs, it is probably an RTC malfunction that makes
> >>> dom0 time incorrect and causes dom0 to fail to boot. Then RTC
> >>> emulation in the hypervisor should fix this issue.
> >>> 
> >>> If Xen also hangs, it should be another bug, i.e. hpet
> >>> broadcast does not wake up the CPU in deep C states. In this
> >>> case, if convenient, could you help by bisecting to see
> >>> which cset causes this bug?
> >>> 
> >>> Best Regards
> >>> Ke
> 
>

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel