   
 
 
 
   
 

xen-devel


To: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
Subject: [Xen-devel] RE: Question about Xen S3 and resume code - Linux dom0 never exits the xen_safe_halt hypercall after resume
From: "Tian, Kevin" <kevin.tian@xxxxxxxxx>
Date: Tue, 21 Jun 2011 07:22:02 +0800
Accept-language: en-US
Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, "Yu, Ke" <ke.yu@xxxxxxxxx>
Delivery-date: Mon, 20 Jun 2011 16:22:58 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <20110620123626.GA2973@xxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <20110616225739.GA8714@xxxxxxxxxxxx> <625BA99ED14B2D499DC4E29D8138F1505D2C2DD530@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <20110620123626.GA2973@xxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: AcwvRrPeZu/fR35lSwiQr2MjXXEXmAAVicyw
Thread-topic: Question about Xen S3 and resume code - Linux dom0 never exits the xen_safe_halt hypercall after resume
> From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@xxxxxxxxxx]
> Sent: Monday, June 20, 2011 8:36 PM
> 
> > ideally ACPI S3/S5 has nothing to do with ACPI processor driver which is for
> > Cx/Px.
> 
> Right..
> >
> > >
> > > (which is in the devel/acpi-s3.v0 branch).
> > >
> > > the hypervisor, after an S3 resume sits forever in the default_idle. The
> > > Linux dom0 is stuck looping (I think) around SCHEDOP_block hypercall.
> > >
> > > http://darnok.org/xen/devel.acpi-s3.v1.serial.log
> > >
> > > If that patch above is present and I've cpufreq=xen on the Xen
> > > hypervisor then Linux kernel gets unstuck and returns to userspace:
> > >
> > > http://darnok.org/xen/devel.acpi-s3.v0.serial.log
> >
> > Compare your logs, the major difference is:
> >
> > [  168.754739] calling  i2c-8+ @ 3096
> > [  168.758200] call i2c-8+ returned 0 after 0 usecs
> > <<< 1st case stuck here
> > [  168.762882] calling  card0-VGA-1+ @ 3096
> > [  168.766867] call card0-VGA-1+ returned 0 after 0 usecs
> > [  168.772085] calling  ttm+ @ 3096
> > [  168.775360] call ttm+ returned 0 after 0 usecs
> > [  168.779870] PM: resume of devices complete after 13117.603 msecs
> > [  168.786006] PM: Finishing wakeup.
> > <<<2nd case forward progress
> >
> > It looks like the VGA card has some problem on resume, which then
> 
> In both cases - with the patch and without..

that's expected since device suspend is always invoked in the S3 path.
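
The log comparison quoted above can be automated. Below is a minimal sketch, assuming both serial logs use the "[ seconds.micros] message" prefix seen in the excerpt and that the stuck log simply ends where progress stopped. The function names are hypothetical helpers, not part of any Xen or Linux tool.

```python
import re

# Matches a leading kernel timestamp such as "[  168.754739] ".
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def messages(lines):
    """Strip timestamps and collapse whitespace so two runs compare equal."""
    result = []
    for line in lines:
        text = TIMESTAMP.sub("", line.strip())
        if text:
            result.append(" ".join(text.split()))
    return result


def steps_after_stall(stuck_lines, good_lines):
    """Return the messages the good log printed after the stuck log's last one.

    Returns None if the stuck log's last message never appears in the good log.
    """
    stuck = messages(stuck_lines)
    good = messages(good_lines)
    if not stuck:
        return good
    last = stuck[-1]
    # Search backwards so a repeated message matches its latest occurrence.
    for i in range(len(good) - 1, -1, -1):
        if good[i] == last:
            return good[i + 1:]
    return None
```

Feeding it the tail of each log (for example, the lines read from local copies of the two serial logs) lists the resume steps that only the working run reached. The first entry is the step to look at.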

> 
> > makes dom0 stay in the idle loop and thus in the block hypercall; then, with
> > no runnable vcpu, Xen spends most of its time in idle_loop too. In the earlier log there are
> > some stack traces in the i915 driver. Perhaps you can try a different machine
> 
> Or remove the i915 just to eliminate that.

So any result there? :-)

> > or try native S3 on same box to make sure it's not mixed with native issues.
> >
> > >
> > > (however, if I set cpuidle=0 cpufreq=none on the hypervisor line and
> > > have the 9f301b0a0081676dfc71b7f0898295e6bcba391a patch it still
> > > gets stuck).
> > >
> > > I figured that the primary reason the guest is allowed to
> > > exit its SCHEDOP_block loop is b/c the pm_idle call is set to the
> > > acpi_processor_idle which does "something" extra after the machine comes
> > > out of an S3 suspend.
> >
> > If that's the case I think you should disable CONFIG_ACPI_PROCESSOR in dom0
> > before incorporating Xen specific version (the patch you tried). We don't want
> > dom0 to play with Cx directly b/c it's the responsibility of Xen.
> 
> Huh? You misunderstood me. The 'acpi_processor_idle' is the hypervisor's
> idle loop. It can be running inside of that one, or the 'default_idle' loop. Hence

running inside which one? I'd think only default_idle invokes it when current cpu
is actually idle.

> my question why would that specific hypervisor idle loop make dom0 run nicely
> while the default one would not.

Honestly speaking, this is counterintuitive to me. I'd think acpi_processor_idle
is more likely to cause an issue than pure "sti;hlt", because the acpi version has
more logic to handle. In the early days, when it was still in the stabilization
phase, we did observe some cases of never exiting from a deep C-state, but this
never happened with pure hlt.

IOW, I don't see this idle path as a necessary step for making S3 resume work;
it only comes into play when the cpu has nothing to do...

> 
> In dom0, regardless of the patches, the 'default_idle' is run which makes the
> xen_safe_halt paravirt call.

OK, that matches my expectation then.

> 
> >
> > Of course we still need to figure out why the same issues occur with cpuidle=0/
> > cpufreq=none, which however can be revisited after the basic S3 works. :-)
> 
> Right. The end result of those parameters is that the 'default_idle' in the
> hypervisor is chosen instead of the 'acpi_processor_idle' one.
> >
> > >
> > > Any ideas?
> >
> > No other ideas for now. From historical view Xen S3 was supported before
> 
> Hmm, I am actually tempted to start commenting out code in the acpi_processor_idle
> and seeing what will cause it to have the same failure as 'default_idle'.

you can also try "max_cstate=1" to see if it makes any difference; it is expected to
have a similar effect to safe_halt().

Thanks
Kevin

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel