
[Xen-devel] RE: Question about Xen S3 and resume code - Linux dom0 never exits the xen_safe_halt hypercall after resume

> From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@xxxxxxxxxx]
> Sent: Friday, June 17, 2011 6:58 AM
> I've been eyeing the ACPI S3/S5 code to see what would be necessary to
> retool, and while testing I found something strange..
> I've stuck the code on
>  git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git devel/acpi-s3.v1
> I found out that if I don't have this patch:
> commit 9f301b0a0081676dfc71b7f0898295e6bcba391a
> Author: Yu Ke <ke.yu@xxxxxxxxx>
> Date:   Thu Jun 16 17:15:26 2011 -0400
>     xen/acpi: add xen acpi processor driver

Ideally, ACPI S3/S5 has nothing to do with the ACPI processor driver, which is for 

> (which is in the devel/acpi-s3.v0 branch).
> the hypervisor, after an S3 resume sits forever in the default_idle. The
> Linux dom0 is stuck looping (I think) around SCHEDOP_block hypercall.
> http://darnok.org/xen/devel.acpi-s3.v1.serial.log
> If that patch above is present and I've cpufreq=xen on the Xen
> hypervisor then Linux kernel gets unstuck and returns to userspace:
> http://darnok.org/xen/devel.acpi-s3.v0.serial.log

Comparing your logs, the major difference is:

[  168.754739] calling  i2c-8+ @ 3096
[  168.758200] call i2c-8+ returned 0 after 0 usecs
<<< 1st case stuck here
[  168.762882] calling  card0-VGA-1+ @ 3096
[  168.766867] call card0-VGA-1+ returned 0 after 0 usecs
[  168.772085] calling  ttm+ @ 3096
[  168.775360] call ttm+ returned 0 after 0 usecs
[  168.779870] PM: resume of devices complete after 13117.603 msecs
[  168.786006] PM: Finishing wakeup.
<<<2nd case forward progress

It looks like the VGA card has a problem on resume, which leaves dom0
sitting in its idle loop and thus in the block hypercall; with no runnable
vcpu, Xen then spends most of its time in idle_loop too. The earlier log
has some stack traces in the i915 driver. Perhaps you can try a different
machine, or try native S3 on the same box, to make sure this isn't mixed
up with native issues.
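
The comparison above can be automated roughly as follows. This is only a
sketch: it assumes both serial logs contain "calling <dev> @ <pid>" lines
in the form shown in the excerpt, and the function names (resume_calls,
next_call_after_stall) are made up for illustration, not taken from any
tool.

```python
import re

# Strips the "[  168.754739] " timestamp prefix seen in the excerpt.
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")
# Matches lines such as "calling  i2c-8+ @ 3096" (not the "call ... returned" lines).
CALLING = re.compile(r"^calling\s+(\S+)\s+@\s+\d+")


def resume_calls(log_text):
    """Return the device names from 'calling <dev> @ <pid>' lines, in order."""
    names = []
    for line in log_text.splitlines():
        body = TIMESTAMP.sub("", line.strip())
        match = CALLING.match(body)
        if match:
            names.append(match.group(1))
    return names


def next_call_after_stall(stuck_log, good_log):
    """Return the device the good log calls right after the stuck log's last call.

    Returns None if the stuck log has no calls, or its last call is not
    found (or is the final call) in the good log.
    """
    stuck = resume_calls(stuck_log)
    good = resume_calls(good_log)
    if not stuck:
        return None
    last = stuck[-1]
    # Search from the end, since a device name may appear more than once.
    for i in range(len(good) - 1, -1, -1):
        if good[i] == last:
            return good[i + 1] if i + 1 < len(good) else None
    return None
```

Fed the two logs, it should name the first device that the working case
reaches and the stuck case never does. That is only a hint about where to
look, not proof that this device is the culprit.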

> (however, if I set cpuidle=0 cpufreq=none on the hypervisor line and
> have the 9f301b0a0081676dfc71b7f0898295e6bcba391a patch it still
> gets stuck).
> I figured that the primary reason the guest is allowed to
> exit its SCHEDOP_block loop is b/c the pm_idle call is set to the
> acp_processor_idle which does "something" extra after the machine comes
> out of a S3 suspend.

If that's the case, I think you should disable CONFIG_ACPI_PROCESSOR in dom0
before incorporating the Xen-specific version (the patch you tried). We don't
want dom0 to play with Cx directly, because that is Xen's responsibility.

Of course, we still need to figure out why the same issue occurs with
cpuidle=0/cpufreq=none, but that can be revisited after basic S3 works. :-)

> Any ideas?

No other ideas for now. Historically, Xen S3 was supported before Cx/Px,
so it is expected to work correctly without the ACPI processor driver.
Please make sure CONFIG_ACPI_PROCESSOR is not enabled in your test of
basic S3.
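
One way to double-check the option before testing is to look it up in the
dom0 kernel's .config. The sketch below assumes the usual kconfig output
format ("CONFIG_FOO=y" when set, "# CONFIG_FOO is not set" otherwise); the
helper name config_state is hypothetical.

```python
import re


def config_state(config_text, symbol):
    """Return the value assigned to `symbol` (e.g. 'y' or 'm'), or None.

    None covers both "# CONFIG_FOO is not set" and a missing symbol.
    """
    # Require "=" directly after the symbol so that longer names sharing
    # the same prefix do not match.
    pattern = re.compile(r"^" + re.escape(symbol) + r"=(.*)$")
    for line in config_text.splitlines():
        match = pattern.match(line.strip())
        if match:
            return match.group(1)
    return None
```

Read the .config text from wherever your dom0 kernel build keeps it and
call config_state(text, "CONFIG_ACPI_PROCESSOR"); a result of None means
the option is not enabled in that config.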

