
Re: [Xen-devel] Xen4.2 S3 regression?



Any suggestions on how best to chase this down?

The first S3 suspend/resume cycle works, but the second does not.

On the second try, no interrupts are ever delivered to ahci
(at least according to /proc/interrupts).
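
One way to check this is to snapshot /proc/interrupts before and after a resume and compare the per-IRQ totals. The following is a minimal Python sketch, assuming the usual layout (a header row of CPU columns, then one "label: counts... description" row per interrupt). The function names parse_interrupts and stalled_irqs are illustrative only and do not come from any existing tool.

```python
def parse_interrupts(text):
    """Return {irq_label: (total_count, description)} from /proc/interrupts text."""
    lines = text.splitlines()
    if not lines:
        return {}
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        if ":" not in line:
            continue
        label, rest = line.split(":", 1)
        fields = rest.split()
        counts = []
        # The first ncpus fields are per-CPU counters; stop at the first non-number.
        for field in fields[:ncpus]:
            if not field.isdigit():
                break
            counts.append(int(field))
        desc = " ".join(fields[len(counts):])
        table[label.strip()] = (sum(counts), desc)
    return table


def stalled_irqs(before, after, name="ahci"):
    """List IRQ labels whose description mentions `name` and whose total did not grow."""
    old = parse_interrupts(before)
    new = parse_interrupts(after)
    stalled = []
    for label, (count, desc) in new.items():
        if name in desc and count <= old.get(label, (0, ""))[0]:
            stalled.append(label)
    return stalled
```

To use it, read /proc/interrupts into a string before suspending and again some seconds after resume, then pass both strings to stalled_irqs. An ahci line whose count has not moved despite pending disk I/O would match the symptom described here.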


Syslog traces from the first (good) and second (bad) resume are attached,
along with the output of the "*" debug key (Ctrl+a handler) in both cases.
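
For comparing the good and bad traces quickly, the error lines can be tallied per device. This is a rough Python sketch, assuming the message formats seen in the quoted log ("end_request: I/O error, dev sda, sector N" and "ataN.NN: revalidation failed"). The function name summarize is hypothetical.

```python
import re
from collections import Counter

# Patterns match the message formats that appear in the quoted dom0 log.
IO_ERROR = re.compile(r"end_request: I/O error, dev ([^,\s]+), sector \d+")
REVALIDATION = re.compile(r"(ata\d+\.\d+): revalidation failed")


def summarize(log_text):
    """Count block I/O errors per device and failed revalidations per ATA link."""
    io_errors = Counter()
    revalidation = Counter()
    for line in log_text.splitlines():
        match = IO_ERROR.search(line)
        if match:
            io_errors[match.group(1)] += 1
            continue
        match = REVALIDATION.search(line)
        if match:
            revalidation[match.group(1)] += 1
    return io_errors, revalidation
```

Running summarize over each attached syslog and comparing the two results should show at a glance which devices only fail on the second resume.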




On Tue, Aug 7, 2012 at 12:48 PM, Ben Guthro <ben@xxxxxxxxxx> wrote:
> No - the issue seems to follow xen-4.2
>
> My test matrix looks like this:
>
> Xen     Linux                 S3 result
> 4.0.3   3.2.23                OK
> 4.0.3   3.5                   OK
> 4.2     3.2.23                FAIL
> 4.2     3.5                   FAIL
> 4.2     3.2.23 pci=nomsi      OK
> 4.2     3.5 pci=nomsi         (untested)
>
>
>
>
> On Tue, Aug 7, 2012 at 12:33 PM, Konrad Rzeszutek Wilk
> <konrad.wilk@xxxxxxxxxx> wrote:
>> On Tue, Aug 07, 2012 at 12:21:22PM -0400, Ben Guthro wrote:
>>> It looks like this regression may be related to MSI handling.
>>>
>>> "pci=nomsi" on the kernel command line seems to bypass the issue.
>>>
>>> Clearly, legacy interrupts are not ideal.
>>
>> This is with v3.5 kernel right? With the earlier one you did not have
>> this issue?
>>>
>>>
>>> On Tue, Aug 7, 2012 at 11:04 AM, Ben Guthro <ben@xxxxxxxxxx> wrote:
>>> > I have been doing some experiments in upgrading the Xen version in a
>>> > future version of XenClient Enterprise, and I've been running into a
>>> > regression that I'm wondering if anyone else has seen.
>>> >
>>> > dom0 suspend/resume (S3) does not seem to be working for me.
>>> >
>>> > In swapping out components of the system, the common failure seems to
>>> > be when I use Xen-4.2 (upgraded from Xen-4.0.3)
>>> >
>>> > The first suspend seems to mostly work...but subsequent ones always
>>> > resume improperly.
>>> > By "improperly" - I see I/O failures, and stalls of many processes.
>>> >
>>> > Below is a log excerpt of 2 S3 attempts.
>>> >
>>> >
>>> > Has anyone else seen these failures?
>>> >
>>> > - Ben
>>> >
>>> >
>>> > (XEN) Preparing system for ACPI S3 state.
>>> > (XEN) Disabling non-boot CPUs ...
>>> > (XEN) Breaking vcpu affinity for domain 0 vcpu 1
>>> > (XEN) Breaking vcpu affinity for domain 0 vcpu 2
>>> > (XEN) Breaking vcpu affinity for domain 0 vcpu 3
>>> > (XEN) Entering ACPI S3 state.
>>> > (XEN) mce_intel.c:1239: MCA Capability: BCAST 1 SER 0 CMCI 1 firstbank 0 extended MCE MSR 0
>>> > (XEN) CPU0 CMCI LVT vector (0xf1) already installed
>>> > (XEN) Finishing wakeup from ACPI S3 state.
>>> > (XEN) microcode: collect_cpu_info : sig=0x306a4, pf=0x2, rev=0x7
>>> > (XEN) Enabling non-boot CPUs  ...
>>> > (XEN) microcode: collect_cpu_info : sig=0x306a4, pf=0x2, rev=0x7
>>> > (XEN) microcode: collect_cpu_info : sig=0x306a4, pf=0x2, rev=0x7
>>> > (XEN) microcode: collect_cpu_info : sig=0x306a4, pf=0x2, rev=0x7
>>> > (XEN) microcode: collect_cpu_info : sig=0x306a4, pf=0x2, rev=0x7
>>> > (XEN) microcode: collect_cpu_info : sig=0x306a4, pf=0x2, rev=0x7
>>> > (XEN) microcode: collect_cpu_info : sig=0x306a4, pf=0x2, rev=0x7
>>> > (XEN) microcode: collect_cpu_info : sig=0x306a4, pf=0x2, rev=0x7
>>> > [   36.440696] [drm:pch_irq_handler] *ERROR* PCH poison interrupt
>>> > (XEN) Preparing system for ACPI S3 state.
>>> > (XEN) Disabling non-boot CPUs ...
>>> > (XEN) Entering ACPI S3 state.
>>> > (XEN) mce_intel.c:1239: MCA Capability: BCAST 1 SER 0 CMCI 1 firstbank 0 extended MCE MSR 0
>>> > (XEN) CPU0 CMCI LVT vector (0xf1) already installed
>>> > (XEN) Finishing wakeup from ACPI S3 state.
>>> > (XEN) microcode: collect_cpu_info : sig=0x306a4, pf=0x2, rev=0x7
>>> > (XEN) Enabling non-boot CPUs  ...
>>> > (XEN) microcode: collect_cpu_info : sig=0x306a4, pf=0x2, rev=0x7
>>> > (XEN) microcode: collect_cpu_info : sig=0x306a4, pf=0x2, rev=0x7
>>> > (XEN) microcode: collect_cpu_info : sig=0x306a4, pf=0x2, rev=0x7
>>> > (XEN) microcode: collect_cpu_info : sig=0x306a4, pf=0x2, rev=0x7
>>> > (XEN) microcode: collect_cpu_info : sig=0x306a4, pf=0x2, rev=0x7
>>> > (XEN) microcode: collect_cpu_info : sig=0x306a4, pf=0x2, rev=0x7
>>> > (XEN) microcode: collect_cpu_info : sig=0x306a4, pf=0x2, rev=0x7
>>> > [   65.893235] [drm:pch_irq_handler] *ERROR* PCH poison interrupt
>>> > [   66.508829] ata3.00: revalidation failed (errno=-5)
>>> > [   66.508861] ata1.00: revalidation failed (errno=-5)
>>> > [   76.858815] ata3.00: revalidation failed (errno=-5)
>>> > [   76.898807] ata1.00: revalidation failed (errno=-5)
>>> > [  107.208817] ata3.00: revalidation failed (errno=-5)
>>> > [  107.288807] ata1.00: revalidation failed (errno=-5)
>>> > [  107.718866] pm_op(): scsi_bus_resume_common+0x0/0x60 returns 262144
>>> > [  107.718877] PM: Device 0:0:0:0 failed to resume async: error 262144
>>> > [  107.718913] end_request: I/O error, dev sda, sector 35193296
>>> > [  107.718919] Buffer I/O error on device dm-5, logical block 7690
>>> > [  107.718947] end_request: I/O error, dev sda, sector 35657184
>>> > [  107.718965] end_request: I/O error, dev sda, sector 246202760
>>> > [  107.718968] Buffer I/O error on device dm-6, logical block 26252801
>>> > [  107.718995] end_request: I/O error, dev sda, sector 254548368
>>> > [  107.719009] Aborting journal on device dm-6-8.
>>> > [  107.719021] end_request: I/O error, dev sda, sector 35164192
>>> > [  107.719023] Buffer I/O error on device dm-5, logical block 4052
>>> > [  107.719063] Aborting journal on device dm-5-8.
>>> > [  107.719085] end_request: I/O error, dev sda, sector 254546304
>>> > [  107.719097] Buffer I/O error on device dm-6, logical block 27295744
>>> > [  107.719129] JBD2: I/O error detected when updating journal superblock for dm-6-8.
>>> > [  107.719141] end_request: I/O error, dev sda, sector 35656064
>>> > [  107.719146] Buffer I/O error on device dm-5, logical block 65536
>>> > [  107.719168] JBD2: I/O error detected when updating journal superblock for dm-5-8.
>>> > [  107.870082] end_request: I/O error, dev sda, sector 35131776
>>> > [  107.875825] Buffer I/O error on device dm-5, logical block 0
>>> > [  107.881805] end_request: I/O error, dev sda, sector 35131776
>>> > [  107.887637] Buffer I/O error on device dm-5, logical block 0
>>> > [  107.893573] EXT4-fs error (device dm-5): ext4_journal_start_sb:327:
>>> > [  107.893579] EXT4-fs (dm-5): I/O error while writing superblock
>>> > [  107.893582] EXT4-fs error (device dm-5): ext4_journal_start_sb:327: Detected aborted journal
>>> > [  107.893584] EXT4-fs (dm-5): Remounting filesystem read-only
>>> > [  107.893617] end_request: I/O error, dev sda, sector 35131776
>>> > [  107.893620] Buffer I/O error on device dm-5, logical block 0
>>> > [  107.893749] end_request: I/O error, dev sda, sector 36180352
>>> > [  107.893752] Buffer I/O error on device dm-6, logical block 0
>>> > [  107.893762] EXT4-fs error (device dm-6): ext4_journal_start_sb:327: Detected aborted journal
>>> > [  107.893765] EXT4-fs (dm-6): Remounting filesystem read-only
>>> > [  107.893766] EXT4-fs (dm-6): previous I/O error to superblock detected
>>> > [  107.893784] end_request: I/O error, dev sda, sector 36180352
>>> > [  107.893787] Buffer I/O error on device dm-6, logical block 0
>>> > [  107.894467] EXT4-fs error (device dm-5): ext4_journal_start_sb:327: Detected aborted journal
>>> > [  108.669763] end_request: I/O error, dev sda, sector 25957784
>>> > [  108.675555] Aborting journal on device dm-3-8.
>>> > [  108.680246] end_request: I/O error, dev sda, sector 25956736
>>> > [  108.686099] JBD2: I/O error detected when updating journal superblock for dm-3-8.
>>> > [  108.693908] journal commit I/O error
>>> > [  108.755829] end_request: I/O error, dev sda, sector 17305984
>>> > [  108.761600] EXT4-fs error (device dm-3): ext4_journal_start_sb:327: Detected aborted journal
>>> > [  108.770340] EXT4-fs (dm-3): Remounting filesystem read-only
>>> > [  108.776159] EXT4-fs (dm-3): previous I/O error to superblock detected
>>> > [  108.782904] end_request: I/O error, dev sda, sector 17305984
>>> > [  109.660011] end_request: I/O error, dev sda, sector 358788
>>> > [  109.665572] Buffer I/O error on device dm-1, logical block 46082
>>> > [  109.682479] end_request: I/O error, dev sda, sector 18832256
>>> > [  109.688246] end_request: I/O error, dev sda, sector 18832256
>>> > [  109.709559] end_request: I/O error, dev sda, sector 357762
>>> > [  109.715120] Buffer I/O error on device dm-1, logical block 45569
>>> > [  109.721506] end_request: I/O error, dev sda, sector 358790
>>> > [  109.727114] Buffer I/O error on device dm-1, logical block 46083
>>> > [  109.743714] end_request: I/O error, dev sda, sector 18832256
>>> > [  109.755555] end_request: I/O error, dev sda, sector 18832256
>>> > [  109.886187] end_request: I/O error, dev sda, sector 357764
>>> > [  109.891756] Buffer I/O error on device dm-1, logical block 45570
>>> > [  109.908344] end_request: I/O error, dev sda, sector 18832256
>>> > [  109.928369] end_request: I/O error, dev sda, sector 349574
>>> > [  109.933938] Buffer I/O error on device dm-1, logical block 41475
>>> > [  109.950336] end_request: I/O error, dev sda, sector 18832256
>>> > [  115.378875] end_request: I/O error, dev sda, sector 365000
>>> > [  115.384445] Aborting journal on device dm-1-8.
>>> > [  115.389120] end_request: I/O error, dev sda, sector 364930
>>> > [  115.394798] Buffer I/O error on device dm-1, logical block 49153
>>> > [  115.401101] JBD2: I/O error detected when updating journal superblock for dm-1-8.
>>> > [  207.207426] end_request: I/O error, dev sda, sector 246192376
>>> > [  207.213313] end_request: I/O error, dev sda, sector 246192376
>>> > [  207.903181] end_request: I/O error, dev sda, sector 246192376
>>> > [  209.234399] end_request: I/O error, dev sda, sector 18518400
>>> > [  209.240221] end_request: I/O error, dev sda, sector 18518400
>>>
>>> _______________________________________________
>>> Xen-devel mailing list
>>> Xen-devel@xxxxxxxxxxxxx
>>> http://lists.xen.org/xen-devel

Attachment: xen-dump-bad.txt
Description: Text document

Attachment: syslog-bad.txt
Description: Text document

Attachment: xen-dump-good.txt
Description: Text document

Attachment: syslog-good.txt
Description: Text document

