
Re: [Xen-devel] resume from S3 sleep not working in Dom0 - Xen4.2.1



Thank you for your help, guys. I tried to apply those patches, but I didn't succeed.
Konrad's acpi-s3 patch worked (in the sense that I got no errors while patching), but Ben's patches did not. Arch Linux uses a nearly vanilla kernel with only 3 patches applied (an upstream patch, a default console loglevel patch and a FAT issue patch), and it seems I'm missing a whole lot of things there.
For instance, 'fix-dmar-zap-reinstate.txt' requires a /xen/drivers/ folder, but it is missing from my kernel source (there is a xen folder, but no drivers inside it).
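
One way to see which source tree a patch expects is to list the paths it touches and compare them with the tree at hand. The following is only a minimal sketch in Python, assuming the patch is a unified diff with `+++` headers and is meant to be applied with `-p1`; the function names `patched_paths` and `missing_dirs` are made up for illustration.

```python
import os

def patched_paths(patch_text, strip=1):
    """Return the file paths named in the '+++' headers of a unified diff."""
    paths = []
    for line in patch_text.splitlines():
        # Simplification: any line starting with '+++ ' is treated as a header.
        if not line.startswith("+++ "):
            continue
        # A header looks like '+++ b/some/dir/file.c<TAB>timestamp'.
        target = line[4:].split("\t")[0].strip()
        if target == "/dev/null":
            continue  # the file is being deleted, nothing to look up
        parts = target.split("/")
        if len(parts) > strip:
            # Drop the leading components, like patch -pN does.
            paths.append("/".join(parts[strip:]))
    return paths

def missing_dirs(patch_text, tree_root, strip=1):
    """Return directories the patch refers to that are absent under tree_root."""
    missing = set()
    for path in patched_paths(patch_text, strip):
        directory = os.path.dirname(path)
        if directory and not os.path.isdir(os.path.join(tree_root, directory)):
            missing.add(directory)
    return sorted(missing)
```

Calling `missing_dirs(open("fix-dmar-zap-reinstate.txt").read(), "/path/to/source")` (the source path is a placeholder) lists the directories the patch names that the tree lacks. A patch that creates files in a brand-new directory will also show up here, so a non-empty result is a hint that the tree may be the wrong one, not proof.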

I'm not familiar with custom kernel building (this was my first time), but I would guess I'm missing some Xen patches that should be applied prior to Ben's.
Could you please give me a hint on how I can proceed, or tell me which kernel you used?
Thank you very much for your time.


2013/2/5 Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
On Mon, Feb 04, 2013 at 11:07:23AM +0100, Tomasz Wroblewski wrote:
>
> >>fix-suspend-scheduler-v2
> >>fix-suspend-scheduler-revert-affinity-part
> >>s3-timerirq
> >>
> >>All of these fixes have been proposed to the xen-devel list, but have
> >>not yet been accepted, for one reason, or another.
> >And I don't think comments on them have seen follow-ups.
> >
> >Jan
> >
> I guess it's worth bringing this up again;
>
> s3-timerirq: this was an empirical hack which for some reason is needed
> on the stable 4.2 we use, but not on latest unstable; I didn't really
> investigate further since it appeared fixed later on anyway.
>
> fix-suspend-scheduler/revert-affinity: the big objection here was
> the part which reverts one of the hunks in Keir's commit. I tried
> for quite a few days to find a working fix which does not do this
> revert using posted suggestions, but was not successful:
>
> - there was a crash in xen scheduler, which was fixable using your
> suggestion of masking softirqs during s3 (ugly)
> - there was also a crash in xen acpi cpufreq driver, which was
> similarly fixable using a bandaid s3 condition (ugly)
> - unfortunately this turned out to not be all, xen did not crash
> anymore at this point but dom0 kernel did around the time it enables
> cpus, in multiple places: at this point I didn't have a good
> explanation for it, my opinion of aggravating hunk was rather low,
> so I uttered a hearty curse and stuck a revert into private
> patchqueue.
>
> The dom0 kernel crashes were as follows:
>
> 1)
>
> [   60.657751] Enabling non-boot CPUs ...
> [   60.657958] installing Xen timer for CPU 1
> [   60.657987] cpu 1 spinlock event irq 279
> [   60.658101] Disabled fast string operations
> [   60.658466] CPU1 is up
> [   60.658736] installing Xen timer for CPU 2
> [   60.658784] cpu 2 spinlock event irq 285
> [   60.659764] Disabled fast string operations
> [   60.661811] BUG: unable to handle kernel NULL pointer dereference
> at 0000000000000018
> [   60.661817] IP: [<ffffffff8105f700>]
> build_sched_domains+0x770/0(XEN) *** Serial input -> Xen (type
> 'CTRL-a' three times to switch input to DOM0)
>
>
>
>
> 2)
> .332997] installing Xen timer for CPU 2emory
> [   36.333061] cpu 2 spinlock event irq 285
> [   36.333343] Disabled fast string operations
> [   36.334939] CPU2 is up
> [   36.335213] installing Xen timer for CPU 3
> [   36.335244] cpu 3 spinlock event irq 291
> [   36.335561] Disabled fast string operations
> [   36.337461] CPU3 is up
> [   36.339513] ACPI: Waking up from system sleep state S3
> [   36.350193] BUG: unable to handle kernel NULL pointer dereference
> at 0000000000000004
> [   36.350211] IP: [<ffffffff81055f9a>] find_busiest_group+0x38a/0xbb0
> [   36.350236] PGD 2f19067 PUD 2ec7067 PMD 0
> [   36.350252] Oops: 0000 [#1] SMP
> [   36.350263] CPU 1
> [   36.350267] Modules linked in: xt_mac ipt_MASQUERADE
> ebtable_filter ebtables iscsi_scst(O) xt_tcpudp scst_vdisk(O)
> xt_state crc32c xt_multiport libcrc32c iptable_filter iptable_nat
> nf_nat nf_conntrack_ipv4 nf_conntrack scst_cdrom(O) nf_defrag_ipv4
> ip_tables scst(O) x_tables bridge stp llc nls_cp437 isofs zram(C)
> snd_hda_codec_hdmi snd_hda_codec_conexant microcode arc4 psmouse
> serio_raw i915 drm_kms_helper drm iwlwifi(O) mac80211(O) cfg80211(O)
> thinkpad_acpi nvram snd_hda_intel snd_hda_codec snd_hwdep snd_pcm
> snd_timer snd soundcore snd_page_alloc i2c_algo_bit intel_agp video
> intel_gtt tpm_tis tpm tpm_bios sdhci_pci sdhci ehci_hcd e1000e
> [   36.350437]
> [   36.350445] Pid: 2730, comm: bash Tainted: G         C O
> 3.2.23-orc #19 LENOVO 42404EU/42404EU
> [   36.350463] RIP: e030:[<ffffffff81055f9a>]  [<ffffffff81055f9a>]
> find_busiest_group+0x38a/0xbb0
> [   36.350481] RSP: e02b:ffff880002b71228  EFLAGS: 00010046
> [   36.350490] RAX: 0000000000000040 RBX: 0000000000000000 RCX:
> 0000000000000000
> [   36.350500] RDX: 0000000000000000 RSI: 0000000000000040 RDI:
> 0000000000000000
> [   36.350510] RBP: ffff880002b713b8 R08: ffff880026109f00 R09:
> 0000000000000000
> [   36.350519] R10: 0000000000000000 R11: 0000000000000001 R12:
> 0000000000000000
> [   36.350529] R13: ffff880026109f80 R14: ffffffffffffffff R15:
> ffff880026109f98
> [   36.350547] FS:  00007fc41e295700(0000) GS:ffff88002dc40000(0000)
> knlGS:0000000000000000
> [   36.350558] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> [   36.350566] CR2: 0000000000000004 CR3: 0000000026329000 CR4:
> 0000000000002660
> [   36.350577] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [   36.350587] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
> 0000000000000400
> [   36.350598] Process bash (pid: 2730, threadinfo ffff880002b70000,
> task ffff880027a7db40)
> [   36.350608] Stack:
> [   36.350613]  00ffffff00000002 0000000300000001 ffff880002b71498
> ffff880002b71534
> [   36.350630]  00ffffff00000002 0000000100000001 ffff8800262cf000
> 0000000000000008
> [   36.350646]  ffffffff00000000 0000000000000000 0000000000000000
> ffff88002dc4e2c8
> [   36.350662] Call Trace:
> [   36.350677]  [<ffffffff8105b158>] load_balance+0xb8/0x840
> [   36.350690]  [<ffffffff8101b909>] ? sched_clock+0x9/0x10
> [   36.350706]  [<ffffffff8108ccad>] ? sched_clock_cpu+0xbd/0x110
> [   36.350718]  [<ffffffff81052b1c>] ? update_shares+0xcc/0x100
> [   36.350735]  [<ffffffff8157b9b5>] __schedule+0x875/0x8d0
> [   36.350749]  [<ffffffff81073ae2>] ? try_to_del_timer_sync+0x92/0x130
> [   36.350762]  [<ffffffff8157bd3f>] schedule+0x3f/0x60
> [   36.350773]  [<ffffffff8157c24d>] schedule_timeout+0x16d/0x320
> [   36.350786]  [<ffffffff810728e0>] ? usleep_range+0x50/0x50
> [   36.350800]  [<ffffffff8157de2e>] ? _raw_spin_unlock_irqrestore+0x1e/0x30
> [   36.350817]  [<ffffffff8130c340>]
> acpi_ec_transaction_unlocked+0x134/0x1d8
> [   36.350830]  [<ffffffff81086b90>] ? add_wait_queue+0x60/0x60
> [   36.350842]  [<ffffffff8130c6c6>] acpi_ec_transaction+0x196/0x239
> [   36.350856]  [<ffffffff8157de2e>] ? _raw_spin_unlock_irqrestore+0x1e/0x30
> [   36.350869]  [<ffffffff8130c8a0>] acpi_ec_write+0x40/0x42
> [   36.350881]  [<ffffffff8130c9a8>] acpi_ec_space_handler+0x9e/0xfc
> [   36.350894]  [<ffffffff8130c90a>] ? acpi_ec_burst_disable+0x3d/0x3d
> [   36.350909]  [<ffffffff813159c6>]
> acpi_ev_address_space_dispatch+0x179/0x1c8
> [   36.350924]  [<ffffffff8131aafe>] acpi_ex_access_region+0x23e/0x24b
> [   36.350936]  [<ffffffff8106e82c>] ? __sysctl_head_next+0x11c/0x130
> [   36.350951]  [<ffffffff8131ae15>] acpi_ex_field_datum_io+0xf9/0x17a
> [   36.350965]  [<ffffffff8131b148>]
> acpi_ex_write_with_update_rule+0xb5/0xc1
> [   36.350989]  [<ffffffff8131acfa>] acpi_ex_insert_into_field+0x1ef/0x211
> [   36.351003]  [<ffffffff8132b5a7>] ?
> acpi_ut_allocate_object_desc_dbg+0x45/0x7f
> [   36.351018]  [<ffffffff8131980e>] acpi_ex_write_data_to_field+0x194/0x1c2
> [   36.351031]  [<ffffffff813131e4>] ?
> acpi_ds_init_object_from_op+0x137/0x231
> [   36.351044]  [<ffffffff8131d94f>] acpi_ex_store_object_to_node+0xa3/0xe2
> [   36.351056]  [<ffffffff8131da51>] acpi_ex_store+0xc3/0x256
> [   36.351066]  [<ffffffff8131b62b>] acpi_ex_opcode_1A_1T_1R+0x353/0x4a5
> [   36.351078]  [<ffffffff8131260c>] acpi_ds_exec_end_op+0xf7/0x3e7
> [   36.351092]  [<ffffffff81325ae7>] acpi_ps_parse_loop+0x7bd/0x94e
> [   36.351105]  [<ffffffff81324ed9>] acpi_ps_parse_aml+0x96/0x275
> [   36.351119]  [<ffffffff81326394>] acpi_ps_execute_method+0x1ce/0x276
> [   36.351131]  [<ffffffff8132165b>] acpi_ns_evaluate+0xdf/0x1aa
> [   36.351144]  [<ffffffff81320c9d>] acpi_evaluate_object+0xfb/0x1f4
> [   36.351156]  [<ffffffff8130f8ee>] acpi_device_sleep_wake+0x95/0xc7
> [   36.351168]  [<ffffffff8130fa60>]
> acpi_disable_wakeup_device_power+0x6e/0xc9
> [   36.351182]  [<ffffffff813085e2>] acpi_disable_wakeup_devices+0x7b/0x95
> [   36.351194]  [<ffffffff81308710>] acpi_pm_finish+0x39/0x55
> [   36.351208]  [<ffffffff810a6034>] suspend_devices_and_enter+0x104/0x310
> [   36.351222]  [<ffffffff810a63a7>] enter_state+0x167/0x190
> [   36.351234]  [<ffffffff810a4d27>] state_store+0xb7/0x130
> [   36.351246]  [<ffffffff812b54df>] kobj_attr_store+0xf/0x30
> [   36.351260]  [<ffffffff811d382f>] sysfs_write_file+0xef/0x170
> [   36.351274]  [<ffffffff811668d3>] vfs_write+0xb3/0x180
> [   36.351286]  [<ffffffff81166bfa>] sys_write+0x4a/0x90
> [   36.351300]  [<ffffffff81585d02>] system_call_fastpath+0x16/0x1b
> [   36.351308] Code: ff 48 8b bd a0 fe ff ff 44 88 85 78 fe ff ff e8
> 5d fb ff ff 44 0f b6 85 78 fe ff ff 0f 1f 44 00 00 49 8b 7d 10 4c 8b
> 4d 98 31 d2 <8b> 4f 04 4c 89 c8 48 c1 e0 0a 48 f7 f1 48 8b 4d a0 48
> 85 c9 48
> [   36.351435] RIP  [<ffffffff81055f9a>] find_busiest_group+0x38a/0xbb0
> [   36.351450]  RSP <ffff880002b71228>
> [   36.351456] CR2: 0000000000000004
> [   36.351465] ---[ end trace 5ad2b14b3a9050ae ]---
> [   36.352362] BUG: unable to handle kernel NULL pointer dereference
> at 0000000000000010
> [   36.352379] IP: [<ffffffff812ba531>] rb_next+0x1/0x50
> [   36.352394] PGD 0
> [   36.352402] Oops: 0000 [#2] SMP
> [   36.352411] CPU 1
> [   36.352416] Modules linked in: xt_mac ipt_MASQUERADE
> ebtable_filter ebtables iscsi_scst(O) xt_tcpudp scst_vdisk(O)
> xt_state crc32c xt_multiport libcrc32c iptable_filter iptable_nat
> nf_nat nf_conntrack_ipv4 nf_conntrack scst_cdrom(O) nf_defrag_ipv4
> ip_tables scst(O) x_tables bridge stp llc nls_cp437 isofs zram(C)
> snd_hda_codec_hdmi snd_hda_codec_conexant microcode arc4 psmouse
> serio_raw i915 drm_kms_helper drm iwlwifi(O) mac80211(O) cfg80211(O)
> thinkpad_acpi nvram snd_hda_intel snd_hda_codec snd_hwdep snd_pcm
> snd_timer snd soundcore snd_page_alloc i2c_algo_bit intel_agp video
> intel_gtt tpm_tis tpm tpm_bios sdhci_pci sdhci ehci_hcd e1000e
> [   36.352573]
> [   36.352580] Pid: 2730, comm: bash Tainted: G      D  C O
> 3.2.23-orc #19 LENOVO 42404EU/42404EU
> [   36.352596] RIP: e030:[<ffffffff812ba531>]  [<fffffff
>
>
>
>
> 3)
>
> [   47.833362] Resuming Xen processor info
> (XEN) microcode: collect_cpu_info : sig=0x206a6, pf=0x10, rev=0x28
> (XEN) microcode: collect_cpu_info : sig=0x206a6, pf=0x10, rev=0x28
> (XEN) microcode: collect_cpu_info : sig=0x206a6, pf=0x10, rev=0x28
> (XEN) microcode: collect_cpu_info : sig=0x206a6, pf=0x10, rev=0x28
> (XEN) microcode: collect_cpu_info : sig=0x206a6, pf=0x10, rev=0x28
> (XEN) microcode: collect_cpu_info : sig=0x206a6, pf=0x10, rev=0x28
> (XEN) microcode: collect_cpu_info : sig=0x206a6, pf=0x10, rev=0x28
> (XEN) microcode: collect_cpu_info : sig=0x206a6, pf=0x10, rev=0x28
> [   47.886297] Enabling non-boot CPUs ...
> [   47.890082] installing Xen timer for CPU 1
> [   47.894257] cpu 1 spinlock event irq 48
> [   47.899013] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
> [   47.906740] IP: [<ffffffff8149196b>] __cpuidle_register_device+0x2b/0x100
> [   47.913578] PGD 34a4067 PUD 3ac3067 PMD 0
> [   47.917825] Oops: 0000 [#1] SMP
> [   47.921108] Modules linked in: ipt_MASQUERADE ebtable_filter ebtables iscsi_scst(O) xt_tcpudp xt_state xt_multiport iptable_filter scst_vdisk(O) iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack scst_cdrom(O) ip_tables scst(O) x_tables nls_cp437 isofs bridge stp llc zram(C) zsmalloc(C) hid_generic usbhid hid coretemp crc32c_intel ghash_clmulni_intel aesni_intel ablk_helper cryptd lrw aes_x86_64 xts gf128mul microcode psmouse serio_raw arc4 iwldvm mac80211 i915 drm_kms_helper drm iwlwifi intel_agp i2c_algo_bit cfg80211 intel_gtt video ahci libahci e1000e [last unloaded: tpm_bios]
> [   47.974636] CPU 0
> [   47.976456] Pid: 2468, comm: pm-suspend Tainted: G         C O 3.8.0-orc #19 Intel Corporation SandyBridge Platform/Emerald Lake
> [   47.988310] RIP: e030:[<ffffffff8149196b>]  [<ffffffff8149196b>] __cpuidle_register_device+0x2b/0x100
> [   47.997605] RSP: e02b:ffff880025685c98  EFLAGS: 00010286
> [   48.002970] RAX: 0000000000000000 RBX: ffff88002de40000 RCX: 0000000000000000
> [   48.010154] RDX: ffff880025685fd8 RSI: 0000000000000007 RDI: ffff88002de40000
> [   48.017336] RBP: ffff880025685cb8 R08: 0000000000021120 R09: 0000000000000000
> [   48.024520] R10: 0000000000000030 R11: 0000000000000000 R12: ffff88002de40000
> [   48.031742] R13: 00000000ffffffde R14: 00000000ffffffea R15: 0000000000000000
> [   48.038927] FS:  00007fb599d0e700(0000) GS:ffff88002de00000(0000) knlGS:0000000000000000
> [   48.047060] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> [   48.052859] CR2: 0000000000000008 CR3: 000000000345b000 CR4: 0000000000002660
> [   48.060043] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [   48.067223] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [   48.074450] Process pm-suspend (pid: 2468, threadinfo ffff880025684000, task ffff880003558000)
> [   48.083102] Stack:
> [   48.085179]  ffff88002de40000 ffff88002de40000 00000000ffffffde ffffffff81a6b480
> [   48.092622]  ffff880025685cd8 ffffffff81491cc1 0000000000000001 ffff88002de40000
> [   48.100064]  ffff880025685cf8 ffffffff813046df 0000000000000001 0000000000000001
> [   48.107517] Call Trace:
> [   48.110029]  [<ffffffff81491cc1>] cpuidle_register_device+0x31/0x80
> [   48.116348]  [<ffffffff813046df>] intel_idle_cpu_init+0xbf/0x120
> [   48.122423]  [<ffffffff813047b0>] cpu_hotplug_notify+0x70/0x80
> [   48.128310]  [<ffffffff815a619d>] notifier_call_chain+0x4d/0x70
> [   48.134281]  [<ffffffff8107969e>] __raw_notifier_call_chain+0xe/0x10
> [   48.140686]  [<ffffffff81053bb0>] __cpu_notify+0x20/0x40
> [   48.146050]  [<ffffffff81594c7c>] _cpu_up+0xf1/0x138
> [   48.151070]  [<ffffffff8158ab39>] enable_nonboot_cpus+0x99/0xd0
> [   48.157090]  [<ffffffff81097b8d>] suspend_devices_and_enter+0x25d/0x330
> [   48.163752]  [<ffffffff81097def>] pm_suspend+0x18f/0x1f0
> [   48.169117]  [<ffffffff81096dea>] state_store+0x8a/0x100
> [   48.174483]  [<ffffffff812ac29f>] kobj_attr_store+0xf/0x30
> [   48.180022]  [<ffffffff811c005f>] sysfs_write_file+0xef/0x170
> [   48.185943]  [<ffffffff8115c253>] vfs_write+0xb3/0x180
> [   48.191056]  [<ffffffff8115c592>] sys_write+0x52/0xa0
> [   48.196160]  [<ffffffff815a614e>] ? do_page_fault+0xe/0x10
> [   48.201700]  [<ffffffff815aa7d9>] system_call_fastpath+0x16/0x1b
> [   48.207758] Code: 66 66 66 66 90 55 48 89 e5 48 83 ec 20 48 89 5d e0 4c 89 6d f0 48 89 fb 4c 89 75 f8 4c 89 65 e8 41 be ea ff ff ff e8 75 0a 00 00<48>  8b 78 08 49 89 c5 e8 19 80 c1 ff 84 c0 74 53 8b 43 04 49 c7
> [   48.226658] RIP  [<ffffffff8149196b>] __cpuidle_register_device+0x2b/0x100

Hm, that is suspect. There should not be any cpuidle_register? Perhaps
you are .. ah yes, you are hitting a bug that should be fixed in the
stable tree.

The git commits are b88a634a903d9670aa5f2f785aa890628ce0dece and
6f8c2e7933679f54b6478945dc72e59ef9a3d5e0
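
To check whether a local kernel checkout already contains those two commits, something along these lines could be used. It is a rough Python sketch, assuming a git clone of the kernel source and a `git` binary recent enough to support `-C`; the helper names `is_ancestor` and `missing_commits` are hypothetical.

```python
import subprocess

COMMITS = [
    "b88a634a903d9670aa5f2f785aa890628ce0dece",
    "6f8c2e7933679f54b6478945dc72e59ef9a3d5e0",
]

def is_ancestor(commit, repo=".", ref="HEAD", run=subprocess.run):
    """True if `commit` is reachable from `ref` in the git checkout at `repo`."""
    # `git merge-base --is-ancestor A B` exits with status 0 when A is an
    # ancestor of B, and non-zero otherwise (including when A is unknown).
    result = run(
        ["git", "-C", repo, "merge-base", "--is-ancestor", commit, ref],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0

def missing_commits(commits, repo=".", run=subprocess.run):
    """Return the commits from `commits` that are not reachable from HEAD."""
    return [c for c in commits if not is_ancestor(c, repo, run=run)]
```

Running `missing_commits(COMMITS, repo="/path/to/linux")` (the path is a placeholder) returns the hashes that are not in the history of the checked-out branch. Stable trees carry backports as separate commits with different hashes, so a hash reported as missing there does not by itself mean the fix is absent.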

> [   48.233582]  RSP<ffff880025685c98>
> [   48.237131] CR2: 0000000000000008
>
> [   48.240521] ---[ end trace 535ebe28cd06b143 ]---
>
>
>
