Xen project Mailing List

S3 resume issue in cpufreq -> get_cpu_idle_time->vcpu_runstate_get

From: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>

Date: Tue, 17 Aug 2021 16:50:51 +0200

Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

Delivery-date: Tue, 17 Aug 2021 14:51:10 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Tue, Aug 17, 2021 at 04:04:24PM +0200, Jan Beulich wrote: > On 17.08.2021 15:48, Marek Marczykowski-Górecki wrote: > > On Tue, Aug 17, 2021 at 02:29:20PM +0100, Andrew Cooper wrote: > >> On 17/08/2021 14:21, Jan Beulich wrote: > >>> On 17.08.2021 15:06, Andrew Cooper wrote: > >>>> Perhaps we want the cpu_down() logic to explicitly invalidate their > >>>> lazily cached values? > >>> I'd rather do this on the cpu_up() path (no point clobbering what may > >>> get further clobbered, and then perhaps not to a value of our liking), > >>> yet then we can really avoid doing this from a notifier and instead do > >>> it early enough in xstate_init() (taking care of XSS at the same time). > > > > Funny you mention notifiers. Apparently cpufreq driver does use it to > > initialize things. And fails to do so: > > > > (XEN) Finishing wakeup from ACPI S3 state. > > (XEN) CPU0: xstate: size: 0x440 (uncompressed 0x440) and states: 0x1f > > (XEN) Enabling non-boot CPUs ... > > (XEN) CPU1: xstate: size: 0x440 (uncompressed 0x440) and states: 0x1f > > (XEN) ----[ Xen-4.16-unstable x86_64 debug=y Not tainted ]---- > > (XEN) CPU: 0 > > (XEN) RIP: e008:[<ffff82d04024ad2b>] vcpu_runstate_get+0x153/0x244 > > (XEN) RFLAGS: 0000000000010246 CONTEXT: hypervisor > > (XEN) rax: 0000000000000000 rbx: ffff830049667c50 rcx: 0000000000000001 > > (XEN) rdx: 000000321d74d000 rsi: ffff830049667c50 rdi: ffff83025dcc0000 > > (XEN) rbp: ffff830049667c40 rsp: ffff830049667c10 r8: ffff83020511a820 > > (XEN) r9: ffff82d04057ef78 r10: 0180000000000000 r11: 8000000000000000 > > (XEN) r12: ffff83025dcc0000 r13: ffff830205118c60 r14: 0000000000000001 > > (XEN) r15: 0000000000000010 cr0: 000000008005003b cr4: 00000000003526e0 > > (XEN) cr3: 0000000049656000 cr2: 0000000000000028 > > (XEN) fsb: 0000000000000000 gsb: 0000000000000000 gss: 0000000000000000 > > (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008 > > (XEN) Xen code around <ffff82d04024ad2b> (vcpu_runstate_get+0x153/0x244): > > (XEN) 48 8b 14 ca 48 8b 04 02 <4c> 8b 70 28 e9 01 ff ff ff 4c 8d 3d dd 64 > > 32 00 > > (XEN) Xen stack trace from rsp=ffff830049667c10: > > (XEN) 0000000000000180 ffff83025dcbd410 ffff83020511bf30 ffff830205118c60 > > (XEN) 0000000000000001 0000000000000010 ffff830049667c80 ffff82d04024ae73 > > (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > > (XEN) 0000000000000000 0000000000000000 ffff830049667cb8 ffff82d0402560a9 > > (XEN) ffff830205118320 0000000000000001 ffff83020511bf30 ffff83025dc7a6f0 > > (XEN) 0000000000000000 ffff830049667d58 ffff82d040254cb1 00000001402e9f74 > > (XEN) 0000000000000000 ffff830049667d10 ffff82d040224eda 000000000025dc81 > > (XEN) 000000321d74d000 ffff82d040571278 0000000000000001 ffff830049667d28 > > (XEN) ffff82d040228b44 ffff82d0403102cf 0000000000000000 ffff82d0402283a4 > > (XEN) ffff82d040459688 ffff82d040459680 ffff82d040459240 0000000000000004 > > (XEN) 0000000000000000 ffff830049667d68 ffff82d04025510e ffff830049667db0 > > (XEN) ffff82d040221ba4 0000000000000000 0000000000000001 0000000000000001 > > (XEN) 0000000000000000 ffff830049667e00 0000000000000001 ffff82d04058a5c0 > > (XEN) ffff830049667dc8 ffff82d040203867 0000000000000001 ffff830049667df0 > > (XEN) ffff82d040203c51 ffff82d040459400 0000000000000001 0000000000000010 > > (XEN) ffff830049667e20 ffff82d040203e26 ffff830049667ef8 0000000000000000 > > (XEN) 0000000000000003 0000000000000200 ffff830049667e50 ffff82d040270bac > > (XEN) ffff83020116a640 ffff830258ff6000 0000000000000000 0000000000000000 > > (XEN) ffff830049667e70 ffff82d0402056aa ffff830258ff61b8 ffff82d0405701b0 > > (XEN) ffff830049667e88 ffff82d04022963c ffff82d0405701a0 ffff830049667eb8 > > (XEN) Xen call trace: > > (XEN) [<ffff82d04024ad2b>] R vcpu_runstate_get+0x153/0x244 This is xen/common/sched/core.c:322. get_sched_res(v->processor) is NULL at this point for CPU1. The only place that can calls set_sched_res() and doesn't expect the previous value to be valid, is cpu_schedule_up(). For non-BSP its called _only_ from notifier at CPU_UP_PREPARE (cpu_schedule_callback()), but that notifier explicitly exclude suspend/resume case: static int cpu_schedule_callback( struct notifier_block *nfb, unsigned long action, void *hcpu) { unsigned int cpu = (unsigned long)hcpu; int rc = 0; /* * All scheduler related suspend/resume handling needed is done in * cpupool.c. */ if ( system_state > SYS_STATE_active ) return NOTIFY_DONE; But, nothing in cpupool.c is calling into set_sched_res(). On the other hand, sched_rm_cpu() (which I believe is called as part of parking the CPU) calls cpu_schedule_down(), which then calls set_sched_res(cpu, NULL). In short: scheduler for parked CPUs is not re-initialized during resume. But cpufreq expects it to be... > > (XEN) [<ffff82d04024ae73>] F get_cpu_idle_time+0x57/0x59 > > (XEN) [<ffff82d0402560a9>] F cpufreq_statistic_init+0x191/0x210 > > (XEN) [<ffff82d040254cb1>] F cpufreq_add_cpu+0x3cc/0x5bb > > (XEN) [<ffff82d04025510e>] F cpufreq.c#cpu_callback+0x27/0x32 > > (XEN) [<ffff82d040221ba4>] F notifier_call_chain+0x6c/0x96 > > (XEN) [<ffff82d040203867>] F cpu.c#cpu_notifier_call_chain+0x1b/0x36 > > (XEN) [<ffff82d040203c51>] F cpu_up+0xaf/0xc8 > > (XEN) [<ffff82d040203e26>] F enable_nonboot_cpus+0x6b/0x1f8 > > (XEN) [<ffff82d040270bac>] F power.c#enter_state_helper+0x152/0x60a > > (XEN) [<ffff82d0402056aa>] F > > domain.c#continue_hypercall_tasklet_handler+0x4c/0xb9 > > (XEN) [<ffff82d04022963c>] F tasklet.c#do_tasklet_work+0x76/0xac > > (XEN) [<ffff82d040229920>] F do_tasklet+0x58/0x8a > > (XEN) [<ffff82d0402e6607>] F domain.c#idle_loop+0x74/0xdd > > (XEN) > > (XEN) Pagetable walk from 0000000000000028: > > (XEN) L4[0x000] = 000000025dce1063 ffffffffffffffff > > (XEN) L3[0x000] = 000000025dce0063 ffffffffffffffff > > (XEN) L2[0x000] = 000000025dcdf063 ffffffffffffffff > > (XEN) L1[0x000] = 0000000000000000 ffffffffffffffff > > (XEN) > > (XEN) **************************************** > > (XEN) Panic on CPU 0: > > (XEN) FATAL PAGE FAULT > > (XEN) [error_code=0000] > > (XEN) Faulting linear address: 0000000000000028 > > (XEN) **************************************** > > > > This is after adding brutal `this_cpu(xcr0) = 0` in xstate_init(). > > And presumably again only with "smt=0"? Yes. With smt=1 suspend works fine on this particular machine. At least with only dom0 running (haven't tried with any domU yet)... > In any event, for us to not mix > things, may I ask that you start a new thread for this further issue? Sure. -- Best Regards, Marek Marczykowski-Górecki Invisible Things Lab

Attachment: signature.asc
Description: PGP signature

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.