
Re: [Xen-devel] [BUG] unable to shutdown (page fault in mwait_idle()/do_dbs_timer()/__find_next_bit()) (fwd)



>>> On 09.01.18 at 23:51, <martin@xxxxxxxxx> wrote:
> Hello.

I'm sorry for taking so long to get back to this.

> On Tue, 9 Jan 2018, Jan Beulich wrote:
>>>>> On 08.01.18 at 17:07, <martin@xxxxxxxxx> wrote:
>>> On Mon, 8 Jan 2018, Jan Beulich wrote:
>>>>>>> On 07.01.18 at 13:34, <martin@xxxxxxxxx> wrote:
>>>>> (XEN) ----[ Xen-4.10.0-vgpu  x86_64  debug=n   Not tainted ]----
>>>>
>>>> The -vgpu tag makes me wonder whether you have any patches in
>>>> your tree on top of plain 4.10.0 (or 4.10-staging). Also the debug=n
>>>> above ...
>>>
>>> 4.10.0 + 11 patches to make nvidia/vgpu work
>>> (https://github.com/xenserver/xen-4.7.pg).
>>> debug=n because of xen's modified debug build process.
>>>
>>>>> (XEN)    [<ffff82d08026ae60>] __find_next_bit+0x10/0x80
>>>>> (XEN)    [<ffff82d080253180>] cpufreq_ondemand.c#do_dbs_timer+0x160/0x220
>>>>> (XEN)    [<ffff82d0802c7c0e>] mwait-idle.c#mwait_idle+0x23e/0x340
>>>>> (XEN)    [<ffff82d08026fa56>] domain.c#idle_loop+0x86/0xc0
>>>>
>>>> ... makes this call trace unreliable. But even with a reliable call
>>>> trace, analysis of the crash would be helped if you made
>>>> available the xen-syms (or xen.efi, depending on how you boot)
>>>> somewhere.
>>>
>>> xen-syms - http://www.uschovna.cz/en/zasilka/UDP5LVE2679CGBIS-4YV/ 
>>
>> Thanks. Looks to be a race between a timer in the governor and
>> the CPUs being brought down. In general the governor is supposed
>> to be disabled in the course of CPUs being brought down, so first
>> of all I wonder whether you have some daemon in use which
>> sends management requests to the CPUfreq driver in Xen. Such a
>> daemon should of course be disabled by the system shutdown
>> scripts. Otherwise please try the attached debugging patch -
>> maybe we can see something from its output.
> 
> I suppose nothing should be running any more, because the Dom0 kernel
> has already ended (see the last two messages from the dom0 kernel). Or
> how can I check it?
> 
> Patch applied.
> - no "dbs:" in output (grep "dbs:" ...)

Okay, this sufficiently proves that nothing tries to re-enable a
governor after it was stopped for a CPU being brought down.

> - examples of shutdown output (1x OK + 2x fail):
> 
> -----------------------------------------------------
> 
> [  632.439402] ACPI: Preparing to enter system sleep state S5
> [  632.486728] reboot: Power down
> (XEN) Preparing system for ACPI S5 state.
> (XEN) Disabling non-boot CPUs ...
> (XEN) cpufreq: del CPU1 (1,ffaaab,1,2)
> (XEN) Broke affinity for irq 140
> (XEN) cpufreq: del CPU2 (1,4,1,4)
> (XEN) Broke affinity for irq 139
> (XEN) cpufreq: del CPU3 (1,ffaaa9,1,8)
> (XEN) Broke affinity for irq 83
> (XEN) cpufreq: del CPU4 (1,10,1,10)
> (XEN) Broke affinity for irq 137
> (XEN) cpufreq: del CPU5 (1,ffaaa1,1,20)
> (XEN) cpufreq: del CPU6 (1,40,1,40)
> (XEN) Broke affinity for irq 141
> (XEN) cpufreq: del CPU7 (1,ffaa81,1,80)
> (XEN) cpufreq: del CPU8 (1,100,1,100)
> (XEN) cpufreq: del CPU9 (1,ffaa01,1,200)
> (XEN) cpufreq: del CPU10 (1,400,1,400)
> (XEN) cpufreq: del CPU11 (1,ffa801,1,800)
> (XEN) cpufreq: del CPU12 (1,1000,1,1000)
> (XEN) cpufreq: del CPU13 (1,ffa001,1,2000)
> (XEN) cpufreq: del CPU14 (1,4000,1,4000)
> (XEN) cpufreq: del CPU15 (1,ff8001,1,8000)
> (XEN) cpufreq: del CPU16 (1,ff0001,1,10000)
> (XEN) cpufreq: del CPU17 (1,fe0001,1,20000)
> (XEN) cpufreq: del CPU18 (1,fc0001,1,40000)
> (XEN) cpufreq: del CPU19 (1,f80001,1,80000)
> (XEN) cpufreq: del CPU20 (1,f00001,1,100000)
> (XEN) cpufreq: del CPU21 (1,e00001,1,200000)
> (XEN) cpufreq: del CPU22 (1,c00001,1,400000)
> (XEN) cpufreq: del CPU23 (1,800001,1,800000)
> (XEN) Broke affinity for irq 72
> (XEN) cpufreq: del CPU0 (1,1,1,1)
> (XEN) Entering ACPI S5 state.

So this, despite being the working example, already shows a very odd
arrangement: all odd-numbered CPUs _and_ all even-numbered ones from
16 up share a domain. The typical case would be for all sibling
threads and/or cores to share a domain, iirc.
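
Just for reference, this is how I'm decoding the four fields of the
debug printk ("(%u,%lx,%lu,%lx)" being hw_all, cpufreq_dom->map,
num_processors, and policy->cpus, as per the first hunk of the patch
below). A minimal stand-alone sketch, not Xen code:

/* Decode a cpumask word from the "cpufreq: del CPU..." lines above.
 * E.g. 0xffaaab expands to CPUs 0,1,3,5,7,9,11,13,15 and 16-23,
 * which is the odd arrangement just described. */
#include <stdio.h>

static void decode_map(unsigned long map)
{
    unsigned int cpu;

    printf("%#lx ->", map);
    for ( cpu = 0; cpu < 8 * sizeof(map); cpu++ )
        if ( map & (1UL << cpu) )
            printf(" %u", cpu);
    printf("\n");
}

int main(void)
{
    decode_map(0xffaaabUL); /* cpufreq_dom->map when CPU1 gets removed */
    decode_map(0x2UL);      /* policy->cpus for that same removal */
    return 0;
}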

> (XEN) cpufreq: del CPU23 (1,800001,1,800000)
> (XEN) ----[ Xen-4.10.0-vgpu  x86_64  debug=n   Not tainted ]----
> (XEN) CPU:    23

So at the time the debug message gets printed, policy->cpus is
still valid. With hw_all set, the next step is the call
__cpufreq_governor(policy, CPUFREQ_GOV_STOP), which
ought to lead to cpufreq_governor_dbs()'s
"case CPUFREQ_GOV_STOP", which in turn calls dbs_timer_exit().
That function kills the timer that causes the crash, and
kill_timer(), in turn, makes sure the timer being killed isn't
currently having its handler executed on any CPU. IOW I can't spot
any race, yet there quite obviously is one in your case. All I can
do at the moment is hand you an extended debugging patch.
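
To make that expected call chain explicit, here is a compilable
stand-alone stub of the stop path. Types and bodies are placeholders,
not the real Xen code; only the function names and the
CPUFREQ_GOV_STOP flow mirror the sources discussed above:

#include <stdio.h>

#define CPUFREQ_GOV_STOP 2

struct dbs_info { unsigned int cpu; int enable; };
static struct dbs_info this_dbs_info = { .cpu = 23, .enable = 1 };

/* Models kill_timer(): besides deactivating the timer, the real one
 * also waits until the timer's handler is no longer executing on any
 * CPU before returning - which is why I can't spot a window for the
 * race. */
static void kill_timer_stub(unsigned int cpu)
{
    printf("kill_timer(dbs_timer[%u])\n", cpu);
}

static void dbs_timer_exit(struct dbs_info *dbs)
{
    dbs->enable = 0;
    kill_timer_stub(dbs->cpu);
}

static int cpufreq_governor_dbs(unsigned int event)
{
    switch ( event )
    {
    case CPUFREQ_GOV_STOP:
        if ( !this_dbs_info.enable )
            break; /* already stopped */
        dbs_timer_exit(&this_dbs_info);
        break;
    }
    return 0;
}

/* Models cpufreq_del_cpu() reaching GOV_STOP for a hw_all domain via
 * __cpufreq_governor(policy, CPUFREQ_GOV_STOP). */
int main(void)
{
    return cpufreq_governor_dbs(CPUFREQ_GOV_STOP);
}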

Jan

--- unstable.orig/xen/drivers/cpufreq/cpufreq.c 2017-09-12 12:39:58.310556379 +0200
+++ unstable/xen/drivers/cpufreq/cpufreq.c      2018-03-01 16:26:55.071367162 +0100
@@ -190,6 +190,7 @@ int cpufreq_add_cpu(unsigned int cpu)
             break;
         }
     }
+printk("CPU%u: dom%u (%u,%u)\n", cpu, dom, domexist, hw_all);//temp
 
     if (!domexist) {
         cpufreq_dom = xzalloc(struct cpufreq_dom);
@@ -352,6 +353,8 @@ int cpufreq_del_cpu(unsigned int cpu)
 
     /* for HW_ALL, stop gov for each core of the _PSD domain */
     /* for SW_ALL & SW_ANY, stop gov for the 1st core of the _PSD domain */
+printk("cpufreq: del CPU%u (%u,%lx,%lu,%lx)\n", cpu,//temp
+       hw_all, cpufreq_dom->map->bits[0], perf->domain_info.num_processors, policy->cpus->bits[0]);//temp
     if (hw_all || (cpumask_weight(cpufreq_dom->map) ==
                    perf->domain_info.num_processors))
         __cpufreq_governor(policy, CPUFREQ_GOV_STOP);
--- unstable.orig/xen/drivers/cpufreq/cpufreq_ondemand.c        2017-09-12 12:39:58.310556379 +0200
+++ unstable/xen/drivers/cpufreq/cpufreq_ondemand.c     2018-03-01 16:21:41.829296179 +0100
@@ -179,6 +179,9 @@ static void do_dbs_timer(void *dbs)
     if (!dbs_info->enable)
         return;
 
+if(system_state > SYS_STATE_active) {//temp
+ printk("dbs: check CPU%u\n", dbs_info->cpu);
+}
     dbs_check_cpu(dbs_info);
 
     set_timer(&per_cpu(dbs_timer, dbs_info->cpu),
@@ -218,6 +221,9 @@ int cpufreq_governor_dbs(struct cpufreq_
 
     switch (event) {
     case CPUFREQ_GOV_START:
+if(system_state > SYS_STATE_active) {//temp
+ printk("dbs: start CPU%u [%pS]\n", cpu, __builtin_return_address(0));
+}
         if ((!cpu_online(cpu)) || (!policy->cur))
             return -EINVAL;
 
@@ -273,12 +279,18 @@ int cpufreq_governor_dbs(struct cpufreq_
         break;
 
     case CPUFREQ_GOV_STOP:
+if(system_state > SYS_STATE_active) {//temp
+ printk("dbs: stop CPU%u (%u,%d)\n", cpu, this_dbs_info->cpu, 
this_dbs_info->enable);
+}
         if ( !this_dbs_info->enable )
             /* Already not enabled */
             break;
 
         dbs_timer_exit(this_dbs_info);
         dbs_enable--;
+if(system_state > SYS_STATE_active) {//temp
+ printk("dbs: stopped CPU%u (%d,%u)\n", cpu, per_cpu(dbs_timer, cpu).status, 
per_cpu(dbs_timer, cpu).cpu);
+}
 
         break;
 

