Re: [Xen-devel] [BUG] unable to shutdown (page fault in mwait_idle()/do_dbs_timer()/__find_next_bit()) (fwd)
Hello.

On Tue, 9 Jan 2018, Jan Beulich wrote:

> On 08.01.18 at 17:07, <martin@xxxxxxxxx> wrote:
>> On Mon, 8 Jan 2018, Jan Beulich wrote:
>>> On 07.01.18 at 13:34, <martin@xxxxxxxxx> wrote:
>>>> (XEN) ----[ Xen-4.10.0-vgpu x86_64 debug=n Not tainted ]----
>>>
>>> The -vgpu tag makes me wonder whether you have any patches in your tree on top of plain 4.10.0 (or 4.10-staging). Also the debug=n above ...
>>
>> 4.10.0 + 11 patches to make nvidia/vgpu work (https://github.com/xenserver/xen-4.7.pg). debug=n because of xen's modified debug build process.
>>
>>>> (XEN) [<ffff82d08026ae60>] __find_next_bit+0x10/0x80
>>>> (XEN) [<ffff82d080253180>] cpufreq_ondemand.c#do_dbs_timer+0x160/0x220
>>>> (XEN) [<ffff82d0802c7c0e>] mwait-idle.c#mwait_idle+0x23e/0x340
>>>> (XEN) [<ffff82d08026fa56>] domain.c#idle_loop+0x86/0xc0
>>>
>>> ... makes this call trace unreliable. But even with a reliable call trace, analysis of the crash would be helped if you made available the xen-syms (or xen.efi, depending on how you boot) somewhere.
>>
>> xen-syms - http://www.uschovna.cz/en/zasilka/UDP5LVE2679CGBIS-4YV/
>
> Thanks. Looks to be a race between a timer in the governor and the CPUs being brought down. In general the governor is supposed to be disabled in the course of CPUs being brought down, so first of all I wonder whether you're having some daemon in use which sends management requests to the CPUfreq driver in Xen. Such a daemon should of course be disabled by the system shutdown scripts. Otherwise please try the attached debugging patch - maybe we can see something from its output.

I suppose nothing should be running, because the Dom0 kernel has already ended (see the last two messages from the dom0 kernel). Or how can I check it?

Patch added.
- no "dbs:" in output (grep "dbs:" ...)
- examples of shutdown output (1* OK + 2* fail):

-----------------------------------------------------
[ 632.439402] ACPI: Preparing to enter system sleep state S5
[ 632.486728] reboot: Power down
(XEN) Preparing system for ACPI S5 state.
(XEN) Disabling non-boot CPUs ...
(XEN) cpufreq: del CPU1 (1,ffaaab,1,2)
(XEN) Broke affinity for irq 140
(XEN) cpufreq: del CPU2 (1,4,1,4)
(XEN) Broke affinity for irq 139
(XEN) cpufreq: del CPU3 (1,ffaaa9,1,8)
(XEN) Broke affinity for irq 83
(XEN) cpufreq: del CPU4 (1,10,1,10)
(XEN) Broke affinity for irq 137
(XEN) cpufreq: del CPU5 (1,ffaaa1,1,20)
(XEN) cpufreq: del CPU6 (1,40,1,40)
(XEN) Broke affinity for irq 141
(XEN) cpufreq: del CPU7 (1,ffaa81,1,80)
(XEN) cpufreq: del CPU8 (1,100,1,100)
(XEN) cpufreq: del CPU9 (1,ffaa01,1,200)
(XEN) cpufreq: del CPU10 (1,400,1,400)
(XEN) cpufreq: del CPU11 (1,ffa801,1,800)
(XEN) cpufreq: del CPU12 (1,1000,1,1000)
(XEN) cpufreq: del CPU13 (1,ffa001,1,2000)
(XEN) cpufreq: del CPU14 (1,4000,1,4000)
(XEN) cpufreq: del CPU15 (1,ff8001,1,8000)
(XEN) cpufreq: del CPU16 (1,ff0001,1,10000)
(XEN) cpufreq: del CPU17 (1,fe0001,1,20000)
(XEN) cpufreq: del CPU18 (1,fc0001,1,40000)
(XEN) cpufreq: del CPU19 (1,f80001,1,80000)
(XEN) cpufreq: del CPU20 (1,f00001,1,100000)
(XEN) cpufreq: del CPU21 (1,e00001,1,200000)
(XEN) cpufreq: del CPU22 (1,c00001,1,400000)
(XEN) cpufreq: del CPU23 (1,800001,1,800000)
(XEN) Broke affinity for irq 72
(XEN) cpufreq: del CPU0 (1,1,1,1)
(XEN) Entering ACPI S5 state.
-----------------------------------------------------------
[ 669.171396] ACPI: Preparing to enter system sleep state S5
[ 669.218637] reboot: Power down
(XEN) Preparing system for ACPI S5 state.
(XEN) Disabling non-boot CPUs ...
(XEN) cpufreq: del CPU1 (1,ffaaab,1,2)
(XEN) Broke affinity for irq 138
(XEN) cpufreq: del CPU2 (1,4,1,4)
(XEN) Broke affinity for irq 141
(XEN) cpufreq: del CPU3 (1,ffaaa9,1,8)
(XEN) cpufreq: del CPU4 (1,10,1,10)
(XEN) cpufreq: del CPU5 (1,ffaaa1,1,20)
(XEN) Broke affinity for irq 140
(XEN) cpufreq: del CPU6 (1,40,1,40)
(XEN) Broke affinity for irq 139
(XEN) cpufreq: del CPU7 (1,ffaa81,1,80)
(XEN) Broke affinity for irq 137
(XEN) cpufreq: del CPU8 (1,100,1,100)
(XEN) cpufreq: del CPU9 (1,ffaa01,1,200)
(XEN) cpufreq: del CPU10 (1,400,1,400)
(XEN) cpufreq: del CPU11 (1,ffa801,1,800)
(XEN) cpufreq: del CPU12 (1,1000,1,1000)
(XEN) cpufreq: del CPU13 (1,ffa001,1,2000)
(XEN) cpufreq: del CPU14 (1,4000,1,4000)
(XEN) cpufreq: del CPU15 (1,ff8001,1,8000)
(XEN) cpufreq: del CPU16 (1,ff0001,1,10000)
(XEN) cpufreq: del CPU17 (1,fe0001,1,20000)
(XEN) cpufreq: del CPU18 (1,fc0001,1,40000)
(XEN) cpufreq: del CPU19 (1,f80001,1,80000)
(XEN) cpufreq: del CPU20 (1,f00001,1,100000)
(XEN) cpufreq: del CPU21 (1,e00001,1,200000)
(XEN) cpufreq: del CPU22 (1,c00001,1,400000)
(XEN) cpufreq: del CPU23 (1,800001,1,800000)
(XEN) ----[ Xen-4.10.0-vgpu x86_64 debug=n Not tainted ]----
(XEN) CPU: 23
(XEN) RIP: e008:[<ffff82d08026aed0>] __find_next_bit+0x10/0x80
(XEN) RFLAGS: 0000000000010206 CONTEXT: hypervisor
(XEN) rax: 0000000000000000 rbx: ffff830879db0400 rcx: 0000000000000018
(XEN) rdx: 0000000000000018 rsi: 0000000000000018 rdi: 0000000000000000
(XEN) rbp: 00000000061c6652 rsp: ffff83104eaafdd8 r8: 0000000000000018
(XEN) r9: ffff830879db6d70 r10: ffff830879db28e8 r11: 0000009df890a1e7
(XEN) r12: 0000000000000000 r13: ffff8308788cef80 r14: ffff82d0805614e0
(XEN) r15: 0000000000000017 cr0: 000000008005003b cr4: 00000000001526e0
(XEN) cr3: 000000007da2f000 cr2: 0000000000000000
(XEN) fsb: 0000000000000000 gsb: 0000000000000000 gss: 0000000000000000
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
(XEN) Xen code around <ffff82d08026aed0> (__find_next_bit+0x10/0x80):
(XEN) e1 3f 48 8d 3c c7 74 25 <4c> 8b 0f 41 b8 40 00 00 00 41 29 c8 49 d3 e9 49
(XEN) Xen stack trace from rsp=ffff83104eaafdd8:
(XEN) ffff82d0802531f0 0000000000000017 ffff830800000018 ffff82d080577380
(XEN) 00200f0879db6d98 0000009dd4bdccf5 0000000000000004 ffff830879db6e40
(XEN) ffff82d08054ac80 0000009dd4bdccf5 0000000000000017 0000000000000017
(XEN) ffff82d0802c7c7e 0000000000000d43 0000009dcec82f1b ffff830879db6ef8
(XEN) 0000002000000008 000001cf00000390 0000000000000000 0000000000000000
(XEN) 0000001900000001 ffff82e028c4b300 ffff82000007ffff ffff82d080552c80
(XEN) ffff82d08054b800 ffff82d0805771f0 0000000000000017 0000000000000017
(XEN) ffff82d0805614e0 ffff82d080420e80 ffff82d08026fac6 0000000000000000
(XEN) ffff83104eaaffff ffff83007ddf1000 ffff83007ddf1000 ffff83007ddf1000
(XEN) ffff830879db0180 ffff830879db0188 0000009dcec71067 ffff82d0805614e0
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000017 ffff83007ddf1000 00000037f9839080
(XEN) 00000000001526e0
(XEN) Xen call trace:
(XEN) [<ffff82d08026aed0>] __find_next_bit+0x10/0x80
(XEN) [<ffff82d0802531f0>] cpufreq_ondemand.c#do_dbs_timer+0x160/0x220
(XEN) [<ffff82d0802c7c7e>] mwait-idle.c#mwait_idle+0x23e/0x340
(XEN) [<ffff82d08026fac6>] domain.c#idle_loop+0x86/0xc0
(XEN)
(XEN) Pagetable walk from 0000000000000000:
(XEN) L4[0x000] = 000000087ffeb063 ffffffffffffffff
(XEN) L3[0x000] = 000000087ffea063 ffffffffffffffff
(XEN) L2[0x000] = 000000087ffe9063 ffffffffffffffff
(XEN) L1[0x000] = 0000000000000000 ffffffffffffffff
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 23:
(XEN) FATAL PAGE FAULT
(XEN) [error_code=0000]
(XEN) Faulting linear address: 0000000000000000
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...
(XEN) Resetting with ACPI MEMORY or I/O RESET_REG.
-------------------------------------------------------------
[ 305.965633] ACPI: Preparing to enter system sleep state S5
[ 306.012876] reboot: Power down
(XEN) Preparing system for ACPI S5 state.
(XEN) Disabling non-boot CPUs ...
(XEN) cpufreq: del CPU1 (1,ffaaab,1,2)
(XEN) Broke affinity for irq 83
(XEN) cpufreq: del CPU2 (1,4,1,4)
(XEN) Broke affinity for irq 138
(XEN) cpufreq: del CPU3 (1,ffaaa9,1,8)
(XEN) Broke affinity for irq 137
(XEN) cpufreq: del CPU4 (1,10,1,10)
(XEN) cpufreq: del CPU5 (1,ffaaa1,1,20)
(XEN) Broke affinity for irq 140
(XEN) cpufreq: del CPU6 (1,40,1,40)
(XEN) Broke affinity for irq 139
(XEN) cpufreq: del CPU7 (1,ffaa81,1,80)
(XEN) cpufreq: del CPU8 (1,100,1,100)
(XEN) cpufreq: del CPU9 (1,ffaa01,1,200)
(XEN) cpufreq: del CPU10 (1,400,1,400)
(XEN) cpufreq: del CPU11 (1,ffa801,1,800)
(XEN) cpufreq: del CPU12 (1,1000,1,1000)
(XEN) cpufreq: del CPU13 (1,ffa001,1,2000)
(XEN) cpufreq: del CPU14 (1,4000,1,4000)
(XEN) cpufreq: del CPU15 (1,ff8001,1,8000)
(XEN) cpufreq: del CPU16 (1,ff0001,1,10000)
(XEN) cpufreq: del CPU17 (1,fe0001,1,20000)
(XEN) cpufreq: del CPU18 (1,fc0001,1,40000)
(XEN) cpufreq: del CPU19 (1,f80001,1,80000)
(XEN) cpufreq: del CPU20 (1,f00001,1,100000)
(XEN) ----[ Xen-4.10.0-vgpu x86_64 debug=n Not tainted ]----
(XEN) CPU: 20
(XEN) RIP: e008:[<ffff82d08026aed0>] __find_next_bit+0x10/0x80
(XEN) RFLAGS: 0000000000010202 CONTEXT: hypervisor
(XEN) rax: 0000000000000000 rbx: ffff830879dbc400 rcx: 0000000000000015
(XEN) rdx: 0000000000000015 rsi: 0000000000000018 rdi: 0000000000000000
(XEN) rbp: 00000000061bd8a8 rsp: ffff83104ead7dd8 r8: 0000000000000018
(XEN) r9: ffff83104eaea670 r10: ffff83104eaeade8 r11: 0000004974cb66fb
(XEN) r12: 0000000000000000 r13: ffff8308788cfb20 r14: ffff82d0805614e0
(XEN) r15: 0000000000000014 cr0: 000000008005003b cr4: 00000000001526e0
(XEN) cr3: 000000007da2f000 cr2: 0000000000000000
(XEN) fsb: 0000000000000000 gsb: 0000000000000000 gss: 0000000000000000
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
(XEN) Xen code around <ffff82d08026aed0> (__find_next_bit+0x10/0x80):
(XEN) e1 3f 48 8d 3c c7 74 25 <4c> 8b 0f 41 b8 40 00 00 00 41 29 c8 49 d3 e9 49
(XEN) Xen stack trace from rsp=ffff83104ead7dd8:
(XEN) ffff82d0802531f0 0000000000000014 0000000000000018 ffff82d080577380
(XEN) 00200f084eaea698 00000049474d9c83 0000000000000004 ffff83104eaea960
(XEN) ffff82d08054ac80 00000049474d9c83 0000000000000014 0000000000000014
(XEN) ffff82d0802c7c7e ffff830879dbc300 000000494157ac5e ffff83104eaeaa18
(XEN) 0000002000000008 0000035f00000464 0000000000000000 0000000000000000
(XEN) 0000001900000001 ffff82e028c4b530 ffff82000007ffff ffff82d080552c80
(XEN) ffff82d08054b680 ffff82d0805771f0 0000000000000014 0000000000000014
(XEN) ffff82d0805614e0 ffff82d080420e80 ffff82d08026fac6 0000000000000000
(XEN) ffff83104ead7fff ffff83007ddf4000 ffff83007ddf4000 ffff83007ddf4000
(XEN) ffff830879dbc180 ffff830879dbc188 000000494156fa81 ffff82d0805614e0
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000014 ffff83007ddf4000 00000037f9845080
(XEN) 00000000001526e0
(XEN) Xen call trace:
(XEN) [<ffff82d08026aed0>] __find_next_bit+0x10/0x80
(XEN) [<ffff82d0802531f0>] cpufreq_ondemand.c#do_dbs_timer+0x160/0x220
(XEN) [<ffff82d0802c7c7e>] mwait-idle.c#mwait_idle+0x23e/0x340
(XEN) [<ffff82d08026fac6>] domain.c#idle_loop+0x86/0xc0
(XEN)
(XEN) Pagetable walk from 0000000000000000:
(XEN) L4[0x000] = 000000087ffeb063 ffffffffffffffff
(XEN) L3[0x000] = 000000087ffea063 ffffffffffffffff
(XEN) L2[0x000] = 000000087ffe9063 ffffffffffffffff
(XEN) L1[0x000] = 0000000000000000 ffffffffffffffff
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 20:
(XEN) FATAL PAGE FAULT
(XEN) [error_code=0000]
(XEN) Faulting linear address: 0000000000000000
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...
(XEN) Resetting with ACPI MEMORY or I/O RESET_REG.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel
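
In both failing runs above, the panic is reported on the CPU whose "cpufreq: del" line was printed last (CPU 23 and CPU 20). A minimal sketch for checking that pattern across many saved console logs, assuming the serial output is available as plain text with one message per line. The function name summarize_shutdowns and the regular expressions are illustrative only; they are not part of Xen or of the debugging patch, and the sample text is an abbreviated excerpt of the logs in this mail.

```python
import re

# Patterns match the message formats seen in the logs above.
DEL_RE = re.compile(r"\(XEN\) cpufreq: del CPU(\d+)")
PANIC_RE = re.compile(r"\(XEN\) Panic on CPU (\d+):")
START = "(XEN) Preparing system for ACPI S5 state."


def summarize_shutdowns(console_text):
    """Return one dict per shutdown attempt found in the console text."""
    results = []
    # Text before the first START marker is not part of a shutdown attempt.
    for chunk in console_text.split(START)[1:]:
        deleted = [int(n) for n in DEL_RE.findall(chunk)]
        panic = PANIC_RE.search(chunk)
        results.append({
            "last_deleted_cpu": deleted[-1] if deleted else None,
            "panic_cpu": int(panic.group(1)) if panic else None,
            # Count lines from the debugging patch, if any were printed.
            "dbs_lines": sum("dbs:" in line for line in chunk.splitlines()),
        })
    return results


# Abbreviated excerpt: one clean shutdown followed by two failing ones.
SAMPLE = """\
(XEN) Preparing system for ACPI S5 state.
(XEN) Disabling non-boot CPUs ...
(XEN) cpufreq: del CPU23 (1,800001,1,800000)
(XEN) Broke affinity for irq 72
(XEN) cpufreq: del CPU0 (1,1,1,1)
(XEN) Entering ACPI S5 state.
(XEN) Preparing system for ACPI S5 state.
(XEN) Disabling non-boot CPUs ...
(XEN) cpufreq: del CPU22 (1,c00001,1,400000)
(XEN) cpufreq: del CPU23 (1,800001,1,800000)
(XEN) CPU: 23
(XEN) Panic on CPU 23:
(XEN) Preparing system for ACPI S5 state.
(XEN) Disabling non-boot CPUs ...
(XEN) cpufreq: del CPU20 (1,f00001,1,100000)
(XEN) Panic on CPU 20:
"""

if __name__ == "__main__":
    for attempt in summarize_shutdowns(SAMPLE):
        print(attempt)
```

Splitting on the "Preparing system for ACPI S5 state." message keeps each attempt separate, so a clean shutdown shows up with panic_cpu set to None rather than being merged with the next run.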