
Re: [Xen-devel] Crash in set_cpu_sibling_map() booting Xen 4.6.0 on Fusion



>>> On 20.11.15 at 02:22, <eswierk@xxxxxxxxxxxxxxxxxx> wrote:
> (XEN) ----[ Xen-4.6.1-pre  x86_64  debug=n  Not tainted ]----
> (XEN) CPU:    3
> (XEN) RIP:    e008:[<ffff82d08018302f>] set_cpu_sibling_map+0x3f/0x330
> (XEN) RFLAGS: 0000000000010006   CONTEXT: hypervisor
> (XEN) rax: 0000000000000001   rbx: 0000000000000000   rcx: 000000313d5b4080
> (XEN) rdx: 0000000000000006   rsi: 0000000000000000   rdi: 0000000000000003
> (XEN) rbp: 0000000000000300   rsp: ffff8301bd87fe90   r8:  ffff8301bd878000
> (XEN) r9:  000000313d5b4080   r10: 0000000000000001   r11: 0000000000000001
> (XEN) r12: ffff82d0802fd500   r13: 0000000000000000   r14: 0000000000000000
> (XEN) r15: 0000000000000003   cr0: 000000008005003b   cr4: 00000000001526a0
> (XEN) cr3: 00000000bfc75000   cr2: 0000000000000001
> (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
> (XEN) Xen stack trace from rsp=ffff8301bd87fe90:
> (XEN)    00000003802fd800 0000000000000018 0000000000000000 0000010000000000
> (XEN)    ffff82d0802fd800 0000000000000000 00000000000000c8 0000000000000003
> (XEN)    0000000000000000 0000000000000000 0000000000000000 ffff82d0801834dc
> (XEN)    0000000000000000 0000000000000001 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000003 ffff8300bfafc000
> (XEN)    000000313d5b4080 0000000000000000
> (XEN) Xen call trace:
> (XEN)    [<ffff82d08018302f>] set_cpu_sibling_map+0x3f/0x330
> (XEN)    [<ffff82d0801834dc>] start_secondary+0x1bc/0x250
> (XEN)
> (XEN) Pagetable walk from 0000000000000001:
> (XEN)  L4[0x000] = 00000001bd8f0063 ffffffffffffffff
> (XEN)  L3[0x000] = 00000001bd8ef063 ffffffffffffffff
> (XEN)  L2[0x000] = 00000001bd8ee063 ffffffffffffffff
> (XEN)  L1[0x000] = 0000000000000000 ffffffffffffffff
> (XEN)
> (XEN) ****************************************
> (XEN) Panic on CPU 3:
> (XEN) FATAL PAGE FAULT
> (XEN) [error_code=0002]
> (XEN) Faulting linear address: 0000000000000001
> (XEN) ****************************************
> (XEN)
> (XEN) Reboot in five seconds...
> 
> set_cpu_sibling_map+0x3f is the second cpumask_set_cpu() call in
> set_cpu_sibling_map():
> http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=xen/arch/x86/smpboot.c;h=094699286f4f6962942024ec8b2b24c7b7996cc0;hb=78833c04250416f1870c458309d3ac0e5cf915fd#l261

I suppose cpu_to_socket(cpu) returns a value for which the
socket_cpumask[] entry didn't get set up yet. But to prove that,
we'd need to see the disassembly around the code location
above, to be able to associate register values with variables.

If that's the case, then I'd further guess that the CPUID
information provided by Fusion isn't exactly as one would expect
on real hardware. Whether we need to fix something on our side, or
can work around a quirk of theirs, depends on the exact nature of
the issue. Instrumenting the code that populates socket_cpumask[]
would be a good first step.
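
For illustration only, here is a minimal user-space C sketch of the kind
of instrumentation meant above. It is not Xen code: mask_t, socket_mask[],
cpu_socket[], setup_socket_mask() and add_cpu_to_socket() are hypothetical
stand-ins for cpumask_t, socket_cpumask[], cpu_to_socket() and the code
around them, and the array sizes are made up. The idea is to log each time
a per-socket mask gets set up, and to report a missing entry on the
consumer side instead of dereferencing it.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

#define NR_CPUS    8u   /* hypothetical sizes for this sketch */
#define NR_SOCKETS 4u

/* Stand-in for cpumask_t: one bit per CPU. */
typedef struct { unsigned long bits; } mask_t;

/* Stand-in for socket_cpumask[]: an entry stays NULL until set up. */
static mask_t *socket_mask[NR_SOCKETS];

/* Stand-in for cpu_to_socket(): on real systems derived from CPUID. */
static unsigned int cpu_socket[NR_CPUS];

/* Instrumented population: log every socket whose mask gets set up. */
static bool setup_socket_mask(unsigned int socket)
{
    if (socket >= NR_SOCKETS) {
        fprintf(stderr, "socket %u out of range (max %u)\n",
                socket, NR_SOCKETS - 1);
        return false;
    }
    if (socket_mask[socket] == NULL) {
        socket_mask[socket] = calloc(1, sizeof(mask_t));
        if (socket_mask[socket] == NULL)
            return false;
    }
    fprintf(stderr, "socket_mask[%u] = %p\n",
            socket, (void *)socket_mask[socket]);
    return true;
}

/* Guarded consumer: report a missing entry rather than writing through it. */
static bool add_cpu_to_socket(unsigned int cpu)
{
    unsigned int socket;

    if (cpu >= NR_CPUS)
        return false;
    socket = cpu_socket[cpu];
    if (socket >= NR_SOCKETS || socket_mask[socket] == NULL) {
        fprintf(stderr, "CPU%u: socket %u has no mask set up\n", cpu, socket);
        return false;
    }
    socket_mask[socket]->bits |= 1UL << cpu;
    return true;
}
```

If the supposition holds, the consumer-side message would fire for the
CPU that crashes, and the population log would show which socket IDs did
get an entry, which should make any mismatch with the IDs derived from
Fusion's CPUID data visible.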

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

