
Re: [Xen-devel] Intermittent fatal page fault with XEN 4.3.1 (Centos 6.3 DOM0 with linux kernel 3.10.16.)



>>> On 04.11.13 at 20:54, Lars Kurth <lars.kurth.xen@xxxxxxxxx> wrote:
> See
> http://xenproject.org/help/questions-and-answers/hypervisor-fatal-page-fault-xen-4-3-1.html
> ---
> I have a 32-core system running Xen 4.3.1 with 30 Windows XP VMs.
> Dom0 is CentOS 6.3 based, with Linux kernel 3.10.16.
> In my configuration all of the Windows HVMs are running after having been
> restored from xl save.
> VMs are destroyed or restored on demand. After some time Xen
> experiences a fatal page fault while restoring one of the Windows HVM
> guests. This does not happen very often, perhaps once in a 16 to 48 hour
> period.
> The stack trace from Xen follows. Thanks in advance for any help.
> 
> (XEN) ----[ Xen-4.3.1 x86_64 debug=n Tainted: C ]----
> (XEN) CPU: 52
> (XEN) RIP: e008:[] domain_page_map_to_mfn+0x86/0xc0

Zapping addresses (here and below in the stack trace) is never
helpful when asking for help with a crash. Also, so that we do not
have to guess, the matching xen-syms or xen.efi binary should be
made available or pointed to.

> (XEN) RFLAGS: 0000000000010246 CONTEXT: hypervisor
> (XEN) rax: 000ffffffffff000 rbx: ffff8300bb163760 rcx: 0000000000000000
> (XEN) rdx: ffff810000000000 rsi: 0000000000000000 rdi: 0000000000000000
> (XEN) rbp: ffff8300bb163000 rsp: ffff8310333e7cd8 r8: 0000000000000000
> (XEN) r9: 0000000000000000 r10: 0000000000000000 r11: 0000000000000000
> (XEN) r12: ffff8310333e7f18 r13: 0000000000000000 r14: 0000000000000000
> (XEN) r15: 0000000000000000 cr0: 0000000080050033 cr4: 00000000000426f0
> (XEN) cr3: 000000211bee5000 cr2: ffff810000000000
> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
> (XEN) Xen stack trace from rsp=ffff8310333e7cd8:
> (XEN) 0000000000000001 ffff82c4c01de869 ffff82c4c0182c70 ffff8300bb163000
> (XEN) 0000000000000014 ffff8310333e7f18 0000000000000000 ffff82c4c01d7548
> (XEN) ffff8300bb163490 ffff8300bb163000 ffff82c4c01c65b8 ffff8310333e7e60
> (XEN) ffff82c4c01badef ffff8300bb163000 0000000000000003 ffff833144d8e000
> (XEN) ffff82c4c01b4885 ffff8300bb163000 ffff8300bb163000 ffff8300bdff1000
> (XEN) 0000000000000001 ffff82c4c02f2880 ffff82c4c02f2880 ffff82c4c0308440
> (XEN) ffff82c4c01d0ea8 ffff8300bb163000 ffff82c4c015ad6c ffff82c4c02f2880
> (XEN) ffff82c4c02cf800 00000000ffffffff ffff8310333f5060 ffff82c4c02f2880
> (XEN) 0000000000000282 0010000000000000 0000000000000000 0000000000000000
> (XEN) 0000000000000000 ffff82c4c02f2880 ffff8300bdff1000 ffff8300bb163000
> (XEN) 000031a10f2b16ca 0000000000000001 ffff82c4c02f2880 ffff82c4c0308440
> (XEN) ffff82c4c0124444 0000000000000034 ffff8310333f5060 0000000001c9c380
> (XEN) 00000000c0155965 ffff82c4c01c6146 0000000001c9c380 ffffffffffffff00
> (XEN) ffff82c4c0128fa8 ffff8300bb163000 ffff8327d50e9000 ffff82c4c01bc490
> (XEN) 0000000000000000 ffff82c4c01dd254 0000000080549ae0 ffff82c4c01cfc3c
> (XEN) ffff8300bb163000 ffff82c4c01d6128 ffff82c4c0125db9 ffff82c4c0125db9
> (XEN) ffff8310333e0000 ffff8300bb163000 000000000012ffc0 0000000000000000
> (XEN) 0000000000000000 0000000000000000 0000000000000000 ffff82c4c01deaa3
> (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN) 000000000012ffc0 000000007ffdf000 0000000000000000 0000000000000000
> (XEN) Xen call trace:
> (XEN) [] domain_page_map_to_mfn+0x86/0xc0
> (XEN) [] nvmx_handle_vmlaunch+0x49/0x160
> (XEN) [] __update_vcpu_system_time+0x240/0x310
> (XEN) [] vmx_vmexit_handler+0xb58/0x18c0
> (XEN) [] pt_restore_timer+0xa8/0xc0
> (XEN) [] hvm_io_assist+0xef/0x120
> (XEN) [] hvm_do_resume+0x195/0x1c0
> (XEN) [] vmx_do_resume+0x148/0x210
> (XEN) [] context_switch+0x1bc/0xfc0
> (XEN) [] schedule+0x254/0x5f0
> (XEN) [] pt_update_irq+0x256/0x2b0
> (XEN) [] timer_softirq_action+0x168/0x210
> (XEN) [] hvm_vcpu_has_pending_irq+0x50/0xb0
> (XEN) [] nvmx_switch_guest+0x54/0x1560
> (XEN) [] vmx_intr_assist+0x6c/0x490
> (XEN) [] vmx_vmenter_helper+0x88/0x160
> (XEN) [] __do_softirq+0x69/0xa0
> (XEN) [] __do_softirq+0x69/0xa0
> (XEN) [] vmx_asm_do_vmentry+0/0xed
> (XEN)
> (XEN) Pagetable walk from ffff810000000000:
> (XEN) L4[0x102] = 000000211bee5063 ffffffffffffffff
> (XEN) L3[0x000] = 0000000000000000 ffffffffffffffff

This makes me suspect that domain_page_map_to_mfn() gets passed a
NULL pointer here. As noted above, this is only guesswork at this
point, and as Ian already pointed out, directing the reporter to
xen-devel would seem to be the right thing to do here anyway.

Jan

> (XEN)
> (XEN) ****************************************
> (XEN) Panic on CPU 52:
> (XEN) FATAL PAGE FAULT
> (XEN) [error_code=0000]
> (XEN) Faulting linear address: ffff810000000000
> (XEN) ****************************************
> (XEN)
> (XEN) Reboot in five seconds...
> (XEN) Resetting with ACPI MEMORY or I/O RESET_REG.



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
