
Re: [Xen-devel] Xen 4.12 panic on Thinkpad W540 with UEFI multiboot2, efi=no-rs works around it



On Wed, Aug 07, 2019 at 04:45:43PM +0200, Jan Beulich wrote:
> On 07.08.2019 15:26, Marek Marczykowski-Górecki  wrote:
> > Hi,
> > 
> > Xen 4.12 crashes when booting on UEFI (with multiboot2) unless I disable
> > runtime services. The crash happens shortly after starting dom0 kernel.
> > Unfortunately I don't have serial console there, so the only log I have
> > is a photo of VGA console (attached). Below I retype part of the message:
> > 
> > (XEN) ----[ Xen-4.12.0-3.fc29  x86_64  debug=n   Not tainted ]----
> > (XEN) CPU:    0
> > (XEN) RIP:    e008:[<00000000000000f6>] 00000000000000f6
> > (XEN) RFLAGS: 0000000000010287   CONTEXT: hypervisor (d0v0)
> > ...
> > (XEN) Xen call trace:
> > (XEN)    [<00000000000000f6>] 00000000000000f6
> > (XEN)    [<ffff82d08026c6ad>] flushtlb.c#pre_flush+0x3d/0x80
> > (XEN)    [                  ] efi_runtime_call+0x493/0xbd0
> > (XEN)    [                  ] efi_runtime_call+0x441/0xbd0
> > (XEN)    [                  ] vcpu_restore_fpu_nonlazy+0xe7/0x180
> > (XEN)    [                  ] do_platform_op+0/0x1880
> > (XEN)    [                  ] do_platform_op+0xb9c/0x1880
> > (XEN)    [                  ] do_platform_op+0xb9c/0x1880
> > (XEN)    [                  ] sched_credit2.c#csched2_schedule+0xcd0/0x13a0
> > (XEN)    [                  ] lstar_enter+0xae/0x120
> > (XEN)    [                  ] do_platform_op+0/0x1880
> > (XEN)    [                  ] pv_hypercall+0x152/0x220
> > (XEN)    [                  ] lstar_enter+0xae/0x120
> > (XEN)    [                  ] lstar_enter+0xa2/0x120
> > (XEN)    [                  ] lstar_enter+0xae/0x120
> > (XEN)    [                  ] lstar_enter+0xa2/0x120
> > (XEN)    [                  ] lstar_enter+0xae/0x120
> > (XEN)    [                  ] lstar_enter+0xa2/0x120
> > (XEN)    [                  ] lstar_enter+0xae/0x120
> > (XEN)    [                  ] lstar_enter+0xa2/0x120
> > (XEN)    [                  ] lstar_enter+0xae/0x120
> > (XEN)    [                  ] lstar_enter+0xa2/0x120
> > (XEN)    [                  ] lstar_enter+0xae/0x120
> > (XEN)    [                  ] lstar_enter+0x10c/0x120
> > (XEN)
> > (XEN)
> > (XEN) *****************************************
> > (XEN) Panic on CPU 0:
> > (XEN) FATAL TRAP: vector = 0 (divide error)
> > (XEN) [error_code=0000]
> > (XEN) *****************************************
> > 
> > Any idea? Lack of serial console doesn't help with collecting logs...
> 
> May I suggest that you repeat this with a debug hypervisor, such that
> the call trace becomes more usable/sensible? 

Actually, I've already tried, but every other build I try gives me an even
less useful call trace. For example, with a debug build of unstable:

    Xen call trace:
       [<000000007b751c09>] 000000007b751c09
       [<8c2b0398e0000daa>] 8c2b0398e0000daa
...
    GENERAL PROTECTION FAULT
    [error_code=0000]

(photo with full message attached)

> I think, for example,
> that the pre_flush() that caught Andrew's eye is a red herring, and
> that instead a call through NULL has happened in e.g.
> efi_runtime_call().

That's probably true...

Here is the disassembly of pre_flush anyway:
   0x0026c670 <+0>:     mov    0x1993ca,%edx
   0x0026c676 <+6>:     push   %ebx
   0x0026c677 <+7>:     test   %edx,%edx
   0x0026c679 <+9>:     je     0x26c6d8 <pre_flush+104>
   0x0026c67b <+11>:    lea    0x1(%edx),%ecx
   0x0026c67e <+14>:    mov    %edx,%eax
   0x0026c680 <+16>:    dec    %eax
   0x0026c681 <+17>:    mov    %ecx,%ebx
   0x0026c683 <+19>:    lock cmpxchg %ecx,0x1993b5
   0x0026c68b <+27>:    mov    %eax,%ecx
   0x0026c68d <+29>:    cmp    %edx,%eax
   0x0026c68f <+31>:    jne    0x26c6b8 <pre_flush+72>
   0x0026c691 <+33>:    test   %ebx,%ebx
   0x0026c693 <+35>:    je     0x26c6e0 <pre_flush+112>
   0x0026c695 <+37>:    cmpb   $0x0,0x18c344
   0x0026c69c <+44>:    jne    0x26c6a8 <pre_flush+56>
   0x0026c69e <+46>:    mov    %ebx,%eax
   0x0026c6a0 <+48>:    pop    %ebx
   0x0026c6a1 <+49>:    ret    
   0x0026c6a2 <+50>:    nopw   0x0(%eax,%eax,1)
   0x0026c6a8 <+56>:    call   0x2d4690 <hvm_asid_flush_core>
   0x0026c6ad <+61>:    mov    %ebx,%eax
   0x0026c6af <+63>:    pop    %ebx
   0x0026c6b0 <+64>:    ret    
   0x0026c6b1 <+65>:    mov    %eax,%ecx
   0x0026c6b3 <+67>:    nopl   0x0(%eax,%eax,1)
   0x0026c6b8 <+72>:    test   %ecx,%ecx
   0x0026c6ba <+74>:    je     0x26c6d8 <pre_flush+104>
   0x0026c6bc <+76>:    lea    0x1(%ecx),%edx
   0x0026c6bf <+79>:    mov    %ecx,%eax
   0x0026c6c1 <+81>:    dec    %eax
   0x0026c6c2 <+82>:    mov    %edx,%ebx
   0x0026c6c4 <+84>:    lock cmpxchg %edx,0x199374
   0x0026c6cc <+92>:    cmp    %eax,%ecx
   0x0026c6ce <+94>:    je     0x26c691 <pre_flush+33>
   0x0026c6d0 <+96>:    jmp    0x26c6b1 <pre_flush+65>
   0x0026c6d2 <+98>:    nopw   0x0(%eax,%eax,1)
   0x0026c6d8 <+104>:   xor    %ebx,%ebx
   0x0026c6da <+106>:   jmp    0x26c695 <pre_flush+37>
   0x0026c6dc <+108>:   nopl   0x0(%eax)
   0x0026c6e0 <+112>:   mov    $0x2,%edi
   0x0026c6e5 <+117>:   call   0x232170 <raise_softirq>
   0x0026c6ea <+122>:   jmp    0x26c695 <pre_flush+37>
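
For what it's worth, the trace entry can be matched against the listing by
hand: pre_flush+0x3d is decimal 61, i.e. the line marked <+61>, which is the
instruction right after the call to hvm_asid_flush_core, so it looks like a
return address. Below is a minimal Python sketch of that arithmetic. The
helper name parse_trace_entry is hypothetical, and the sketch assumes the
listing addresses are the low bits of the runtime addresses in the trace.

```python
import re

# Trace entry retyped from the 4.12 crash: "[<addr>] symbol+0xoffset/0xsize".
TRACE_LINE = "[<ffff82d08026c6ad>] flushtlb.c#pre_flush+0x3d/0x80"

# Start of pre_flush as printed in the listing (assumed: low address bits only).
LISTING_START = 0x0026c670

ENTRY_RE = re.compile(r"\[<([0-9a-f]+)>\]\s+(\S+)\+0x([0-9a-f]+)/0x([0-9a-f]+)")


def parse_trace_entry(line):
    """Split one call-trace entry into (address, symbol, offset, size)."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        raise ValueError("not a symbolised trace entry: %r" % line)
    addr, sym, off, size = match.groups()
    return int(addr, 16), sym, int(off, 16), int(size, 16)


addr, sym, off, size = parse_trace_entry(TRACE_LINE)

# The function start is the trace address minus the offset into the function.
func_start = addr - off

# The same offset applied to the listing gives the line to look at.
listing_addr = LISTING_START + off

print("%s starts at %#x" % (sym, func_start))
print("offset %#x = <+%d>, listing address %#x" % (off, off, listing_addr))
```

The low 24 bits of the computed function start agree with the first address
in the listing, which is what suggests the two refer to the same code.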

> Of course trying to capture more of Xen's boot log (in particular
> the EFI memory map) may be helpful, too, if you could manage to do
> so.

I've attached a photo of the memory map...

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

Attachment: memory-map.jpg
Description: JPEG image

Attachment: unstable-crash.jpg
Description: JPEG image

Attachment: signature.asc
Description: PGP signature
