
Re: [Xen-devel] [BUG] panic: "IO-APIC + timer doesn't work" - several people have reproduced


  • To: Jason Andryuk <jandryuk@xxxxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Tue, 18 Feb 2020 01:21:59 +0000
  • Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Aaron Janse <aaron@xxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>
  • Delivery-date: Tue, 18 Feb 2020 01:22:26 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 17/02/2020 20:41, Jason Andryuk wrote:
On Mon, Feb 17, 2020 at 2:46 PM Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote:
On 17/02/2020 19:19, Jason Andryuk wrote:
On Tue, Dec 31, 2019 at 5:43 AM Aaron Janse <aaron@xxxxxxxxx> wrote:
On Tue, Dec 31, 2019, at 12:27 AM, Andrew Cooper wrote:
Is there any full boot log in the bad case?  Debugging via divination
isn't an effective way to get things done.
Agreed. I included some more verbose logs towards the end of the email (typed 
up by hand).

Attached are pictures from a slow-motion video of my laptop booting. Note that 
I also included a picture of a stack trace that happens immediately before 
reboot. It doesn't look related, but I wanted to include it anyway.

I think the original email should have said "4.8.5" instead of "4.0.5." 
Regardless, everyone on this mailing list can now see all the boot logs that I've seen.

Attaching a serial console seems like it would be difficult to do on this 
laptop, otherwise I would have sent the logs as a txt file.
I'm seeing Xen panic: "IO-APIC + timer doesn't work" on a Dell
Latitude 7200 2-in-1.  A Fedora 31 Live USB image boots successfully.
There is no way to get serial output, so I manually recreated the output
below from the VGA display.
We have multiple bugs.

First and foremost, Xen seems totally broken when running in ExtINT
mode.  This needs addressing, and ought to be sufficient to let Xen
boot, at which point we can try to figure out why it is trying to fall
back into 486(ish) compatibility mode.

I tested Linux with intel_iommu=on and that booted successfully.
Under Xen, this system sets iommu_x2apic_enabled = true, so
force_iommu is set and iommu=0 on its own cannot disable the IOMMU.
Oh, but I can disable x2apic and then disable the IOMMU:

x2apic=1 -> failure above
x2apic=0 iommu=0 -> failure above
clocksource=acpi -> doesn't help
clocksource=pit -> hangs after "load tracking window length 1073741824 ns"
None of these are surprising, given that Xen can't make any interrupts
work at all.

noapic -> BUG in init_bsp_APIC
This is a surprise.  It's clearly a bug in Xen.  (OTOH, I've been
threatening to rip all of that logic out, because there is no such thing
as a 64-bit capable system without an integrated APIC.)
It's a GPF [error_code=0000] at init_bsp_APIC+0x53 which is
    0xffff82d080428f86 <+64>:    je     0xffff82d080428fc9 <init_bsp_APIC+131>
    0xffff82d080428f88 <+66>:    or     $0xff,%al
    0xffff82d080428f8a <+68>:    test   %sil,%sil
    0xffff82d080428f8d <+71>:    je     0xffff82d080428fd8 <init_bsp_APIC+146>
    0xffff82d080428f8f <+73>:    mov    $0x80f,%ecx
    0xffff82d080428f94 <+78>:    mov    $0x0,%edx
    0xffff82d080428f99 <+83>:    wrmsr

RAX is 0x3ff

This is immediately after Xen prints "Switched to APIC driver x2apic_cluster"

Hmm, in which case it isn't a BUG specifically, but merely a crash. Writing 0x3ff to SPIV tries to set reserved bits, so it is no surprise that there is a #GP.

In which case this can safely be filed under "even more collateral damage from failing to set up any kind of interrupt handling".
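
For anyone following along, here is a minimal sketch of how the faulting write decodes. It assumes the Spurious Interrupt Vector register layout from the Intel SDM (vector in bits 0-7, software enable in bit 8, focus processor checking in bit 9, EOI-broadcast suppression in bit 12) and the usual x2APIC rule that a register at xAPIC MMIO offset N lives at MSR 0x800 + (N >> 4). The function and field names are made up for illustration; none of this is taken from the Xen source.

```python
# Sketch only: decode a value written to the APIC Spurious Interrupt
# Vector register (SVR/SPIV). The field layout is an assumption based on
# the Intel SDM description, not something read out of Xen.

X2APIC_MSR_BASE = 0x800
SVR_MMIO_OFFSET = 0xF0  # xAPIC MMIO offset of the SVR


def x2apic_msr(mmio_offset: int) -> int:
    """Map an xAPIC MMIO register offset to its x2APIC MSR index."""
    return X2APIC_MSR_BASE + (mmio_offset >> 4)


def decode_svr(value: int) -> dict:
    """Split an SVR value into its architecturally named fields."""
    named = 0xFF | (1 << 8) | (1 << 9) | (1 << 12)
    return {
        "vector": value & 0xFF,
        "apic_sw_enable": bool(value & (1 << 8)),
        # Bit 9 is documented as unsupported on newer processors, so it
        # is the likely "reserved" bit here (an assumption, not verified).
        "focus_check_bit": bool(value & (1 << 9)),
        "eoi_bcast_suppress": bool(value & (1 << 12)),
        "unnamed_bits": value & ~named,
    }


if __name__ == "__main__":
    # ECX in the disassembly above is 0x80f, RAX is 0x3ff.
    print(hex(x2apic_msr(SVR_MMIO_OFFSET)))
    print(decode_svr(0x3FF))
```

With the values from the crash, the MSR index matches the 0x80f loaded into %ecx, and 0x3ff is the 0xff vector (from the "or $0xff,%al") plus bits 8 and 9.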

One other thing that might be noteworthy.  Linux only prints ACPI IRQ0
and IRQ9 used by override where Xen lists IRQ 0, 2 & 9.
Huh - this is supposed to come directly from the ACPI tables, so Linux
and Xen should be using the same source of information.

Below is the re-constructed Xen console output.  The SMBIOS line is
the first thing displayed on the VGA output.
Yes - it is the first thing printed after vesa_init(), which I think is a
manifestation of a previous EFI bug I've reported.  Does booting with
-basevideo help?  (No need to transcribe the output manually.  I just
need to know whether it lets you see the full log.)
I'm booting grub->xen.gz so -basevideo isn't directly applicable.  My
attempt at setting a boot entry failed, so I'll have to try that
again.

Ah ok.  One thing which Xen(.gz) needs to do is to take video details from the bootloader rather than trying to figure them out itself.

By default, Xen.gz will try to write into the legacy VGA range, which most likely isn't working on an EFI system.

(As a slight tangent, it is possible to test xen.efi via grub with a suitable chainloader stanza, but xen.efi is deficient in enough important ways that I'd avoid it unless absolutely necessary.)

~Andrew
