UEFI exception when trying to boot on Ryzen systems


  • To: xen-users@xxxxxxxxxxxxxxxxxxxx, Sam Mulvey <waveform.orchard@xxxxxxxxx>
  • From: Sam Mulvey <sam@xxxxxx>
  • Date: Mon, 1 Jun 2026 21:07:18 -0700
  • Delivery-date: Tue, 02 Jun 2026 04:08:28 +0000
  • List-id: Xen user discussion <xen-users.lists.xenproject.org>


I've had to replace two motherboards that ran Xen, each for unrelated reasons.   I replaced them with an MSI MS-S3661 and a Gigabyte MC-13-LE2.   Both systems now fail to boot Xen.  Because this is affecting production servers, I put together a test rig so I can keep working on the problem without touching my production systems too much.

The situation as it stands now:

  • The CPU and memory were successfully running Xen previously.
  • I'm reasonably sure they're all running the same chipset.
  • They are all server-type motherboards with on-board BMC using an AST2600.
  • The production systems use Ryzen 7900s, and the test rig uses a Ryzen 9900X.  All show the same problem.
  • I am using the Xen package in Arch Linux's AUR.
  • I am the maintainer for said package.

So far the Gigabyte board prints nothing, while the MSI board prints an EFI exception.   Given the similarity between the two boards, I am guessing that the issue is the same on both, but that is only a guess.   My test rig is an MSI board with the 9900X and shows the same problem, which adds evidence that the motherboards are the source of the issue.

The MSI exception looks like this:

!!!! X64 Exception Type - 0E(#PF - Page-Fault)  CPU Apic ID - 00000000 !!!!
ExceptionData - 0000000000000011  I:1 R:0 U:0 W:0 P:1 PK:0 SS:0 SGX:0
RIP  - 000000007D234D00, CS  - 0000000000000038, RFLAGS - 0000000000010202
RAX  - 0000000000000000, RCX - 0000000080B3DA18, RDX - 000000009867E018
RBX  - 0000000080B3D398, RSP - 0000000098FDDE88, RBP - 000000000000040E
RSI  - 0000000000000000, RDI - 0000000084641118
R8   - 000000009876F3D0, R9  - 0000000000000004, R10 - 0000000030726670
R11  - 0000000098FDDE80, R12 - 0000000080B3DF98, R13 - 0000000098FDDF40
R14  - 0000000000000000, R15 - 0000000080719C80
DS   - 0000000000000030, ES  - 0000000000000030, FS  - 0000000000000030
GS   - 0000000000000030, SS  - 0000000000000030
CR0  - 0000000080010011, CR2 - 000000007D234D00, CR3 - 0000000098801000
CR4  - 0000000000000628, CR8 - 0000000000000000
DR0  - 0000000000000000, DR1 - 0000000000000000, DR2 - 0000000000000000
DR3  - 0000000000000000, DR6 - 00000000FFFF0FF0, DR7 - 0000000000000400
GDTR - 0000000097CB6020 0000000000000057, LDTR - 0000000000000000
IDTR - 0000000084CF6018 0000000000000FFF,   TR - 0000000000000048
FXSAVE_STATE - 0000000097CB5470
!!!! Can't find image information. !!!!


After which the system hangs.  I mentioned the problem on Matrix, and Andrew Cooper said:

Hmm, so that was a pagefault for an instruction fetch outside of a registered area
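
For anyone reading the dump by hand, the ExceptionData value can be decoded as follows. This is a minimal sketch, assuming the standard x86 page-fault error code layout and assuming that the "R" column in the firmware's output is the reserved-bit flag (bit 3); the helper name decode_pf_error is made up for illustration. The input values are copied from the dump above.

```python
# Bit positions of the x86 page-fault error code, keyed by the labels
# printed in the firmware dump. The "R" = reserved-bit mapping is an assumption.
PF_BITS = {
    "P": 0,     # 1 = page was present (protection violation), 0 = not present
    "W": 1,     # 1 = access was a write
    "U": 2,     # 1 = access came from user mode
    "R": 3,     # 1 = reserved bit set in a paging structure
    "I": 4,     # 1 = access was an instruction fetch
    "PK": 5,    # 1 = protection-key violation
    "SS": 6,    # 1 = shadow-stack access
    "SGX": 15,  # 1 = SGX access-control violation
}


def decode_pf_error(code: int) -> dict:
    """Hypothetical helper: split a page-fault error code into named flags."""
    return {name: (code >> bit) & 1 for name, bit in PF_BITS.items()}


# Values taken from the exception dump.
exception_data = 0x11
rip = 0x7D234D00
cr2 = 0x7D234D00

flags = decode_pf_error(exception_data)

# An instruction-fetch fault whose faulting address (CR2) equals RIP means
# the CPU faulted while fetching the very instruction it was about to run.
fetch_at_rip = bool(flags["I"]) and cr2 == rip

print(flags)
print("instruction fetch fault at RIP:", fetch_at_rip)
```

Only I and P are set, and CR2 equals RIP, which matches the description of a fault on an instruction fetch. P being set suggests the page was mapped but the fetch was not permitted, though that reading is an inference from the error code and not something the firmware reports directly.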

I have tried booting Xen:

  • from systemd-boot
  • from the EFI shell
  • directly, by adding xen.efi as a boot entry with efibootmgr

I've tried both a stable-4.21 and a stable-4.20 build of Xen, with the same result.

There are a few other things I'll be trying, including booting from GRUB2.   That might take a while, since I'm a bit allergic to GRUB2 and not wholly familiar with it.

Has anyone seen anything like this, or can anyone point me in a direction for fixing it?

Also, if you could cc waveform.orchard@xxxxxxxxx on appropriate replies, I'd appreciate it.   I have borrowed hardware to keep my systems running, but it's not a particularly stable state at the moment.

-Sam

