UEFI exception when trying to boot on Ryzen systems
- To: xen-users@xxxxxxxxxxxxxxxxxxxx, Sam Mulvey <waveform.orchard@xxxxxxxxx>
- From: Sam Mulvey <sam@xxxxxx>
- Date: Mon, 1 Jun 2026 21:07:18 -0700
- Authentication-results: eu.smtp.expurgate.cloud; dkim=pass header.s=s714278 header.d=vis.nu header.i="@vis.nu" header.h="From:Subject:To:Message-ID:Date"; dkim=pass header.s=shaihulud header.d=vis.nu header.i="@vis.nu" header.h="Date:From:To:Subject"
- Delivery-date: Tue, 02 Jun 2026 04:08:28 +0000
- Feedback-id: 714278m:714278aMgcGVG:714278sh2CfdPJ-E
- List-id: Xen user discussion <xen-users.lists.xenproject.org>
I've had to replace two motherboards that ran Xen, for unrelated
reasons. I've replaced them with an MSI MS-S3661 and a Gigabyte
MC-13-LE2. Both systems fail to boot Xen. Because I was having
this problem on production servers, I decided to put together a
test rig to continue working on this problem without touching my
production systems too much.
The situation as it stands now:
- The CPU and memory were successfully running Xen previously.
- I'm reasonably sure they're all running the same chipset.
- They are all server-type motherboards with on-board BMC using
an AST2600.
- The production systems are Ryzen 7900's, and the test rig is a
Ryzen 9900X. All show the same problem.
- I am using the Xen package in Arch Linux's AUR.
- I am the maintainer for said package.
The Gigabyte board so far prints nothing, while the MSI board
prints an EFI exception. Given the similarity between the two
boards I am guessing that the issue is the same on both, but
that is a guess. My test rig is the MSI board with the 9900X, and
it shows the same problem, which adds evidence that the
motherboards are the source of the issue.
The MSI exception looks like this:
!!!! X64 Exception Type - 0E(#PF - Page-Fault) CPU Apic ID - 00000000 !!!!
ExceptionData - 0000000000000011 I:1 R:0 U:0 W:0 P:1 PK:0 SS:0 SGX:0
RIP - 000000007D234D00, CS - 0000000000000038, RFLAGS - 0000000000010202
RAX - 0000000000000000, RCX - 0000000080B3DA18, RDX - 000000009867E018
RBX - 0000000080B3D398, RSP - 0000000098FDDE88, RBP - 000000000000040E
RSI - 0000000000000000, RDI - 0000000084641118
R8 - 000000009876F3D0, R9 - 0000000000000004, R10 - 0000000030726670
R11 - 0000000098FDDE80, R12 - 0000000080B3DF98, R13 - 0000000098FDDF40
R14 - 0000000000000000, R15 - 0000000080719C80
DS - 0000000000000030, ES - 0000000000000030, FS - 0000000000000030
GS - 0000000000000030, SS - 0000000000000030
CR0 - 0000000080010011, CR2 - 000000007D234D00, CR3 - 0000000098801000
CR4 - 0000000000000628, CR8 - 0000000000000000
DR0 - 0000000000000000, DR1 - 0000000000000000, DR2 - 0000000000000000
DR3 - 0000000000000000, DR6 - 00000000FFFF0FF0, DR7 - 0000000000000400
GDTR - 0000000097CB6020 0000000000000057, LDTR - 0000000000000000
IDTR - 0000000084CF6018 0000000000000FFF, TR - 0000000000000048
FXSAVE_STATE - 0000000097CB5470
!!!! Can't find image information. !!!!
After this, the system hangs. I mentioned the problem on Matrix,
and Andrew Cooper said:
Hmm, so that was a pagefault for an
instruction fetch outside of a registered area
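For anyone who wants to check that reading against the dump, the
ExceptionData value can be decoded by hand. Below is a minimal Python
sketch, assuming the standard x86-64 page-fault error-code bit layout;
the name decode_pf_error is made up for illustration and is not part
of Xen or the firmware.

```python
# Bit positions of the x86-64 page-fault error code, using the same
# flag names the firmware prints next to ExceptionData.
PF_BITS = {
    "P": 0,     # 1 = fault on a present page (protection violation)
    "W": 1,     # 1 = the access was a write
    "U": 2,     # 1 = the access was made in user mode
    "R": 3,     # 1 = a reserved bit was set in a paging entry
    "I": 4,     # 1 = the access was an instruction fetch
    "PK": 5,    # 1 = protection-key violation
    "SS": 6,    # 1 = shadow-stack access
    "SGX": 15,  # 1 = SGX access-control violation
}


def decode_pf_error(code):
    """Return a dict mapping each flag name to 0 or 1 (hypothetical helper)."""
    return {name: (code >> bit) & 1 for name, bit in PF_BITS.items()}


# Values copied from the dump above.
exception_data = 0x11
rip = 0x7D234D00
cr2 = 0x7D234D00

flags = decode_pf_error(exception_data)
set_flags = [name for name, value in flags.items() if value]

print(set_flags)   # ['P', 'I']
print(rip == cr2)  # True: the faulting address is the instruction pointer
```

With I and P both set and CR2 equal to RIP, the CPU faulted while
fetching an instruction from an address that is mapped, which suggests
the page is present but not marked executable. That is my reading of
the bits, not a confirmed diagnosis.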
I have tried booting Xen:
- from systemd-boot
- from the EFI shell
- directly, by adding xen.efi as a boot entry with efibootmgr
I have tried both a stable-4.21 and a stable-4.20 build of Xen.
There are a few other things I'll be trying, including trying to
boot from GRUB2. That might take a minute since I'm a bit
allergic to GRUB2 and not wholly familiar with it.
Has anyone seen anything like this, or can anyone point me in a
direction for fixing it?
Also, if you could cc waveform.orchard@xxxxxxxxx on appropriate
replies, I'd appreciate it. I have borrowed hardware to keep my
systems running, but it's not a particularly stable state at the
moment.
-Sam