Re: issue with dom0_pvh on Xen 4.20
- To: Manuel Bouyer <bouyer@xxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
- From: Juergen Gross <jgross@xxxxxxxx>
- Date: Tue, 2 Sep 2025 14:22:29 +0200
- Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx
- Delivery-date: Tue, 02 Sep 2025 12:22:36 +0000
- List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
On 02.09.25 12:56, Manuel Bouyer wrote:
On Tue, Sep 02, 2025 at 11:44:36AM +0100, Andrew Cooper wrote:
On 02/09/2025 11:17 am, Manuel Bouyer wrote:
Hello,
I'm trying to boot a NetBSD PVH dom0 on Xen 4.20.
The same NetBSD kernel works fine with Xen 4.18
The boot options are:
menu=Boot netbsd-current PVH Xen420:dev hd0f:;load /netbsd-PVH console=com0
root=wd0f; multiboot /xen420-debug.gz dom0_mem=1024M console=com1
com1=38400,8n1 loglvl=all guest_loglvl=all gnttab_max_nr_frames=64
sync_console=1 dom0=pvh
and the full log from serial console is attached.
With 4.20 the boot fails with:
(XEN) *** Serial input to DOM0 (type 'CTRL-a' three times to switch input)
(XEN) Freed 664kB init memory
(XEN) d0v0 Triple fault - invoking HVM shutdown action 1
(XEN) *** Dumping Dom0 vcpu#0 state: ***
(XEN) ----[ Xen-4.20.2-pre_20250821nb0 x86_64 debug=y Tainted: C ]----
(XEN) CPU: 7
(XEN) RIP: 0008:[<000000000020e268>]
(XEN) RFLAGS: 0000000000010006 CONTEXT: hvm guest (d0v0)
(XEN) rax: 000000002024c003 rbx: 000000000020e260 rcx: 00000000000dfeb7
(XEN) rdx: 0000000000100000 rsi: 0000000000103000 rdi: 000000000013e000
(XEN) rbp: 0000000080000000 rsp: 00000000014002e4 r8: 0000000000000000
(XEN) r9: 0000000000000000 r10: 0000000000000000 r11: 0000000000000000
(XEN) r12: 0000000000000000 r13: 0000000000000000 r14: 0000000000000000
(XEN) r15: 0000000000000000 cr0: 0000000000000011 cr4: 0000000000000000
(XEN) cr3: 0000000000000000 cr2: 0000000000000000
(XEN) fsb: 0000000000000000 gsb: 0000000000000000 gss: 0000000000000000
(XEN) ds: 0010 es: 0010 fs: 0000 gs: 0000 ss: 0010 cs: 0008
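As a reading aid for the dump above, here is a minimal sketch that decodes the reported cr0 value. The bit positions are the architectural x86 ones; the helper name `decode_cr0` is only illustrative. With cr0 = 0x11, only PE and ET are set and PG is clear, which would suggest the vCPU was still running in protected mode without paging when it died (consistent with cr3 and cr4 both being 0).

```python
# Architectural CR0 bit positions on x86 (bit number -> flag name).
CR0_BITS = {
    0: "PE", 1: "MP", 2: "EM", 3: "TS", 4: "ET", 5: "NE",
    16: "WP", 18: "AM", 29: "NW", 30: "CD", 31: "PG",
}

def decode_cr0(value):
    """Return the names of the CR0 flags set in value (illustrative helper)."""
    return [name for bit, name in sorted(CR0_BITS.items()) if value & (1 << bit)]

cr0 = 0x11                      # value taken from the register dump
flags = decode_cr0(cr0)
paging_enabled = "PG" in flags  # page faults are only possible with paging on

print("cr0 flags:", flags)
print("paging enabled:", paging_enabled)
```

If paging really was off at that point, a page fault could not have been raised at all, so a zero cr2 would be expected rather than surprising.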
Because of the triple fault, the RIP above doesn't point to the code.
I tracked it down to this code:
cmpl $0,%ecx ; /* zero-sized? */ \
je 2f ; \
pushl %ebp ; \
movl RELOC(nox_flag),%ebp ; \
1: movl %ebp,(PDE_SIZE-4)(%ebx) ; /* upper 32 bits: NX */ \
movl %eax,(%ebx) ; /* store phys addr */ \
addl $PDE_SIZE,%ebx ; /* next PTE/PDE */ \
addl $PAGE_SIZE,%eax ; /* next phys page */ \
loop 1b ; \
popl %ebp ; \
2: ;
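To make the effect of this fragment easier to follow, here is a rough Python model of the loop. It is a sketch under assumptions, not the NetBSD code: `PDE_SIZE = 8` and `PAGE_SIZE = 0x1000` are assumed values (the 8-byte entry size is inferred from the separate write of the upper 32 bits), the function name `fill_entries` is made up, and memory is modelled as a dict of 32-bit words keyed by address.

```python
PDE_SIZE = 8         # assumption: 8-byte entries (low word + upper NX word)
PAGE_SIZE = 0x1000   # assumption: 4 KiB pages
MASK32 = 0xFFFFFFFF

def fill_entries(mem, ebx, eax, ecx, nox_flag):
    """Model of the loop: write ecx entries starting at address ebx.

    Returns the final (ebx, eax) values, as the registers would hold them.
    """
    if ecx == 0:                              # cmpl $0,%ecx ; je 2f
        return ebx, eax
    while True:
        mem[ebx + PDE_SIZE - 4] = nox_flag    # upper 32 bits: NX
        mem[ebx] = eax                        # store phys addr (plus flag bits)
        ebx = (ebx + PDE_SIZE) & MASK32       # next PTE/PDE
        eax = (eax + PAGE_SIZE) & MASK32      # next phys page
        ecx -= 1                              # loop 1b: decrement, repeat if non-zero
        if ecx == 0:
            break
    return ebx, eax

# Tiny usage example with made-up inputs.
mem = {}
end_ebx, end_eax = fill_entries(mem, 0x1000, 0x200003, 3, 0x80000000)
print(hex(end_ebx), hex(end_eax), len(mem))
```

Read against this model, the dump would be consistent with the loop being in progress: rbp = 0x80000000 matches an NX bit placed in the upper word, and the low bits of rax (0x003) look like page-table flag bits. That reading is an inference from the register values, not something the log states.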
There are other pushl/popl pairs before this, so I don't think that's the problem
(in fact the exact same fragment is called just before with different
inputs and it doesn't fault). So the culprit is probably the write to (%ebx),
which would be 0x20e260
This is in the range:
(XEN) [0000000000100000, 0000000040068e77] (usable)
so I can't see why this would be a problem.
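The range check described here can be written out as a small sketch. It assumes the bounds Xen prints are inclusive and that the entry being written is 8 bytes wide (both assumptions, not stated in the log).

```python
# Usable RAM range as printed by Xen, assumed to be inclusive on both ends.
usable_start, usable_end = 0x0000000000100000, 0x0000000040068e77

target = 0x20e260   # address in %ebx, i.e. the destination of the store
entry_size = 8      # assumption: one 8-byte page-table entry

# The whole entry must fit inside the usable range.
inside = usable_start <= target and target + entry_size - 1 <= usable_end
print("write target inside usable RAM:", inside)
```

This only confirms that the address lies in a region the memory map reports as usable RAM; it says nothing about what else (kernel image, modules, boot data) may already be placed there.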
Any idea, including how to debug this further, welcome
Even though triple faults are aborts, they're generally accurate under
virt, so 0x20e268 is most likely where things die.
But that's the RIP of the last fault, not the first one, right?
0x20e268 isn't in the text segment of the kernel; my guess is that the
first fault triggers an exception, but the exception handler isn't set up yet,
so we end up jumping to some random value.
What puzzles me is that:
- %cr2 is 0, so probably the first fault wasn't a page fault
- RIP is %ebx + 8, so maybe the code was just clobbered by the loop?
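The second observation can be checked with a few lines of arithmetic. This is a sketch that assumes the dumped registers still reflect the state inside the fill loop (ebx as the next write address, ecx as the remaining entry count) and 8-byte entries; none of that is established by the log.

```python
rip = 0x20e268      # RIP from the dump
ebx = 0x20e260      # assumed: address of the entry currently being written
ecx = 0xdfeb7       # assumed: number of entries still to be written
PDE_SIZE = 8        # assumption: 8-byte entries

offset = rip - ebx                  # distance from the write pointer to RIP
span_end = ebx + ecx * PDE_SIZE     # first address past the pending writes
in_pending = ebx <= rip < span_end  # would the loop still write over RIP?

print("RIP - ebx =", offset)
print("pending writes cover", hex(ebx), "to", hex(span_end))
print("RIP inside pending write span:", in_pending)
```

Under these assumptions RIP sits exactly one entry past the write pointer, inside the span the loop has yet to fill, so the idea that the loop is writing over (or right next to) whatever RIP points at cannot be ruled out by the numbers alone.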
Could it be the code has been moved to this location, or is about to
be moved away afterwards?
Juergen