
Re: [Xen-devel] PV guest with PCI passthrough crash on Xen 4.8.3 inside KVM when booted through OVMF

On 16/02/18 18:51, Marek Marczykowski-Górecki wrote:
> On Fri, Feb 16, 2018 at 05:52:50PM +0000, Andrew Cooper wrote:
>> On 16/02/18 17:48, Marek Marczykowski-Górecki wrote:
>>> Hi,
>>> As in the subject, the guest crashes on boot, before the kernel outputs
>>> anything. I've isolated this to the conditions below:
>>>  - PV guest has a PCI device assigned (e1000e emulated by QEMU in this case),
>>>    without PCI device it works
>>>  - Xen (in KVM) is started through OVMF; with seabios it works
>>>  - nested HVM is disabled in KVM
>>>  - AMD IOMMU emulation is disabled in KVM; when enabled qemu crashes on
>>>    boot (looks like qemu bug, unrelated to this one)
>>> Version info:
>>>  - KVM host: OpenSUSE 42.3, qemu 2.9.1, ovmf-2017+git1492060560.b6d11d7c46-4.1, AMD
>>>  - Xen host: Xen 4.8.3, dom0: Linux 4.14.13
>>>  - Xen domU: Linux 4.14.13, direct boot
>>> Not sure if relevant, but initially I've tried booting xen.efi /mapbs
>>> /noexitboot and then dom0 kernel crashed saying something about conflict
>>> between e820 and kernel mapping. But now those options are disabled.
>>> The crash message:
>>> (XEN) d1v0 Unhandled invalid opcode fault/trap [#6, ec=0000]
>>> (XEN) domain_crash_sync called from entry.S: fault at ffff82d080218720 entry.o#create_bounce_frame+0x137/0x146
>>> (XEN) Domain 1 (vcpu#0) crashed on cpu#1:
>>> (XEN) ----[ Xen-4.8.3  x86_64  debug=n   Not tainted ]----
>>> (XEN) CPU:    1
>>> (XEN) RIP:    e033:[<ffffffff826d9156>]
>> This is #UD, which is most probably hitting a BUG().  addr2line this ^
>> to find some code to look at.
> addr2line failed me

By default, vmlinux is stripped and compressed.  Ideally you want to
addr2line the vmlinux artefact in the root of your kernel build, which
is the plain elf with debugging symbols.

Alternatively, use scripts/extract-vmlinux on the binary you actually
booted, which might get you somewhere.

> , but System.map says it's xen_memory_setup. And it
> looks like the BUG() is the same as I had in dom0 before:
> "Xen hypervisor allocated kernel memory conflicts with E820 map".
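
For reference, the System.map lookup described above can be scripted when
addr2line is not usable. The following is only a minimal sketch: it assumes
the usual "<hex address> <type> <name>" line format of System.map, and the
sample entries (names other than xen_memory_setup, and all addresses) are
made up for illustration rather than taken from a real 4.14.13 build.

```python
import bisect


def load_symbols(lines):
    """Parse System.map-style lines ('<hex addr> <type> <name>') into a sorted list."""
    symbols = []
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def resolve(symbols, addr):
    """Return (name, offset) for the last symbol at or below addr, or None."""
    addrs = [a for a, _ in symbols]
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return None
    base, name = symbols[i]
    return name, addr - base


# Hypothetical excerpt; a real run would read the System.map of the booted kernel.
SAMPLE_MAP = """\
ffffffff826d8c00 T example_symbol_before
ffffffff826d8e10 T xen_memory_setup
ffffffff826d9400 T example_symbol_after
""".splitlines()

if __name__ == "__main__":
    symbols = load_symbols(SAMPLE_MAP)
    hit = resolve(symbols, 0xFFFFFFFF826D9156)  # RIP from the crash message
    if hit is not None:
        print("%s+%#x" % hit)
```

This only gives the enclosing symbol and an offset; getting a file and line
number still needs addr2line against a vmlinux with debug info.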

Juergen: Is there anything we can do to try and insert some dummy
exception handlers right at PV start, so we could at least print out a
one-liner to the host console which is a little more helpful than Xen
saying "something unknown went wrong" ?

> Disabling e820_host in guest config solved the problem. Thanks!
> Is this some bug in Xen or OVMF, or is it expected behavior and e820_host
> should be avoided?

I don't really know.  e820_host is a gross hack which shouldn't really
be present.  The actual problem is that Linux can't cope with the
memory layout it was given (and I can't recall if there is anything
Linux could potentially do to cope).  OTOH, the toolstack, which knew
about e820_host and chose to lay the guest out in an overlapping way, is
probably also at fault.
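
To illustrate the kind of conflict involved, here is a simplified model of
the check, not the kernel's actual code. It assumes the kernel image is
considered safe only if it lies entirely inside a single RAM entry of the
E820 map; the type constants follow the usual E820 numbering, and the map
and kernel ranges below are invented for the example.

```python
E820_RAM = 1
E820_RESERVED = 2


def range_conflicts(e820, start, size):
    """Return True unless [start, start + size) fits inside one RAM entry.

    e820 is a list of (start, size, type) tuples.
    """
    if size == 0:
        return False
    end = start + size
    for entry_start, entry_size, entry_type in e820:
        if (entry_type == E820_RAM
                and entry_start <= start
                and entry_start + entry_size >= end):
            return False  # fully covered by this RAM region
    return True


# Hypothetical host-derived map: RAM up to 1 GiB, then a reserved hole.
EXAMPLE_E820 = [
    (0x00000000, 0x000A0000, E820_RAM),
    (0x00100000, 0x3FF00000, E820_RAM),
    (0x40000000, 0x01000000, E820_RESERVED),
]

if __name__ == "__main__":
    # Kernel placed well inside RAM: no conflict.
    print(range_conflicts(EXAMPLE_E820, 0x01000000, 0x02000000))
    # Kernel placed so that it runs into the reserved hole: conflict.
    print(range_conflicts(EXAMPLE_E820, 0x3F000000, 0x02000000))
```

With e820_host enabled the guest sees a map derived from the host's, so
holes like the one above can land where the toolstack has already put the
kernel.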

IMO, PCI Passthrough is a trainwreck, and it is a miracle it functions
at all.

