
Re: [Xen-devel] Kdump doesn't work when running with xen on newer hardware



On 05.02.20 10:03, Dietmar Hahn wrote:
On Tuesday, 4 February 2020, 15:18:53 CET, Jürgen Groß wrote:
On 04.02.20 15:07, Dietmar Hahn wrote:
On Friday, 31 January 2020, 22:59:19 CET, Igor Druzhinin wrote:
On 30/01/2020 13:03, Dietmar Hahn wrote:
Hi,

we use SLES12 with kernel-default-4.12.14-95.45.1.x86_64 and
xen-4.11.3_02-2.20.1.x86_64

The dump kernel doesn't start after "echo c > /proc/sysrq-trigger".
Last messages on console are:
[  385.717532] Kernel panic - not syncing: Fatal exception
[  385.734565] Kernel Offset: disabled
(XEN) Hardware Dom0 crashed: Executing kexec image on cpu58
(XEN) Shot down all CPUs

After a short time a reboot is initiated.
Without xen the kdump works.

We see this behaviour only on newer hardware, for example a server with
Intel(R) Xeon(R) Gold 6242 CPU @ 2.80GHz

I built the freshly released xen-4.13 myself and tried it, but this didn't help.

I tried x2apic=off on the xen side and nox2apic on the linux side, but without
success.

Starting from Xen 4.12 we keep IOMMU enabled during kexec transition
which resolved the problem you're describing. But you also need to make
sure IOMMU is enabled in your kexec kernel (which I think is now the
default for most distros). You can still try to work around the issue
you're seeing on 4.11 by using the "iommu=dom0-passthrough" Xen option.

I added "iommu=dom0-passthrough" to the xen-4.11 command line, but without
success. I also added earlyprintk=... to the kdump kernel and could see that
the dump kernel started, but only one message from extract_kernel()
was printed. Then the reboot followed.

Which message?

Any chance you can build the kdump kernel with CONFIG_X86_VERBOSE_BOOTUP
enabled?

Yes, it's switched on. The message is from the first debug message in
extract_kernel() - debug_putaddr(input_data):
"input_data: 0x"

Weird, there should be "early console in extract_kernel\n" before that.

But not all of the text is seen!

Weird again - the address should be printed.
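
To illustrate why a message ending at "0x" looks cut off: a debug line of this
kind consists of the variable name, the separator ": 0x", the hex digits and a
newline. The following is only a minimal sketch under that assumption, not the
kernel's own macro; format_putaddr() is a hypothetical helper that formats into
a buffer instead of writing to the serial port.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not kernel code): builds "<name>: 0x<hex>\n" in buf.
 * snprintf() never stores more than len - 1 characters plus the terminating
 * NUL, so a small len models output that stops partway through the line.
 * The return value is the length the full line would have had.
 */
static int format_putaddr(char *buf, size_t len,
                          const char *name, unsigned long value)
{
    return snprintf(buf, len, "%s: 0x%lx\n", name, value);
}
```

With a buffer of only 15 bytes, a call with the name "input_data" leaves
exactly "input_data: 0x" in the buffer, which is the fragment seen on the
console; with enough room the hex digits and the newline follow.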

If I understand the early_serial_init code in
arch/x86/boot/early_serial_console.c
correctly, the serial line works with polling (no interrupts), so it seems the
reboot is initiated before the complete message is printed.

But polling is synchronous (see serial_putchar() in
arch/x86/boot/compressed/misc.c). So a reboot indicates a very early
failure.
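
As a rough illustration of what "synchronous" means here, the sketch below
models a polled transmit loop against a simulated UART. It is not the kernel's
serial_putchar(); struct fake_uart and the fake_* helpers are hypothetical
stand-ins for the port I/O a real 16550-style device would need. The point is
that each call returns only after its byte has been handed to the
transmitter, so nothing is left queued in software when the next instruction
runs.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit 5 of a 16550 line status register: transmit holding register empty. */
#define LSR_THR_EMPTY 0x20

/* Hypothetical simulated UART; a real one is accessed through port I/O. */
struct fake_uart {
    unsigned busy_polls;   /* polls left until the transmitter is free again */
    unsigned drain_polls;  /* polls each byte needs to leave the transmitter */
    char     wire[64];     /* bytes that were actually handed to the device  */
    size_t   sent;
};

/* Simulated status read: reports "busy" until busy_polls has counted down. */
static uint8_t fake_read_lsr(struct fake_uart *u)
{
    if (u->busy_polls > 0) {
        u->busy_polls--;
        return 0;
    }
    return LSR_THR_EMPTY;
}

/* Simulated data write: records the byte and marks the transmitter busy. */
static void fake_write_thr(struct fake_uart *u, char c)
{
    if (u->sent < sizeof(u->wire) - 1)
        u->wire[u->sent++] = c;
    u->wire[u->sent] = '\0';
    u->busy_polls = u->drain_polls;
}

/* Polled output: busy-wait until the device is ready, then write one byte. */
static void polled_putchar(struct fake_uart *u, char c)
{
    while (!(fake_read_lsr(u) & LSR_THR_EMPTY))
        ;  /* no interrupts involved, just spin on the status bit */
    fake_write_thr(u, c);
}

/* Writes a whole string one byte at a time; returns after the last byte. */
static void polled_puts(struct fake_uart *u, const char *s)
{
    while (*s)
        polled_putchar(u, *s++);
}
```

Calling polled_puts() with "input_data: 0x" on a zero-initialised struct
fake_uart leaves all 14 characters in wire[] by the time the call returns,
whatever value drain_polls has.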

Can you please show the complete kdump kernel boot parameters?


Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 

