
Re: [Xen-devel] Kdump doesn't work when running with xen on newer hardware



On Wednesday, 5 February 2020, 10:31:37 CET, Jürgen Groß wrote:
> On 05.02.20 10:03, Dietmar Hahn wrote:
> > On Tuesday, 4 February 2020, 15:18:53 CET, Jürgen Groß wrote:
> >> On 04.02.20 15:07, Dietmar Hahn wrote:
> >>> On Friday, 31 January 2020, 22:59:19 CET, Igor Druzhinin wrote:
> >>>> On 30/01/2020 13:03, Dietmar Hahn wrote:
> >>>>> Hi,
> >>>>>
> >>>>> we use SLES12 with kernel-default-4.12.14-95.45.1.x86_64 and
> >>>>> xen-4.11.3_02-2.20.1.x86_64
> >>>>>
> >>>>> The dump kernel doesn't start after "echo c > /proc/sysrq-trigger".
> >>>>> Last messages on console are:
> >>>>> [  385.717532] Kernel panic - not syncing: Fatal exception
> >>>>> [  385.734565] Kernel Offset: disabled
> >>>>> (XEN) Hardware Dom0 crashed: Executing kexec image on cpu58
> >>>>> (XEN) Shot down all CPUs
> >>>>>
> >>>>> After a short time a reboot is initiated.
> >>>>> Without xen the kdump works.
> >>>>>
> >>>>> We see this behaviour only on newer hardware, for example a server with
> >>>>> Intel(R) Xeon(R) Gold 6242 CPU @ 2.80GHz
> >>>>>
> >>>>> I built the freshly released xen-4.13 myself and tried it, but this
> >>>>> doesn't help.
> >>>>>
> >>>>> I tried x2apic=off on the xen side and nox2apic on the linux side,
> >>>>> but without success.
> >>>>
> >>>> Starting from Xen 4.12 we keep IOMMU enabled during kexec transition
> >>>> which resolved the problem you're describing. But you also need to make
> >>>> sure IOMMU is enabled in your kexec kernel (which I think is now the
> >>>> default for most distros). You can still try to work around the issue
> >>>> you're seeing on 4.11 by using the "iommu=dom0-passthrough" Xen option.
> >>>
> >>> I added "iommu=dom0-passthrough" to the xen-4.11 command line, but
> >>> without success.
> >>> Further, I added earlyprintk=... to the kdump kernel and could see that
> >>> the dump kernel started, but only one message from extract_kernel() was
> >>> printed. Then the reboot followed.
> >>
> >> Which message?
> >>
> >> Any chance you can build the kdump kernel with CONFIG_X86_VERBOSE_BOOTUP
> >> enabled?
> > 
> > Yes, it's switched on. The message comes from the first debug statement
> > in extract_kernel(), debug_putaddr(input_data):
> > "input_data: 0x"
> 
> Weird, there should be "early console in extract_kernel\n" before that.

Ah sorry, my fault. I fiddled around with this boot and commented out this
message. So I see:
(XEN) Hardware Dom0 crashed: Executing kexec image on cpu37
(XEN) Shot down all CPUs
early console in extract_kernel

> > But not all of the text is seen!
> 
> Weird again - the address should be printed.
> 
> > > If I understand the early_serial_init code in
> > > arch/x86/boot/early_serial_console.c correctly, the serial line works
> > > with polling (no interrupts), so it seems the reboot is initiated
> > > before the complete message is printed.
> 
> But polling is synchronous (see serial_putchar() in
> arch/x86/boot/compressed/misc.c). So a reboot indicates a very early
> failure.
> 
> Can you please show the complete kdump kernel boot parameters?

kexec loads:
/sbin/kexec -p /boot/vmlinuz-4.12.14-95.29-default --append="elevator=deadline 
sysrq=yes reset_devices acpi_no_memhotplug cgroup_disable=memory nokaslr 
numa=off irqpoll nr_cpus=1 root=kdump rootflags=bind rd.udev.children-max=8 
disable_cpu_apicid=0  earlyprintk=serial,ttyS0,38400" 
--initrd=/boot/initrd-4.12.14-95.29-default-kdump  -s

Thank you!
Dietmar.

> Juergen
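
For readers following the point about polled serial output being
synchronous, here is a minimal sketch in C. It is not the kernel's code: it
is only loosely modelled on the serial_putchar() mentioned above, and the
fake_uart structure, fake_inb(), fake_outb(), polled_putchar() and
polled_puts() are hypothetical names that simulate a 16550-style UART in
user space. It shows that each character is written only after the status
register reports the transmitter as ready, so the caller cannot run ahead of
the output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register offsets and status bit of a 16550-style UART. */
#define UART_TX        0      /* transmit holding register */
#define UART_LSR       5      /* line status register */
#define UART_LSR_THRE  0x20   /* transmit holding register empty */

/* Hypothetical stand-in for the hardware: a fake UART that stays busy for
 * a few status reads after every byte and records what was sent. */
struct fake_uart {
    unsigned busy_polls;      /* LSR reads left before THRE is reported */
    unsigned busy_reset;      /* busy_polls is reset to this after a byte */
    char     sent[64];        /* bytes "transmitted" so far, NUL-terminated */
    size_t   nsent;
};

/* Simulated port read: only the line status register is modelled. */
static uint8_t fake_inb(struct fake_uart *u, unsigned reg)
{
    if (reg != UART_LSR)
        return 0;
    if (u->busy_polls > 0) {
        u->busy_polls--;
        return 0;             /* transmitter still busy */
    }
    return UART_LSR_THRE;     /* transmitter ready */
}

/* Simulated port write: record the byte and make the UART busy again. */
static void fake_outb(struct fake_uart *u, unsigned reg, uint8_t val)
{
    if (reg == UART_TX && u->nsent < sizeof(u->sent) - 1) {
        u->sent[u->nsent++] = (char)val;
        u->sent[u->nsent] = '\0';
        u->busy_polls = u->busy_reset;
    }
}

/* Busy-wait until the transmitter is ready (or the timeout expires),
 * then write one character. No interrupts are involved. */
static void polled_putchar(struct fake_uart *u, char ch)
{
    unsigned timeout = 0xffff;

    while (!(fake_inb(u, UART_LSR) & UART_LSR_THRE) && --timeout)
        ;                     /* spin */
    fake_outb(u, UART_TX, (uint8_t)ch);
}

/* Send a whole string; returns only after every byte has been handed to
 * the (simulated) UART. */
void polled_puts(struct fake_uart *u, const char *s)
{
    assert(u != NULL && s != NULL);
    while (*s)
        polled_putchar(u, *s++);
}
```

Because polled_puts() returns only after the last byte has been written, a
message that stops partway through on the console suggests the failure
happened while the message was still being produced, which fits the "very
early failure" reading given above.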



