
Re: [arm] Dom0 hangs after enable KPROBE_EVENTS and/or UPROBE_EVENTS in kernel config



Hi Stefano,

On 23/07/2021 17:42, Stefano Stabellini wrote:
On Fri, 23 Jul 2021, Julien Grall wrote:
On 22/07/2021 22:39, Stefano Stabellini wrote:
On Thu, 22 Jul 2021, Julien Grall wrote:
You can go and edit 76085aff29f585139a37a10ea0a7daa63f70872c to change
from 4K to any multiple of 4K, e.g. 8K, 12K, 16K, 20K. They should all
work the same.

Looking at the boot logs on pastebin I noticed that Xen is not loaded at
a 2MB aligned address. I recommend you change Xen loading address to
0x500200000. And the kernel loading address to 0x500400000.

I am curious to know why you recommend loading at a 2MB-aligned address. The
Image protocol doesn't require loading at a 2MB-aligned address. In fact, we
had an issue on Juno because the bootloader would load Xen at a 4KB-aligned
address. UEFI will also load at a 4KB-aligned address.

It is from empirical evidence :-)

Right...

I cannot tell you the exact reason but I saw "strange" problems in the
past that went away after choosing a 2MB alignment. So we settled for
using 2MB in ImageBuilder and we haven't seen any more issues.

It would have been good to report such an issue back then so it could have
been analyzed and possibly fixed.

However, it could have been anything: a bug in U-Boot not relevant
anymore, a bug in Linux, etc. I don't know for sure.

This is the worrying part. We have a potential bug and no one knows why it
happened. Can this be reproduced?

I managed to reproduce the problem. I switched ImageBuilder to use 4K
alignment (just by changing the variable "offset" at the top of
scripts/uboot-script-gen).

Thank you for reproducing it!

It generated a boot.source file like this:

tftpb 0xC01000 2021.1/xen
tftpb 0xCEA000 2021.1/xen-Image-5.10
tftpb 0x18D1000 2021.1/initrd.cpio
tftpb 0x1A55000 2021.1/xen.dtb
[trimmed because the rest is not too relevant]


These are the sizes:

12479370 Jul  21 19:02 xen-Image-5.10
40577    Jul  21 18:25 xen.dtb
950280   Jul  19 16:58 xen
1586176  Jun  4  17:14 initrd.cpio


I did the calculations by hand and there are no overlaps. Here is the
U-Boot output and the boot log from the failure: https://pastebin.com/rbTBPn5g
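
For reference, the by-hand overlap calculation can be sketched in a few lines of Python. The load addresses come from the boot.source excerpt and the sizes from the file listing above; the helper name find_overlaps is made up for this illustration. It only considers the on-disk file sizes, so it cannot rule out an image using memory beyond the end of its file.

```python
# Load addresses from the boot.source excerpt, sizes (bytes) from the listing.
MODULES = [
    ("xen",            0xC01000,  950280),
    ("xen-Image-5.10", 0xCEA000,  12479370),
    ("initrd.cpio",    0x18D1000, 1586176),
    ("xen.dtb",        0x1A55000, 40577),
]


def find_overlaps(modules):
    """Return pairs of names whose [start, start + size) ranges intersect."""
    ordered = sorted(modules, key=lambda m: m[1])
    overlaps = []
    for i, (name_a, start_a, size_a) in enumerate(ordered):
        for name_b, start_b, _ in ordered[i + 1:]:
            # ordered by start, so b overlaps a iff b starts before a ends
            if start_b < start_a + size_a:
                overlaps.append((name_a, name_b))
    return overlaps


for name, start, size in MODULES:
    print(f"{name:16s} {start:#010x} - {start + size:#010x}")
print("overlaps:", find_overlaps(MODULES))
```

With these inputs each module ends before the next one starts, which matches the hand calculation.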

FWIW, I just gave it a try on the Foundation Model with bootwrapper. My default setup loads Xen and the kernel at the following addresses:

(XEN) MODULE[0]: 0000000088200000 - 000000008835a8f8 Xen
(XEN) MODULE[1]: 0000000088000000 - 000000008800167f Device Tree
(XEN) MODULE[2]: 0000000080080000 - 0000000081e7ca00 Kernel
(XEN)  RESVD[0]: 0000000080000000 - 0000000080010000

Xen is 2MB aligned, but the kernel is not. I couldn't see any failure.

I have also tried to load Xen at a different address (this time not 2MB aligned) and still couldn't spot any issue:

(XEN) MODULE[0]: 0000000088201000 - 000000008835b8f8 Xen
(XEN) MODULE[1]: 0000000088000000 - 000000008800167f Device Tree
(XEN) MODULE[2]: 0000000080081000 - 0000000081e7da00 Kernel
(XEN)  RESVD[0]: 0000000080000000 - 0000000080010000
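
The alignment of the start addresses in the two module listings above can be double-checked with a small Python sketch. The helper name is_aligned and the labels are made up for this illustration; the addresses are copied from the logs.

```python
SZ_4K = 0x1000
SZ_2M = 0x200000


def is_aligned(addr, alignment):
    """True if addr is a multiple of alignment."""
    return addr % alignment == 0


# Start addresses of Xen and the kernel in the two runs quoted above.
STARTS = {
    "Xen (first run)":     0x88200000,
    "Kernel (first run)":  0x80080000,
    "Xen (second run)":    0x88201000,
    "Kernel (second run)": 0x80081000,
}

for name, addr in STARTS.items():
    print(f"{name:20s} {addr:#x} "
          f"4K={is_aligned(addr, SZ_4K)} 2M={is_aligned(addr, SZ_2M)}")
```

Only the first Xen address is a multiple of 2MB; the other three are 4KB aligned only.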

So this looks like something specific to your setup. Looking at the log:

> (XEN) Latest ChangeSet: Tue Apr 13 10:59:05 2021 -0700 git:f44b1a6ede

I couldn't find this commit in the tree. What baseline are you using? From my side, I tested with 3a98c1a4cec1.

> (XEN) ****************************************
> (XEN) Panic on CPU 0:
> (XEN) invalid compressed format (err=1)
> (XEN) ****************************************

This implies Xen thinks the kernel module is a GZIP image and is trying to decompress it. However, from your e-mail above, the name of the kernel module is xen-Image-5.10, which suggests it is not a compressed image.

Can you confirm what is the format of xen-Image-5.10?
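
One quick way to answer this is to look at the first bytes of the file. Below is a minimal Python sketch, assuming the usual gzip magic (0x1f 0x8b) and the arm64 Image header magic ("ARM\x64" at byte offset 56). The function names are made up for this illustration, and this is not meant to mirror the exact check Xen performs.

```python
GZIP_MAGIC = b"\x1f\x8b"
ARM64_IMAGE_MAGIC = b"ARM\x64"  # located at byte offset 56 of an arm64 Image
ARM64_MAGIC_OFFSET = 56


def classify_kernel(header):
    """Classify a kernel file from its first 64 bytes."""
    if header[:2] == GZIP_MAGIC:
        return "gzip"
    end = ARM64_MAGIC_OFFSET + len(ARM64_IMAGE_MAGIC)
    if header[ARM64_MAGIC_OFFSET:end] == ARM64_IMAGE_MAGIC:
        return "arm64 Image"
    return "unknown"


def classify_file(path):
    """Read the start of the file at path and classify it."""
    with open(path, "rb") as f:
        return classify_kernel(f.read(64))
```

For example, classify_file("2021.1/xen-Image-5.10") would be run on the machine serving the TFTP directory. Running file(1) on the image gives similar information.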


Using 2MB works. I tried 1MB out of curiosity and got a different
error: https://pastebin.com/UHFUHyxN

> (XEN) pg[0] MFN 00f50 c=0x180000000000000 o=4 v=0x7ffff t=0
> (XEN) Xen BUG at page_alloc.c:1425

This looks like two ranges have overlapped each other. Above, you confirmed there was no overlap; was that for both the 4KB and the 1MB alignment?


Do you think it is worth investigating further?

Definitely. I have a setup where a 4KB-aligned (but not 2MB-aligned) load address works. This is a hint that something odd is happening on your setup, and I would like to understand what.

I have a Xilinx board at home (I haven't used it recently though), so I am happy to help debug it. Alternatively, do you know if it reproduces on the Xilinx QEMU?

Cheers,

--
Julien Grall



 

