
Re: [arm] Dom0 hangs after enable KROBE_EVENTS and/or UPROBE_EVENTS in kernel config



Hi Stefano and Oleksii,


On 22/07/2021 03:12, Stefano Stabellini wrote:
On Wed, 21 Jul 2021, Oleksii Moisieiev wrote:
Please see my answers below.

___________________________________________________________________________________________________________________________________________
From: Julien Grall <julien@xxxxxxx>
Sent: Wednesday, July 21, 2021 7:39 PM
To: Oleksii Moisieiev <Oleksii_Moisieiev@xxxxxxxx>; xen-devel@xxxxxxxxxxxxxxxxxxxx 
<xen-devel@xxxxxxxxxxxxxxxxxxxx>
Cc: Andrii Anisov <Andrii_Anisov@xxxxxxxx>; Stefano Stabellini 
<sstabellini@xxxxxxxxxx>
Subject: Re: [arm] Dom0 hangs after enable KROBE_EVENTS and/or UPROBE_EVENTS in 
kernel config
       On 21/07/2021 15:40, Oleksii Moisieiev wrote:
       > Hello Julien,

       Hello,

       >>>
       >>> My setup:
       >>> Board: H3ULCB Kingfisher board
       >>> Xen: revision dba774896f7dd74773c14d537643b7d7477fefcd (stable-4.15)
       
       >>> https://github.com/xen-project/xen.git
       >>> Kernel: revision 09162bc32c880a791c6c0668ce0745cf7958f576 (v5.10-rc4)
       >
       >>Hmmm... 5.10 was released a few months ago and there are probably a few
       >>stable releases for that version. Can you try the latest 5.10 stable?
       >
       > Switched to tag v5.10 rev: 2c85ebc57b3e of
       > https://github.com/torvalds/linux.git
       > and got the same problem: I see no output from the kernel. All tests
       > were done with the earlycon parameter set in the kernel cmdline.
       The tag v5.10 is the first official release. What I meant is using the
       stable branch from
       git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git (branch
       linux-5.10.y).

I need some time to download and build mainline kernel. I'll test this scenario 
and send you results tomorrow.

I tried 5.10 with:

CONFIG_KPROBE_EVENTS=y
CONFIG_UPROBE_EVENTS=y

and I could boot without issues on Xilinx ZynqMP.



       >>>
       
       >>> https://github.com/torvalds/linux.git
       >>>
       >>> kernel config: see attached;
       >>>
       >>> dtb: see attached;
       >
       >>Please avoid large attachments as they will be duplicated on every
       >>mailbox. Instead, in the future, please upload them somewhere (your own
       >>webserver, pastebin...) and provide a link in the e-mail.
       >
       > I'm sorry for that.
       >
       >>>
       >>>
       >>> If kprobe/uprobe events are enabled - I see no output after xen
       >>> switched input to Dom0, if disabled - system boots up successfully.
       >>The console subsystem tends to be enabled quite late in the boot
       >>process. So this may mean a panic during early boot.
       >
       >>If you haven't done so yet, I would suggest adding earlycon=xenboot to the
       >>dom0 command line. This will print some messages during early boot.
       >
       > All tests were done with earlycon parameter set in the kernel command
       > line (xen, dom0-bootargs).
       >
       >>>
       >>> Both configs work fine when I boot without xen.
       >>>
       >>>
       >>> Dom0 information from Xen console shows that only one CPU works, and PC
       >>> stays in "__arch_counter_get_cntvct" function on read_sysreg call.
       >>>
       >>> I did further investigation and found that kernel 5.4 doesn't have this
       >>> kind of issue.
       >>> After bisecting the kernel between 5.4 and 5.10, I found that output
       >>> disappeared on commit:
       >>>
       >>> 76085aff29f585139a37a10ea0a7daa63f70872c
       >
       >> From the information you provided so far, I am a bit confused how this
       >>could be the source of the problem. But given this is not the latest
       >>5.10, I will wait for you to confirm the bug is still present before
       >>providing more input.
       >
       > I was confused by this commit too. As I mentioned above, I've
       > checked with the latest stable 5.10 kernel and still got the same problem.

       Thanks for the testing. I am not quite sure where this may fail.
       Maybe Stefano has an idea?

Are you booting with bootefi? (I cannot see any issues with or without
bootefi.)

In any case, the fact that you need to revert
76085aff29f585139a37a10ea0a7daa63f70872c to see the printk output is
very odd. It might point to an alignment problem or another memory
issue. It is possible that the weirdness you are seeing below (e.g. "we
get some 18446744073709551615 while expecting 0") is due to a memory
corruption.

Given that 76085aff29f585139a37a10ea0a7daa63f70872c is changing some
section alignment from 4K to 64K, it increases the memory used to load
the kernel. Is it possible that the size increase is causing you to go
beyond the address range supposed to be used? E.g. U-Boot loading the
kernel at invalid addresses.

Things like CONFIG_KPROBE_EVENTS=y and CONFIG_UPROBE_EVENTS=y are
relevant because they increase the size of the kernel, possibly pushing
it to an invalid memory range?

This is actually a good point. There are two other possible issues:
   1) The kernel and the hypervisor may overlap each other.
   2) The size of the kernel is not correctly provided.

I remember hitting such issues in the past and they will lead to weird issues.

In fact looking at the device-tree provided in the first e-mail, I see:

                module@0 {
                        compatible = "xen,linux-zimage", "xen,multiboot-module";
                        reg = <0x5 0x1000000 0x0 0x2000000>;
                };

However from the pastebin, U-boot will report for the kernel:

Bytes transferred = 37124608 (2367a00 hex)

So, if I am not mistaken, the region in the DT is smaller than the kernel itself. The Image header doesn't provide the binary size, so Xen can't do any sanity check.

In this case, we would copy a truncated kernel. Can you change the size in the DT and give it another try?
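
For reference, the size comparison above can be worked through in a few lines of Python. This is only a sketch: the numbers are the ones quoted in this e-mail, and it assumes the module node sits under a parent with #address-cells = <2> and #size-cells = <2>, which is what the four-cell reg suggests. The helper names are made up for the illustration.

```python
# Values quoted in this thread.
REG_CELLS = (0x5, 0x1000000, 0x0, 0x2000000)  # <addr_hi addr_lo size_hi size_lo>
KERNEL_BYTES = 37124608                       # "Bytes transferred" reported by U-Boot


def decode_reg(cells):
    """Combine four 32-bit cells into (address, size).

    Assumes 2 address cells and 2 size cells (an assumption, not shown in the DT excerpt).
    """
    addr_hi, addr_lo, size_hi, size_lo = cells
    return (addr_hi << 32) | addr_lo, (size_hi << 32) | size_lo


def shortfall(region_size, image_size):
    """Number of image bytes that do not fit in the region (0 if it fits)."""
    return max(0, image_size - region_size)


address, region_size = decode_reg(REG_CELLS)
missing = shortfall(region_size, KERNEL_BYTES)

print(f"module address : 0x{address:x}")
print(f"region size    : 0x{region_size:x} ({region_size} bytes)")
print(f"kernel size    : 0x{KERNEL_BYTES:x} ({KERNEL_BYTES} bytes)")
print(f"bytes cut off  : 0x{missing:x} ({missing} bytes)")
```

With these inputs the region is 0x2000000 bytes (32 MiB) while the kernel is 0x2367a00 bytes, so roughly 3.4 MiB of the image would fall outside the region described in the DT.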


If you don't have one yet, I would highly recommend having a script (either a U-Boot one or an external one) that generates the correct DT for a given kernel, Xen and initramfs. We have some example scripts on the wiki for either solution.
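
To illustrate the idea (this is not one of the wiki scripts), a minimal Python sketch of such a generator could look like the following. The function names are hypothetical, the load address is the one from the DT above, and rounding the size up to 4KB is just a choice made for the example. In a real script the image size would come from the file on disk, e.g. via os.path.getsize().

```python
def round_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def module_node(load_addr, image_size, alignment=0x1000):
    """Return the text of a module node whose reg covers the whole image.

    Assumes #address-cells = <2> and #size-cells = <2> in the parent node.
    """
    size = round_up(image_size, alignment)
    cells = (load_addr >> 32, load_addr & 0xFFFFFFFF,
             size >> 32, size & 0xFFFFFFFF)
    reg = " ".join(f"0x{cell:x}" for cell in cells)
    return (
        "module@0 {\n"
        '        compatible = "xen,linux-zimage", "xen,multiboot-module";\n'
        f"        reg = <{reg}>;\n"
        "};"
    )


# Kernel size as reported by U-Boot in this thread.
node = module_node(0x501000000, 37124608)
print(node)
```

The point of generating the node is that the size cell follows the kernel binary automatically, so enabling extra config options cannot silently outgrow the region.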


You can go and edit 76085aff29f585139a37a10ea0a7daa63f70872c to change
from 4K to any multiple of 4K, e.g. 8K, 12K, 16K, 20K. They should all
work the same.

Looking at the boot logs on pastebin I noticed that Xen is not loaded at
a 2MB aligned address. I recommend you change Xen loading address to
0x500200000. And the kernel loading address to 0x500400000.

I am curious to know why you recommend loading at a 2MB-aligned address. The Image protocol doesn't require loading at a 2MB-aligned address. In fact, we had an issue on Juno because the bootloader would load Xen at a 4KB-aligned address. UEFI will also load at a 4KB-aligned address.
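
As a quick way to check which alignment a given load address satisfies, here is a small Python sketch. The two addresses are the ones suggested above; the helper name is made up for the example.

```python
SZ_4K = 0x1000
SZ_2M = 0x200000


def is_aligned(addr, alignment):
    """True if addr is a multiple of alignment."""
    return addr % alignment == 0


# Load addresses suggested earlier in this e-mail.
candidates = {"Xen": 0x500200000, "kernel": 0x500400000}

for name, addr in candidates.items():
    print(f"{name:6s} 0x{addr:x}: "
          f"4KB-aligned={is_aligned(addr, SZ_4K)}, "
          f"2MB-aligned={is_aligned(addr, SZ_2M)}")
```

Any 2MB-aligned address is also 4KB-aligned, so the suggested addresses satisfy both; the open question is only whether the stricter alignment is actually needed.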

Cheers,

--
Julien Grall



 

