
RE: Discussion of Xenheap problems on AArch64


  • To: Julien Grall <julien@xxxxxxx>, "sstabellini@xxxxxxxxxx" <sstabellini@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Henry Wang <Henry.Wang@xxxxxxx>
  • Date: Wed, 28 Apr 2021 09:28:28 +0000
  • Accept-language: zh-CN, en-US
  • Cc: Wei Chen <Wei.Chen@xxxxxxx>, Penny Zheng <Penny.Zheng@xxxxxxx>, Bertrand Marquis <Bertrand.Marquis@xxxxxxx>
  • Delivery-date: Wed, 28 Apr 2021 09:29:03 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Nodisclaimer: true
  • Thread-index: Adc2dyA8lkZGRqbyRiSglHolanVkwQAFhaqAAACgy/AA4CfqgABHcHyAADhcqlA=
  • Thread-topic: Discussion of Xenheap problems on AArch64

Hi Julien,

I've done some testing of the patch series at
https://xenbits.xen.org/gitweb/?p=people/julieng/xen-unstable.git;a=shortlog;h=refs/heads/pt/rfc-v2

If you have time, could you please take a look at the inline test results and
let me know whether I tested the patch series correctly? Thanks!

> -----Original Message-----
> From: Henry Wang
> Sent: Tuesday, April 27, 2021 2:29 PM
> To: Julien Grall <julien@xxxxxxx>; sstabellini@xxxxxxxxxx; xen-
> devel@xxxxxxxxxxxxxxxxxxxx
> Cc: Wei Chen <Wei.Chen@xxxxxxx>; Penny Zheng
> <Penny.Zheng@xxxxxxx>; Bertrand Marquis <Bertrand.Marquis@xxxxxxx>
> Subject: RE: Discussion of Xenheap problems on AArch64
> 
> Hi Julien,
> 
> Sorry for the late reply, I kinda missed this email somehow....
> 
> Please see my inline reply ^^
> 
> > -----Original Message-----
> > From: Xen-devel <xen-devel-bounces@xxxxxxxxxxxxxxxxxxxx> On Behalf Of
> > Julien Grall
> > Sent: Monday, April 26, 2021 4:20 AM
> > To: Henry Wang <Henry.Wang@xxxxxxx>; sstabellini@xxxxxxxxxx; xen-
> > devel@xxxxxxxxxxxxxxxxxxxx
> > Cc: Wei Chen <Wei.Chen@xxxxxxx>; Penny Zheng
> > <Penny.Zheng@xxxxxxx>; Bertrand Marquis
> <Bertrand.Marquis@xxxxxxx>
> > Subject: Re: Discussion of Xenheap problems on AArch64
> >
> >
> >
> > On 21/04/2021 10:32, Henry Wang wrote:
> > > Hi Julien,
> >
> > Hi Henry,
> >
> > >> -----Original Message-----
> > >> From: Julien Grall <julien@xxxxxxx>
> > >> Sent: Wednesday, April 21, 2021 5:04 PM
> > >> To: Henry Wang <Henry.Wang@xxxxxxx>; sstabellini@xxxxxxxxxx; xen-
> > >> devel@xxxxxxxxxxxxxxxxxxxx
> > >> Cc: Wei Chen <Wei.Chen@xxxxxxx>; Penny Zheng
> > >> <Penny.Zheng@xxxxxxx>; Bertrand Marquis
> > <Bertrand.Marquis@xxxxxxx>
> > >> Subject: Re: Discussion of Xenheap problems on AArch64
> > >>
> > >>
> > >>
> > >> On 21/04/2021 07:28, Henry Wang wrote:
> > >>> Hi,
> > >>
> > >> Hi Henry,
> > >>
> > >>>
> > >>> We are trying to implement the static memory allocation on AArch64.
> > Part
> > >> of
> > >>> this feature is the reserved heap memory allocation, where a specific
> > range
> > >> of
> > >>> memory is reserved only for heap. In the development process, we
> > found a
> > >>> pitfall in current AArch64 setup_xenheap_mappings() function.
> > >>>
> > >>> According to a previous discussion in community
> > >>> https://lore.kernel.org/xen-devel/20190216134456.10681-1-
> > >> peng.fan@xxxxxxx/,
> > >>> on AArch64, bootmem is initialized after setup_xenheap_mappings(),
> > >>> setup_xenheap_mappings() may try to allocate memory before
> memory
> > >> has been
> > >>> handed over to the boot allocator. If the reserved heap memory
> > allocation
> > >> is
> > >>> introduced, either of below 2 cases will trigger a crash:
> > >>>
> > >>> 1. If the reserved heap memory is at the end of the memory block list
> > and
> > >> the
> > >>> gap between reserved and unreserved memory is bigger than 512GB,
> > when
> > >> we setup
> > >>> mappings from the beginning of the memory block list, we will get
> OOM
> > >> caused
> > >>> by lack of pages in boot allocator. This is because the memory that is
> > >> reserved
> > >>> for heap has not been mapped and added to the boot allocator.
> > >>>
> > >>> 2. If we add the memory that is reserved for heap to boot allocator 
> > >>> first,
> > >> and
> > >>> then setup mappings for banks in the memory block list, we may get a
> > page
> > >> which
> > >>> has not been setup mapping, causing a data abort.
> > >>
> > >> There are a few issues with setup_xenheap_mappings(). I have been
> > >> reworking the code on my spare time and started to upstream bits of it.
> > >> A PoC can be found here:
> > >>
> > >> https://xenbits.xen.org/gitweb/?p=people/julieng/xen-
> > >> unstable.git;a=shortlog;h=refs/heads/pt/dev
> > >>
> > >
> > > Really great news! Thanks you very much for the information and your
> > hard
> > > work on the PoC :) I will start to go through your PoC code then.
> >
> > I spent sometimes today to clean-up the PoC and sent a series on the ML
> > (see [1]). This has been lightly tested so far.
> >
> > Would you be able to give a try and let me know if it helps your problem?
> 
> Yes of course! I will start to test this series ^^ Thank you for your work!
> 

Test platform: FVP_Base_RevC_2xAEMvA (with -C bp.dram_size=1024)

Default memory configuration (works well):
memory@80000000 {
                device_type = "memory";
                reg = <0x00 0x80000000 0x00 0x7f000000 0x08 0x80000000 0x00 0x80000000>;
};

Because the lowest DRAM range on this platform only has 2GB of RAM
(https://developer.arm.com/documentation/100964/1114/Base-Platform/Base---memory/Base-Platform-memory-map),
I only tested the case of two memory banks separated by a big gap.

1. Without the patch series (commit bea65a212c0581520203b6ad0d07615693f42f73),
using two memory banks which have a big gap:

Memory node:
memory@80000000 {
                device_type = "memory";
                reg = <0x00 0x80000000 0x00 0x7f000000 0x8800 0x00000000 0x00 0x80000000>;
};
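
For reference, the memory node above can be decoded into bank addresses with
a few lines of Python. This is only a sketch: it assumes #address-cells = 2
and #size-cells = 2 (which matches the 8 cells given for two banks), and
decode_reg() is a hypothetical helper, not something from Xen or from the
device tree tooling.

```python
GIB = 1 << 30

def decode_reg(cells, addr_cells=2, size_cells=2):
    """Split a flat list of 32-bit 'reg' cells into (base, size) banks.

    Assumption: #address-cells = 2 and #size-cells = 2 by default.
    """
    def join(parts):
        # Cells are big-endian 32-bit words: the first cell is the high half.
        value = 0
        for part in parts:
            value = (value << 32) | part
        return value

    stride = addr_cells + size_cells
    if len(cells) % stride:
        raise ValueError("reg length is not a multiple of the bank stride")

    banks = []
    for i in range(0, len(cells), stride):
        base = join(cells[i:i + addr_cells])
        size = join(cells[i + addr_cells:i + stride])
        banks.append((base, size))
    return banks

# 'reg' cells of the memory node used in the big-gap test.
REG = [0x00, 0x80000000, 0x00, 0x7f000000,
       0x8800, 0x00000000, 0x00, 0x80000000]

banks = decode_reg(REG)
(base0, size0), (base1, size1) = banks
# Gap between the end of the first bank and the start of the second one.
gap = base1 - (base0 + size0)

for base, size in banks:
    print(f"bank: start={base:#x} size={size:#x}")
print(f"gap={gap:#x}, bigger than 512GB: {gap > 512 * GIB}")
```

Under these assumptions the second bank starts at 0x880000000000, so the gap
between the two banks is far bigger than 512GB.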

Log:
(XEN)   VTCR_EL2: 80000000
(XEN)  VTTBR_EL2: 0000000000000000
(XEN)
(XEN)  SCTLR_EL2: 30cd183d
(XEN)    HCR_EL2: 0000000000000038
(XEN)  TTBR0_EL2: 000000008413d000
(XEN)
(XEN)    ESR_EL2: 96000041
(XEN)  HPFAR_EL2: 0000000000000000
(XEN)    FAR_EL2: 00008010c3fff000
(XEN) Xen call trace:
(XEN)    [<000000000025c7a0>] clear_page+0x10/0x2c (PC)
(XEN)    [<00000000002caa30>] setup_frametable_mappings+0x1ac/0x2e0 (LR)
(XEN)    [<00000000002cbf34>] start_xen+0x348/0xbc4
(XEN)    [<00000000002001c0>] arm64/head.o#primary_switched+0x10/0x30
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) CPU0: Unexpected Trap: Data Abort
(XEN) ****************************************

2. With the patch series applied, using two memory banks which have a big gap:
Memory node:
memory@80000000 {
                device_type = "memory";
                reg = <0x00 0x80000000 0x00 0x7f000000 0x8800 0x00000000 0x00 0x80000000>;
};

Log:
(XEN)   VTCR_EL2: 80000000
(XEN)  VTTBR_EL2: 0000000000000000
(XEN)
(XEN)  SCTLR_EL2: 30cd183d
(XEN)    HCR_EL2: 0000000000000038
(XEN)  TTBR0_EL2: 000000008413c000
(XEN)
(XEN)    ESR_EL2: 96000043
(XEN)  HPFAR_EL2: 0000000000000000
(XEN)    FAR_EL2: 0000000000443000
(XEN)
(XEN) Xen call trace:
(XEN)    [<000000000025c7a0>] clear_page+0x10/0x2c (PC)
(XEN)    [<000000000026cf9c>] mm.c#xen_pt_update+0x1b8/0x7b0 (LR)
(XEN)    [<00000000002ca298>] setup_xenheap_mappings+0xb4/0x134
(XEN)    [<00000000002cc1b0>] start_xen+0xb6c/0xbcc
(XEN)    [<00000000002001c0>] arm64/head.o#primary_switched+0x10/0x30
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) CPU0: Unexpected Trap: Data Abort
(XEN) ****************************************
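
To make the two ESR_EL2 values in the logs above easier to compare, here is a
small Python sketch that splits them into fields. It follows my reading of
the ESR_ELx layout for Data Aborts in the Arm Architecture Reference Manual
(EC in bits [31:26], WnR in bit 6, DFSC in bits [5:0]), so please double
check it against the manual. decode_data_abort() and FAULT_CLASSES are
hypothetical names for illustration, not Xen code.

```python
# Fault classes for DFSC values below 0x10, indexed by DFSC bits [3:2].
# Assumption: encoding taken from the Arm ARM description of Data Abort ISS.
FAULT_CLASSES = {
    0: "address size fault",
    1: "translation fault",
    2: "access flag fault",
    3: "permission fault",
}

def decode_data_abort(esr):
    """Decode the EC, WnR and DFSC fields of a Data Abort ESR value."""
    ec = (esr >> 26) & 0x3F
    # 0x24: Data Abort from a lower EL, 0x25: Data Abort from the same EL.
    if ec not in (0x24, 0x25):
        raise ValueError(f"EC {ec:#x} is not a Data Abort")

    dfsc = esr & 0x3F
    if dfsc < 0x10:
        # The low two bits give the translation table lookup level.
        fault = f"{FAULT_CLASSES[dfsc >> 2]}, level {dfsc & 0x3}"
    else:
        fault = f"other (DFSC {dfsc:#x})"

    return {
        "ec": ec,
        "same_el": ec == 0x25,
        "write": bool((esr >> 6) & 1),  # WnR: 1 means the access was a write
        "fault": fault,
    }

for label, esr in (("without patch", 0x96000041), ("with patch", 0x96000043)):
    print(label, decode_data_abort(esr))
```

If I read the encoding right, both values describe a write taken from the
same exception level, with the fault status differing only in the reported
level (1 without the patch, 3 with the patch applied).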

Kind regards,
Henry

> >
> > For convenience, I have pushed a branch with the series applied here:
> >
> > https://xenbits.xen.org/gitweb/?p=people/julieng/xen-
> > unstable.git;a=shortlog;h=refs/heads/pt/rfc-v2
> >
> 
> Great, thanks!
> 
> > Cheers,
> >
> > [1] https://lore.kernel.org/xen-devel/20210425201318.15447-1-
> > julien@xxxxxxx/
> >
> > --
> > Julien Grall


 

