
RE: [XEN RFC PATCH 23/40] xen/arm: introduce a helper to parse device tree memory node


  • To: Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>
  • From: Wei Chen <Wei.Chen@xxxxxxx>
  • Date: Sat, 28 Aug 2021 13:58:25 +0000
  • Accept-language: en-US
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Bertrand Marquis <Bertrand.Marquis@xxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>
  • Delivery-date: Sat, 28 Aug 2021 13:59:12 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-topic: [XEN RFC PATCH 23/40] xen/arm: introduce a helper to parse device tree memory node

Hi Julien,

> -----Original Message-----
> From: Julien Grall <julien@xxxxxxx>
> Sent: 28 August 2021 18:34
> To: Wei Chen <Wei.Chen@xxxxxxx>; Stefano Stabellini
> <sstabellini@xxxxxxxxxx>
> Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx; Bertrand Marquis
> <Bertrand.Marquis@xxxxxxx>; Jan Beulich <jbeulich@xxxxxxxx>
> Subject: Re: [XEN RFC PATCH 23/40] xen/arm: introduce a helper to parse
> device tree memory node
> 
> Hi Wei,
> 
> On 28/08/2021 04:56, Wei Chen wrote:
> >> -----Original Message-----
> >> From: Stefano Stabellini <sstabellini@xxxxxxxxxx>
> >> Sent: 28 August 2021 9:06
> >> To: Wei Chen <Wei.Chen@xxxxxxx>
> >> Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx; sstabellini@xxxxxxxxxx; julien@xxxxxxx;
> >> jbeulich@xxxxxxxx; Bertrand Marquis <Bertrand.Marquis@xxxxxxx>
> >> Subject: Re: [XEN RFC PATCH 23/40] xen/arm: introduce a helper to parse
> >> device tree memory node
> >>
> >> On Wed, 11 Aug 2021, Wei Chen wrote:
> >>> Memory blocks' NUMA ID information is stored in device tree's
> >>> memory nodes as "numa-node-id". We need a new helper to parse
> >>> and verify this ID from memory nodes.
> >>>
> >>> In order to support memory affinity in later use, the valid
> >>> memory ranges and NUMA ID will be saved to tables.
> >>>
> >>> Signed-off-by: Wei Chen <wei.chen@xxxxxxx>
> >>> ---
> >>>   xen/arch/arm/numa_device_tree.c | 130 ++++++++++++++++++++++++++++++++
> >>>   1 file changed, 130 insertions(+)
> >>>
> >>> diff --git a/xen/arch/arm/numa_device_tree.c b/xen/arch/arm/numa_device_tree.c
> >>> index 37cc56acf3..bbe081dcd1 100644
> >>> --- a/xen/arch/arm/numa_device_tree.c
> >>> +++ b/xen/arch/arm/numa_device_tree.c
> >>> @@ -20,11 +20,13 @@
> >>>   #include <xen/init.h>
> >>>   #include <xen/nodemask.h>
> >>>   #include <xen/numa.h>
> >>> +#include <xen/libfdt/libfdt.h>
> >>>   #include <xen/device_tree.h>
> >>>   #include <asm/setup.h>
> >>>
> >>>   s8 device_tree_numa = 0;
> >>>   static nodemask_t processor_nodes_parsed __initdata;
> >>> +static nodemask_t memory_nodes_parsed __initdata;
> >>>
> >>>   static int srat_disabled(void)
> >>>   {
> >>> @@ -55,6 +57,79 @@ static int __init dtb_numa_processor_affinity_init(nodeid_t node)
> >>>       return 0;
> >>>   }
> >>>
> >>> +/* Callback for parsing of the memory regions affinity */
> >>> +static int __init dtb_numa_memory_affinity_init(nodeid_t node,
> >>> +                                paddr_t start, paddr_t size)
> >>> +{
> >>> +    struct node *nd;
> >>> +    paddr_t end;
> >>> +    int i;
> >>> +
> >>> +    if ( srat_disabled() )
> >>> +        return -EINVAL;
> >>> +
> >>> +    end = start + size;
> >>> +    if ( num_node_memblks >= NR_NODE_MEMBLKS )
> >>> +    {
> >>> +        dprintk(XENLOG_WARNING,
> >>> +                "Too many numa entry, try bigger NR_NODE_MEMBLKS \n");
> >>> +        bad_srat();
> >>> +        return -EINVAL;
> >>> +    }
> >>> +
> >>> +    /* It is fine to add this area to the nodes data it will be used later */
> >>> +    i = conflicting_memblks(start, end);
> >>> +    /* No conflicting memory block, we can save it for later usage */;
> >>> +    if ( i < 0 )
> >>> +        goto save_memblk;
> >>> +
> >>> +    if ( memblk_nodeid[i] == node ) {
> >>> +        /*
> >>> +         * Overlaps with other memblk in the same node, warning here.
> >>> +         * This memblk will be merged with conflicted memblk later.
> >>> +         */
> >>> +        printk(XENLOG_WARNING
> >>> +               "DT: NUMA NODE %u (%"PRIx64
> >>> +               "-%"PRIx64") overlaps with itself (%"PRIx64"-
> >> %"PRIx64")\n",
> >>> +               node, start, end,
> >>> +               node_memblk_range[i].start, node_memblk_range[i].end);
> >>> +    } else {
> >>> +        /*
> >>> +         * Conflict with memblk in other node, this is an error.
> >>> +         * The NUMA information is invalid, NUMA will be turn off.
> >>> +         */
> >>> +        printk(XENLOG_ERR
> >>> +               "DT: NUMA NODE %u (%"PRIx64"-%"
> >>> +               PRIx64") overlaps with NODE %u (%"PRIx64"-
> %"PRIx64")\n",
> >>> +               node, start, end, memblk_nodeid[i],
> >>> +               node_memblk_range[i].start, node_memblk_range[i].end);
> >>> +        bad_srat();
> >>> +        return -EINVAL;
> >>> +    }
> >>> +
> >>> +save_memblk:
> >>> +    nd = &nodes[node];
> >>> +    if ( !node_test_and_set(node, memory_nodes_parsed) ) {
> >>> +        nd->start = start;
> >>> +        nd->end = end;
> >>> +    } else {
> >>> +        if ( start < nd->start )
> >>> +            nd->start = start;
> >>> +        if ( nd->end < end )
> >>> +            nd->end = end;
> >>> +    }
> >>> +
> >>> +    printk(XENLOG_INFO "DT: NUMA node %u %"PRIx64"-%"PRIx64"\n",
> >>> +           node, start, end);
> >>> +
> >>> +    node_memblk_range[num_node_memblks].start = start;
> >>> +    node_memblk_range[num_node_memblks].end = end;
> >>> +    memblk_nodeid[num_node_memblks] = node;
> >>> +    num_node_memblks++;
> >>
> >>
> >> Is it possible to have non-contigous ranges of memory for a single NUMA
> >> node?
> >>
> >> Looking at the DT bindings and Linux implementation, it seems possible.
> >> Here, it seems that node_memblk_range/memblk_nodeid could handle it,
> >> but nodes couldn't.
> >
> > Yes, you're right. I copied this code from x86 ACPI NUMA. Does ACPI allow
> > non-contiguous ranges of memory for a single NUMA node too?
> 
> I couldn't find any restriction for ACPI. Although, I only briefly
> looked at the spec.
> 
> > If yes, I think
> > this will affect x86 ACPI NUMA too. In the next version, we plan to merge
> > dtb_numa_memory_affinity_init and acpi_numa_memory_affinity_init into a
> > neutral function, so we can fix them both at the same time.
> >
> > If not, maybe we have to keep separate implementations for DT and ACPI here.
> 
> I am not entirely sure what you mean. Are you saying that if ACPI doesn't
> allow non-contiguous ranges of memory, then we should keep the
> implementations separate?
> 
> If so, then I disagree with that. It is fine to have code that supports
> more than what a firmware table supports. The main benefit is less code
> and therefore less long term maintenance (with the current solution we
> would need to check both the ACPI and DT implementation if there is a
> bug in one).
> 

Yes, I agree.
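
Just to make the idea a bit more concrete, below is a rough sketch of the
kind of neutral helper I have in mind (the name numa_update_node_memblk()
and its exact shape are only illustrative, not the final patch). The point
is to treat only the per-memblk arrays as authoritative when recording a
range, so a node made of several non-contiguous ranges is handled
naturally, and the nodes[] span can be derived from those entries later:

    /*
     * Sketch only: a firmware-neutral helper that both
     * dtb_numa_memory_affinity_init() and acpi_numa_memory_affinity_init()
     * could call once they have extracted (node, start, size) from their
     * respective tables.
     */
    static int __init numa_update_node_memblk(nodeid_t node,
                                              paddr_t start, paddr_t size)
    {
        paddr_t end = start + size;
        int i;

        if ( srat_disabled() )
            return -EINVAL;

        if ( num_node_memblks >= NR_NODE_MEMBLKS )
        {
            dprintk(XENLOG_WARNING,
                    "Too many NUMA entries, try a bigger NR_NODE_MEMBLKS\n");
            bad_srat();
            return -EINVAL;
        }

        i = conflicting_memblks(start, end);
        if ( i >= 0 && memblk_nodeid[i] != node )
        {
            /* Overlap with a different node: the firmware table is broken. */
            bad_srat();
            return -EINVAL;
        }

        /*
         * Record only the per-memblk range here; non-contiguous ranges
         * for the same node simply become separate entries.
         */
        node_set(node, memory_nodes_parsed);
        node_memblk_range[num_node_memblks].start = start;
        node_memblk_range[num_node_memblks].end = end;
        memblk_nodeid[num_node_memblks] = node;
        num_node_memblks++;

        return 0;
    }

The DT and ACPI front-ends would then only differ in how they walk their
firmware tables before calling this, which should also make it easier to
fix the non-contiguous case in one place.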

> Cheers,
> 
> --
> Julien Grall

 

