[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

RE: [PATCH v3 8/9] xen/x86: add detection of memory interleaves for different nodes


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Wei Chen <Wei.Chen@xxxxxxx>
  • Date: Thu, 19 May 2022 03:37:05 +0000
  • Accept-language: en-US
  • Arc-authentication-results: i=2; mx.microsoft.com 1; spf=pass (sender ip is 63.35.35.123) smtp.rcpttodomain=lists.xenproject.org smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com; arc=pass (0 oda=1 ltdi=1 spf=[1,1,smtp.mailfrom=arm.com] dkim=[1,1,header.d=arm.com] dmarc=[1,1,header.from=arm.com])
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass header.d=arm.com; arc=none
  • Arc-message-signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=Xs4zjADiaEX/YriP9sNFK7JlmaSd1WYyrSb7ClwrtMA=; b=hx0l6yLrecPD7KatwgJ2sD4u+INKhWYjA+nAlD6Q9Mkify+B0laVxyr/LHsZzTbOPJRSVRpts5bVhk/eKTz97fCt7y2Eta2e+X/ohG7NVH0n8DigH5wz5XvpyemNBf6Xejre6ZcrbMIQtC0+qVeDtOOa2WRVPk4P9fB2oP/INIYLkD+vb6FL7SE8dXx+feGxB5yjQGA7t3Dds+87A+zH3NlZ2ExxV6nROh14Zq8sHBSixoP3F6M0+tXQt0lwK6kFVV0xnppf+gxNKfzLRvcd6owkS5OEILaPNmeZWKst5u+xMoAz5EMozWSPpkMUOS5BMEbeLDK5H66Et/AWQlIlCQ==
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=Xs4zjADiaEX/YriP9sNFK7JlmaSd1WYyrSb7ClwrtMA=; b=ZJw2/v+DivS69JSBv4jATD/NZCs6QIDzFZUKnYn6qOkgdI65miqKNHwAELN11SkFVEluvIqEAmoCAvV2haStQ/ShZRiRqjXwhLw4gTcDOm2XEbVgjCY2PMJLv3Suu7BLE4vlI07OdqrFeIdqTGQH6MO3tO1UrG62dzom2vyeI7v2rTLO+KYqxD+TnwIz7tUn0rZHqzncQlXs131wlnJsSuX4DCGjF6FijA8EIMbjLWpttu7sYMo62AYVBkrYl75XyIMFrD7rMKui6JTQ+OuUxS1CPwxl3xqwz8FimddPdPKbQbjf0F1AY8gmu/SmKugKWN3k8sSNiCOjLMk6h5CkYQ==
  • Arc-seal: i=2; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=pass; b=FBBEKbXRGWRX8WUMTgmo7yCffzOWXw0PIvnZBY8CFRGzJTclt0rC8SHW1vIWZsC/MxmmNANuFD1yU3EvJMHIk/Uo5cFvdxxzIO+jyi6DIRcoXUUzurSuCfE93/x6HmsN6gcD4UA20I9uIPyCRYxd6aYRvEZ7Zwdl/Rc/qp1e6N0lZMtojebd8B2JMFZq8NYLg7RlzhzphBgrK0zexok7oK0nIVl4M9/oLOC5Giwk1/u3Kj1LfHa8itt5DbbaUvDr4gUPxUpAQN8tnosz4ce5MqQzV97cnvhQjGHNjgy0W38JGvw8i+LKilUahe5+d8gNwqWza5ij5wdQeIvtC49Uig==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=mGyfaF5CsmFGERQ0wkLAcbZYN9KUspgoOvQJiXxbVX0922hn8hQaEWaEDrIW3BBJtbsG3C4Lq94HxH5qrJtu0hMpIsF+nPrqi7gND+RdlabRUosrpFABGiLR3c66AyJ4Oy3DS9UCSOuypA/uOwEjkNSEGppsqJsBfmCAogVqdY/eNLZ9ewHuN29aYV/YZs5ToQXOIbLZoSE5KSSYGGZuyWVLbIUh2CDhjbOFX/9yoKrnEQP0yEpN2dhaV1DdShRC4JjHiqq/vhUWOiXdnXCLZ4sBad9hkWV9xBguc6DUHaJH0d19HgFfEOySyVvfYBNGAmQ0KrBJstB49SiuMC+/yw==
  • Authentication-results-original: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com;
  • Cc: nd <nd@xxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Jiamei Xie <Jiamei.Xie@xxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Thu, 19 May 2022 03:37:27 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Nodisclaimer: true
  • Original-authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com;
  • Thread-index: AQHYZNkRevkwA8X5Q0qos/NOAm18uq0krTiAgADb+cA=
  • Thread-topic: [PATCH v3 8/9] xen/x86: add detection of memory interleaves for different nodes

Hi Jan,

> -----Original Message-----
> From: Jan Beulich <jbeulich@xxxxxxxx>
> Sent: 2022年5月18日 21:31
> To: Wei Chen <Wei.Chen@xxxxxxx>
> Cc: nd <nd@xxxxxxx>; Andrew Cooper <andrew.cooper3@xxxxxxxxxx>; Roger Pau
> Monné <roger.pau@xxxxxxxxxx>; Wei Liu <wl@xxxxxxx>; Jiamei Xie
> <Jiamei.Xie@xxxxxxx>; xen-devel@xxxxxxxxxxxxxxxxxxxx
> Subject: Re: [PATCH v3 8/9] xen/x86: add detection of memory interleaves
> for different nodes
> 
> On 11.05.2022 03:46, Wei Chen wrote:
> > --- a/xen/arch/x86/srat.c
> > +++ b/xen/arch/x86/srat.c
> > @@ -42,6 +42,12 @@ static struct node node_memblk_range[NR_NODE_MEMBLKS];
> >  static nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
> >  static __initdata DECLARE_BITMAP(memblk_hotplug, NR_NODE_MEMBLKS);
> >
> > +enum conflicts {
> > +   NO_CONFLICT = 0,
> 
> No need for the "= 0".
> 

Ack.

> > +   ERR_OVERLAP,
> 
> While this at least can be an error (the self-overlap case is merely
> warned about), ...
> 
> > +   ERR_INTERLEAVE,
> 
> ... I don't think this is, and hence I'd recommend to drop "ERR_".
> 

Oh, yes. I all drop it for all above enumerations.

> > @@ -119,20 +125,43 @@ int valid_numa_range(paddr_t start, paddr_t end,
> nodeid_t node)
> >     return 0;
> >  }
> >
> > -static __init int conflicting_memblks(paddr_t start, paddr_t end)
> > +static enum conflicts __init
> > +conflicting_memblks(nodeid_t nid, paddr_t start, paddr_t end,
> > +               paddr_t nd_start, paddr_t nd_end, int *mblkid)
> 
> Why "int"? Can the value passed back be negative?
> 

The caller "acpi_numa_memory_affinity_init" defines int for node memory
block id, and as a return value at the same time. So I haven't changed it.
As we don't use this "int" for return value any more, I agree, it will
never be negative, I would fix it.

> >  {
> >     int i;
> >
> > +   /*
> > +    * Scan all recorded nodes' memory blocks to check conflicts:
> > +    * Overlap or interleave.
> > +    */
> >     for (i = 0; i < num_node_memblks; i++) {
> >             struct node *nd = &node_memblk_range[i];
> > +           *mblkid = i;
> 
> Style: Please maintain a blank line between declaration(s) and
> statement(s).
> 

Ok.

> > @@ -310,42 +342,67 @@ acpi_numa_memory_affinity_init(const struct
> acpi_srat_mem_affinity *ma)
> >             bad_srat();
> >             return;
> >     }
> > +
> > +   /*
> > +    * For the node that already has some memory blocks, we will
> > +    * expand the node memory range temporarily to check memory
> > +    * interleaves with other nodes. We will not use this node
> > +    * temp memory range to check overlaps, because it will mask
> > +    * the overlaps in same node.
> > +    *
> > +    * Node with 0 bytes memory doesn't need this expandsion.
> > +    */
> > +   nd_start = start;
> > +   nd_end = end;
> > +   nd = &nodes[node];
> > +   if (nd->start != nd->end) {
> > +           if (nd_start > nd->start)
> > +                   nd_start = nd->start;
> > +
> > +           if (nd_end < end)
> 
> Did you mean nd->end here on the right side of < ? By intentionally

Oh! thanks for pointing out this one! Yes, right side should be nd->end.

> not adding "default:" in the body, you then also allow the compiler
> to point out that addition of a new enumerator also needs handling
> here.
> 

Did you mean, we need to add if ... else ... in this block? If yes,
is it ok to update this block like:
        if (nd->start != nd->end) {
                nd_start = min(nd_start, nd->start);
                nd_end = max(nd_end, nd->end);
        }
?

> > +                   nd_end = nd->end;
> > +   }
> > +
> >     /* It is fine to add this area to the nodes data it will be used
> later*/
> > -   i = conflicting_memblks(start, end);
> > -   if (i < 0)
> > -           /* everything fine */;
> > -   else if (memblk_nodeid[i] == node) {
> > -           bool mismatch = !(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) !=
> > -                           !test_bit(i, memblk_hotplug);
> > -
> > -           printk("%sSRAT: PXM %u (%"PRIpaddr"-%"PRIpaddr") overlaps with
> itself (%"PRIpaddr"-%"PRIpaddr")\n",
> > -                  mismatch ? KERN_ERR : KERN_WARNING, pxm, start, end,
> > -                  node_memblk_range[i].start, node_memblk_range[i].end);
> > -           if (mismatch) {
> > +   status = conflicting_memblks(node, start, end, nd_start, nd_end, &i);
> > +   if (status == ERR_OVERLAP) {
> 
> Please use switch(status) when checking enumerated values.
> 

Ok, I will do it.

> > +           if (memblk_nodeid[i] == node) {
> > +                   bool mismatch = !(ma->flags &
> > +                                   ACPI_SRAT_MEM_HOT_PLUGGABLE) !=
> > +                                   !test_bit(i, memblk_hotplug);
> 
> Style: The middle line wants indenting by two more characters.
> 

Yes, I will do it.

> > +
> > +                   printk("%sSRAT: PXM %u (%"PRIpaddr"-%"PRIpaddr")
> overlaps with itself (%"PRIpaddr"-%"PRIpaddr")\n",
> > +                          mismatch ? KERN_ERR : KERN_WARNING, pxm, start,
> > +                          end, node_memblk_range[i].start,
> > +                          node_memblk_range[i].end);
> > +                   if (mismatch) {
> > +                           bad_srat();
> > +                           return;
> > +                   }
> > +           } else {
> > +                   printk(KERN_ERR
> > +                          "SRAT: PXM %u (%"PRIpaddr"-%"PRIpaddr") overlaps
> with PXM %u (%"PRIpaddr"-%"PRIpaddr")\n",
> > +                          pxm, start, end, node_to_pxm(memblk_nodeid[i]),
> > +                          node_memblk_range[i].start,
> > +                          node_memblk_range[i].end);
> >                     bad_srat();
> >                     return;
> >             }
> > -   } else {
> > +   } else if (status == ERR_INTERLEAVE) {
> >             printk(KERN_ERR
> > -                  "SRAT: PXM %u (%"PRIpaddr"-%"PRIpaddr") overlaps with
> PXM %u (%"PRIpaddr"-%"PRIpaddr")\n",
> > -                  pxm, start, end, node_to_pxm(memblk_nodeid[i]),
> > +                  "SRAT: Node %u: (%"PRIpaddr"-%"PRIpaddr") interleaves
> with node %u memblk (%"PRIpaddr"-%"PRIpaddr")\n",
> > +                  node, nd_start, nd_end, memblk_nodeid[i],
> 
> Please log pxm (not node) here just like is done in the overlap case.
> The remote node ID will then require converting to PXM, of course.
> 

Ok, will use PXM here. But I have question for upcoming changes, if we
move this part of code to common. As device tree NUMA doesn't have
PXM concept (even I can use a fake node_to_pxm to do 1:1 mapping), so
can we still use PXM here?

> Jan


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.