[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v3] xen/dom0less: support for vcpu affinity



Hi Stefano,

On 18/02/2025 20:29, Stefano Stabellini wrote:
Add vcpu affinity to the dom0less bindings. Example:

     dom1 {
             ...
             cpus = <4>;
             vcpu0 {
                    compatible = "xen,vcpu-affinity";

I would prefer if the compatible is "xen,vcpu". This would allow us to extend for anything that vCPU specific. I would also look less odd if someone ...

                    id = <0>;
                    hard-affinity = "4-7";

... doesn't specify hard-affinity which is optional.

             };
             vcpu1 {
                    compatible = "xen,vcpu-affinity";
                    id = <1>;
                    hard-affinity = "0-3";

NIT: This example is exactly the same as vcpu0. How about changing to a list of range/single value? This would make clear that a mix is possible.

             };
             vcpu2 {
                    compatible = "xen,vcpu-affinity";
                    id = <2>;
                    hard-affinity = "1,6";
             };
             ...

Note that the property hard-affinity is optional. It is possible to add
other properties in the future not only to specify soft affinity, but
also to provide more precise methods for configuring affinity. For
instance, on ARM the MPIDR could be use to specify the pCPU. For now, it
is left to the future.

Signed-off-by: Xenia Ragiadakou <xenia.ragiadakou@xxxxxxx>
Signed-off-by: Stefano Stabellini <stefano.stabellini@xxxxxxx>
---
Changes in v3:
- improve commit message
- improve binding doc
- add panic on invalid pCPU
- move parsing to a separate function

diff --git a/docs/misc/arm/device-tree/booting.txt 
b/docs/misc/arm/device-tree/booting.txt
index 9c881baccc..10e55c825c 100644
--- a/docs/misc/arm/device-tree/booting.txt
+++ b/docs/misc/arm/device-tree/booting.txt
@@ -324,6 +324,27 @@ The ramdisk sub-node has the following properties:
      property because it will be created by the UEFI stub on boot.
      This option is needed only when UEFI boot is used.
+Under the "xen,domain" compatible node, it is possible optionally to add
+one or more sub-nodes to configure vCPU affinity. The vCPU affinity
+sub-node has the following properties:
+
+- compatible
+
+    "xen,vcpu-affinity"
+
+- id
+
+    A 32-bit integer that specifies the vCPU id. 0 is the first vCPU.
+    The last vCPU is cpus-1, where "cpus" is the number of vCPUs
+    specified with the "cpus" property under the "xen,domain" node.

I think it would be worth mentioning that each node must have a unique ID. It is not necessary to check in the code, but it would avoid the question of what happen if someone specify twice the VCPU with different affinity.

+
+- hard-affinity
+
+    Optional. A string specifying the hard affinity configuration for the
+    vCPU: a comma-separated list of pCPUs or ranges of pCPUs is used.
+    Ranges are hyphen-separated intervals (such as `0-4`) and are inclusive
+    on both sides. The numbers refer to pCPU ids.

Technically MPIDRs are pCPUs ID. So I would add "logical" in front of pCPU ids to make clear what IDs are we talking about

+
Example
  =======

No update to the example?

diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
index 49d1f14d65..e364820189 100644
--- a/xen/arch/arm/dom0less-build.c
+++ b/xen/arch/arm/dom0less-build.c
@@ -810,6 +810,68 @@ static int __init construct_domU(struct domain *d,
      return rc;
  }
+static void __init domain_vcpu_affinity(struct domain *d,
+                                        const struct dt_device_node *node)
> +{> +    const char *hard_affinity_str = NULL;
+    struct dt_device_node *np;
+    uint32_t val;
+    int rc;

Can you expain why 'rc', 'val', 'hard_affinity_str' are defined outside of the loop when ...

+
+    dt_for_each_child_node(node, np)
+    {
+        const char *s;
+        struct vcpu *v;
+        cpumask_t affinity;

... they are not? Yet they have the same property (i.e. only called within the loop).

+
+        if ( !dt_device_is_compatible(np, "xen,vcpu-affinity") )
+            continue;
+
+        if ( !dt_property_read_u32(np, "id", &val) )

Looking at the binding you wrote, "id" is mandatory. So I think we should throw an error if it is not present.

+            continue;
> +> +        if ( val >= d->max_vcpus )
+            panic("Invalid vcpu_id %u for domain %s\n", val, 
dt_node_name(node));

NIT: Maybe print the maximum number of vCPUs? This would make easier to know what's wrong with the ID.

+
+        v = d->vcpu[val];
+        rc = dt_property_read_string(np, "hard-affinity", &hard_affinity_str);
+        if ( rc < 0 )
+            continue;
+
+        s = hard_affinity_str;

OOI, you don't seem to use hard_affinity_str afterwards, so why can't we use 'hard_affinity_str' directly and avoid an extra variable?

+        cpumask_clear(&affinity);
+        while ( *s != '\0' )
+        {
+            unsigned int start, end;
+
+            start = simple_strtoul(s, &s, 0);
+
+            if ( *s == '-' )    /* Range */
+            {
+                s++;
+                end = simple_strtoul(s, &s, 0);
+            }
+            else                /* Single value */
+                end = start;
+
+            if ( end >= nr_cpu_ids )
+                panic("Invalid pCPU %u for domain %s\n", end, 
dt_node_name(node));
+
+            for ( ; start <= end; start++ )
+                cpumask_set_cpu(start, &affinity);
+
+            if ( *s == ',' )
+                s++;
+            else if ( *s != '\0' )
+                break;

NIT: We seem to have various place in Xen parsing range (e.g. init_boot_pages()). Could we provide an helper to avoid duplicating the code?

+        }
+
+        rc = vcpu_set_hard_affinity(v, &affinity);
+        if ( rc )
+            panic("vcpu%d: failed to set hard affinity\n", v->vcpu_id);

Can we print the domain name like you do before and also 'rc'? This would help any debugging.


+    }
+}
+
  void __init create_domUs(void)
  {
      struct dt_device_node *node;
@@ -992,6 +1054,8 @@ void __init create_domUs(void)
          if ( rc )
              panic("Could not set up domain %s (rc = %d)\n",
                    dt_node_name(node), rc);
+
+        domain_vcpu_affinity(d, node);

Shouldn't this call be part of construct_domU?

      }
  }

Cheers,

--
Julien Grall




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.