
Re: [Xen-devel] [PATCH] libxl: avoid considering pCPUs outside of the cpupool during NUMA placement



On 21/10/16 12:29, Wei Liu wrote:
> On Fri, Oct 21, 2016 at 11:56:14AM +0200, Dario Faggioli wrote:
>> During NUMA automatic placement, the information
>> of how many vCPUs can run on what NUMA nodes is used,
>> in order to spread the load as evenly as possible.
>>
>> Such information is derived from vCPU hard and soft
>> affinity, but that is not enough. In fact, affinity
>> can be set to be a superset of the pCPUs that belong
>> to the cpupool in which a domain is but, of course,
>> the domain will never run on pCPUs outside of its
>> cpupool.
>>
>> Take this into account in the placement algorithm.
>>
>> Signed-off-by: Dario Faggioli <dario.faggioli@xxxxxxxxxx>
>> Reported-by: George Dunlap <george.dunlap@xxxxxxxxxx>
>> ---
>> Cc: Ian Jackson <ian.jackson@xxxxxxxxxxxxx>
>> Cc: Wei Liu <wei.liu2@xxxxxxxxxx>
>> Cc: George Dunlap <george.dunlap@xxxxxxxxxx>
>> Cc: Juergen Gross <jgross@xxxxxxxx>
>> Cc: Anshul Makkar <anshul.makkar@xxxxxxxxxx>
>> ---
>> Wei, this is a bugfix, so I think it should go in 4.8.
>>
> 
> Yes. I agree.
> 
>> Ian, this is a bugfix, so I think it is a backporting candidate.
>>
>> Also, note that this function does not respect the libxl coding style, as far
>> as error handling is concerned. However, given that I'm asking for it to go in
>> now and to be backported, I've tried to keep the changes to the minimum.
>>
>> I'm up for a follow-up patch for 4.9 to make the style compliant.
>>
>> Thanks, Dario
>> ---
>>  tools/libxl/libxl_numa.c |   25 ++++++++++++++++++++++---
>>  1 file changed, 22 insertions(+), 3 deletions(-)
>>
>> diff --git a/tools/libxl/libxl_numa.c b/tools/libxl/libxl_numa.c
>> index 33289d5..f2a719d 100644
>> --- a/tools/libxl/libxl_numa.c
>> +++ b/tools/libxl/libxl_numa.c
>> @@ -186,9 +186,12 @@ static int nr_vcpus_on_nodes(libxl__gc *gc, libxl_cputopology *tinfo,
>>  {
>>      libxl_dominfo *dinfo = NULL;
>>      libxl_bitmap dom_nodemap, nodes_counted;
>> +    libxl_cpupoolinfo cpupool_info;
>>      int nr_doms, nr_cpus;
>>      int i, j, k;
>>  
>> +    libxl_cpupoolinfo_init(&cpupool_info);
>> +
> 
> Please move this into the loop below, see (*).

Why? libxl_cpupoolinfo_dispose() clears cpupool_info, so initializing it once
before the loop is enough: disposing it at the end of each iteration leaves it
ready for the next one.

> 
>>      dinfo = libxl_list_domain(CTX, &nr_doms);
>>      if (dinfo == NULL)
>>          return ERROR_FAIL;
>> @@ -205,12 +208,18 @@ static int nr_vcpus_on_nodes(libxl__gc *gc, libxl_cputopology *tinfo,
>>      }
>>  
>>      for (i = 0; i < nr_doms; i++) {
>> -        libxl_vcpuinfo *vinfo;
>> -        int nr_dom_vcpus;
>> +        libxl_vcpuinfo *vinfo = NULL;
> 
> This is not necessary because vinfo is written right away.

No, it is necessary: the first "goto next" comes before vinfo is written, and
the cleanup path frees vinfo.

> 
>> +        int cpupool, nr_dom_vcpus;
>> +
> 
> (*) here.
> 
>> +        cpupool = libxl__domain_cpupool(gc, dinfo[i].domid);
>> +        if (cpupool < 0)
>> +            goto next;
>> +        if (libxl_cpupool_info(CTX, &cpupool_info, cpupool))
>> +            goto next;
>>  
>>          vinfo = libxl_list_vcpu(CTX, dinfo[i].domid, &nr_dom_vcpus, &nr_cpus);
>>          if (vinfo == NULL)
>> -            continue;
>> +            goto next;
>>  
>>          /* Retrieve the domain's node-affinity map */
>>          libxl_domain_get_nodeaffinity(CTX, dinfo[i].domid, &dom_nodemap);
>> @@ -220,6 +229,12 @@ static int nr_vcpus_on_nodes(libxl__gc *gc, libxl_cputopology *tinfo,
>>               * For each vcpu of each domain, it must have both vcpu-affinity
>>               * and node-affinity to (a pcpu belonging to) a certain node to
>>               * cause an increment in the corresponding element of the array.
>> +             *
>> +             * Note that we also need to check whether the cpu actually
>> +             * belongs to the domain's cpupool (the cpupool of the domain
>> +             * being checked). In fact, it could be that the vcpu has affinity
>> +             * with cpus in suitable_cpumask, but that are not in its own
>> +             * cpupool, and we don't want to consider those!
>>               */
>>              libxl_bitmap_set_none(&nodes_counted);
>>              libxl_for_each_set_bit(k, vinfo[j].cpumap) {
>> @@ -228,6 +243,7 @@ static int nr_vcpus_on_nodes(libxl__gc *gc, libxl_cputopology *tinfo,
>>                  int node = tinfo[k].node;
>>  
>>                  if (libxl_bitmap_test(suitable_cpumap, k) &&
>> +                    libxl_bitmap_test(&cpupool_info.cpumap, k) &&
>>                      libxl_bitmap_test(&dom_nodemap, node) &&
>>                      !libxl_bitmap_test(&nodes_counted, node)) {
>>                      libxl_bitmap_set(&nodes_counted, node);
>> @@ -236,7 +252,10 @@ static int nr_vcpus_on_nodes(libxl__gc *gc, libxl_cputopology *tinfo,
>>              }
>>          }
>>  
>> + next:
>> +        libxl_cpupoolinfo_dispose(&cpupool_info);
>>          libxl_vcpuinfo_list_free(vinfo, nr_dom_vcpus);
>> +        vinfo = NULL;
> 
> This is not necessary, as vinfo is rewritten at the beginning of every
> loop iteration.
> 
>>      }
>>  
>>      libxl_bitmap_dispose(&dom_nodemap);
>>


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
