Re: [Xen-devel] [PATCH v2 24/48] xen: switch from for_each_vcpu() to for_each_sched_unit()
 
To: Jan Beulich <jbeulich@xxxxxxxx>
From: Juergen Gross <jgross@xxxxxxxx>
Date: Thu, 12 Sep 2019 16:02:17 +0200
Cc: Tim Deegan <tim@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>, George Dunlap <George.Dunlap@xxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Ian Jackson <ian.jackson@xxxxxxxxxxxxx>, Dario Faggioli <dfaggioli@xxxxxxxx>, Julien Grall <julien.grall@xxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
Delivery-date: Thu, 12 Sep 2019 14:02:36 +0000
List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
 
 
 
On 09.09.19 17:14, Jan Beulich wrote:
 
On 09.08.2019 16:58, Juergen Gross wrote:
 
@@ -504,22 +511,21 @@ int sched_move_domain(struct domain *d, struct cpupool *c)
      if ( IS_ERR(domdata) )
          return PTR_ERR(domdata);
  
-    vcpu_priv = xzalloc_array(void *, d->max_vcpus);
-    if ( vcpu_priv == NULL )
+    unit_priv = xzalloc_array(void *, d->max_vcpus);
 
 
I find it confusing that an array of units (as per the use below)
is dimensioned by the domain's vCPU count. Isn't there a correlation
between vCPU IDs and unit IDs, perhaps along the lines of CPU
APIC (thread), core, and socket IDs? If so, the array size could
be bounded here by a smaller (down the road) value.
 
 
I'll add a comment to that effect in this patch, and when the number
of vcpus per unit gets added (patch 31) I'll modify the array size.
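For illustration, the later sizing could end up looking something like
this (a sketch only; cpupool_get_granularity() is an assumed helper
returning the number of vcpus per unit in cpupool c, and the error
handling is condensed):

    /*
     * Sketch: size the array by scheduling units rather than vcpus.
     * cpupool_get_granularity() is an assumed accessor here.
     */
    unsigned int gran = cpupool_get_granularity(c);

    unit_priv = xzalloc_array(void *, DIV_ROUND_UP(d->max_vcpus, gran));
    if ( unit_priv == NULL )
        return -ENOMEM;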
 
 
@@ -880,18 +889,36 @@ void vcpu_force_reschedule(struct vcpu *v)
      vcpu_migrate_finish(v);
  }
  
+static bool sched_check_affinity_broken(struct sched_unit *unit)
 
const
 
 
Okay.
 
 
+{
+    struct vcpu *v;
 
const
 
 
Okay.
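With both const qualifiers applied, the helper would take roughly this
shape (a sketch; for_each_sched_unit_vcpu() is the per-unit vcpu
iterator introduced earlier in this series, and the loop body is
inferred from the usage in restore_vcpu_affinity() below):

    static bool sched_check_affinity_broken(const struct sched_unit *unit)
    {
        const struct vcpu *v;

        /* The unit's affinity counts as broken if any vcpu's is. */
        for_each_sched_unit_vcpu ( unit, v )
            if ( v->affinity_broken )
                return true;

        return false;
    }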
 
 
@@ -910,18 +937,20 @@ void restore_vcpu_affinity(struct domain *d)
                      cpupool_domain_cpumask(d));
          if ( cpumask_empty(cpumask_scratch_cpu(cpu)) )
          {
-            if ( v->affinity_broken )
+            if ( sched_check_affinity_broken(unit) )
              {
-                sched_set_affinity(v, unit->cpu_hard_affinity_saved, NULL);
-                v->affinity_broken = 0;
+                sched_set_affinity(unit->vcpu_list,
+                                   unit->cpu_hard_affinity_saved, NULL);
+                sched_reset_affinity_broken(unit);
                  cpumask_and(cpumask_scratch_cpu(cpu), unit->cpu_hard_affinity,
                              cpupool_domain_cpumask(d));
              }
  
              if ( cpumask_empty(cpumask_scratch_cpu(cpu)) )
              {
-                printk(XENLOG_DEBUG "Breaking affinity for %pv\n", v);
-                sched_set_affinity(v, &cpumask_all, NULL);
+                printk(XENLOG_DEBUG "Breaking affinity for %pv\n",
+                       unit->vcpu_list);
+                sched_set_affinity(unit->vcpu_list, &cpumask_all, NULL);
                  cpumask_and(cpumask_scratch_cpu(cpu), unit->cpu_hard_affinity,
                              cpupool_domain_cpumask(d));
              }
[...]
@@ -964,17 +992,18 @@ int cpu_disable_scheduler(unsigned int cpu)
  
      for_each_domain_in_cpupool ( d, c )
      {
-        for_each_vcpu ( d, v )
+        struct sched_unit *unit;
+
+        for_each_sched_unit ( d, unit )
          {
              unsigned long flags;
-            struct sched_unit *unit = v->sched_unit;
              spinlock_t *lock = unit_schedule_lock_irqsave(unit, &flags);
  
              cpumask_and(&online_affinity, unit->cpu_hard_affinity, c->cpu_valid);
              if ( cpumask_empty(&online_affinity) &&
                   cpumask_test_cpu(cpu, unit->cpu_hard_affinity) )
              {
-                if ( v->affinity_broken )
+                if ( unit->vcpu_list->affinity_broken )
                  {
                      /* The vcpu is temporarily pinned, can't move it. */
                      unit_schedule_unlock_irqrestore(lock, flags, unit);
@@ -982,14 +1011,15 @@ int cpu_disable_scheduler(unsigned int cpu)
                      break;
                  }
  
-                printk(XENLOG_DEBUG "Breaking affinity for %pv\n", v);
+                printk(XENLOG_DEBUG "Breaking affinity for %pv\n",
+                       unit->vcpu_list);
   
-                sched_set_affinity(v, &cpumask_all, NULL);
+                sched_set_affinity(unit->vcpu_list, &cpumask_all, NULL);
              }
  
-            if ( v->processor != cpu )
+            if ( sched_unit_cpu(unit) != sched_get_resource_cpu(cpu) )
              {
-                /* The vcpu is not on this cpu, so we can move on. */
+                /* The unit is not on this cpu, so we can move on. */
                  unit_schedule_unlock_irqrestore(lock, flags, unit);
                  continue;
              }
@@ -1002,17 +1032,17 @@ int cpu_disable_scheduler(unsigned int cpu)
               *  * the scheduler will always find a suitable solution, or
               *    things would have failed before getting in here.
               */
-            vcpu_migrate_start(v);
+            vcpu_migrate_start(unit->vcpu_list);
              unit_schedule_unlock_irqrestore(lock, flags, unit);
  
-            vcpu_migrate_finish(v);
+            vcpu_migrate_finish(unit->vcpu_list);
 
 
All the ->vcpu_list references look bogus considering where you're
moving, but I can only guess that all of this will need touching
again later in the series. I wonder though whether these wouldn't
better change into for-each-vCPU-in-unit loops right away.
 
 
The vcpu_migrate part especially is more complicated. I think it is
much easier to review with the purely mechanical changes split from
the logical ones.
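For comparison, Jan's suggested per-vcpu loop would look roughly like
this (a sketch; whether vcpu_migrate_start()/vcpu_migrate_finish() can
simply be applied to each vcpu of a unit is exactly the non-mechanical
question being deferred):

    /* Sketch only: migrate every vcpu of the unit individually. */
    struct vcpu *v;

    for_each_sched_unit_vcpu ( unit, v )
        vcpu_migrate_start(v);
    unit_schedule_unlock_irqrestore(lock, flags, unit);

    for_each_sched_unit_vcpu ( unit, v )
        vcpu_migrate_finish(v);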
 
 
              /*
               * The only caveat, in this case, is that a vcpu active in
               * the hypervisor may not be migratable. In that case, the
               * caller should try again after releasing and reacquiring
               * all locks.
               */
-            if ( v->processor == cpu )
+            if ( sched_unit_cpu(unit) == sched_get_resource_cpu(cpu) )
 
Is comparing the (pseudo) CPU values here the most efficient approach
in terms of generated code? Can't there be some pointer comparison
that's cheaper?
 
 
Yes, you are right. unit->res == get_sched_res(cpu) is equivalent.
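In code (a sketch; get_sched_res() maps a CPU number to its struct
sched_resource):

    /* Compare scheduling resources directly instead of CPU numbers. */
    if ( unit->res == get_sched_res(cpu) )
        ret = -EAGAIN;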
 
 
@@ -1023,8 +1053,8 @@ int cpu_disable_scheduler(unsigned int cpu)
  static int cpu_disable_scheduler_check(unsigned int cpu)
  {
      struct domain *d;
-    struct vcpu *v;
      struct cpupool *c;
+    struct vcpu *v;
  
      c = per_cpu(cpupool, cpu);
      if ( c == NULL )
 
Stray change?
 
 
Yes.
Juergen
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel 
 
    