[Xen-devel] [PATCH] xen: credit1: on vCPU wakeup, kick away current only if it makes sense



When waking up a vCPU, __runq_tickle() is called to pick a
pCPU on which the new vCPU can run (which one depends on how
the priority of the new vCPU compares with the priorities of
the vCPUs that are already running).

If there is no idle processor on which the new vCPU can
run (e.g., because of pinning/affinity), we try to migrate
away the vCPU that is currently running on the new vCPU's
processor (i.e., the processor on which the vCPU is waking
up).
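
For reference, this is a simplified sketch of the pre-patch
kick-away logic in __runq_tickle() (condensed from the hunk this
patch touches, with the stat counters omitted; see the diff below
for the exact code):

    if ( new_idlers_empty && new->pri > cur->pri )
    {
        /* Unconditionally ask the scheduler to migrate cur away... */
        set_bit(_VPF_migrating, &cur->vcpu->pause_flags);
        /* ...and tickle this pCPU, so that new can preempt cur. */
        __cpumask_set_cpu(cpu, &mask);
    }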

Now, trying to migrate a vCPU has the effect of pushing it
through a

 running --> offline --> runnable

transition, which, in turn, has the following negative
effects:

 1) Credit1 counts that as a wakeup, and it BOOSTs the
    vCPU, even if it is a CPU-bound one, which wouldn't
    normally have deserved boosting (see the sketch after
    this list). This can prevent legitimately I/O-bound
    vCPUs from getting hold of the processor until such
    spurious boosting expires, hurting performance!

 2) since the vCPU fails the vcpu_runnable() test
    (within the call to csched_schedule() that follows
    the wakeup, as a consequence of tickling), the
    scheduling rate-limiting mechanism is also fooled,
    i.e., the context switch happens even if less than
    the minimum execution time has passed.
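
To see why 1) happens: Credit1's wakeup path promotes any waking
vCPU that is still UNDER its fair share to BOOST priority. The
sketch below is modelled on csched_vcpu_wake() (illustrative, not
verbatim). A vCPU bounced through the migration path above comes
back in through this same code, so even a CPU-bound vCPU gets
boosted:

    /* Sketch of Credit1's boost-on-wakeup (after csched_vcpu_wake()). */
    if ( svc->pri == CSCHED_PRI_TS_UNDER &&
         !test_bit(CSCHED_FLAG_VCPU_PARKED, &svc->flags) )
    {
        /*
         * The vCPU was not over its fair share when it "woke up":
         * assume it is I/O-bound and give it BOOST priority.
         */
        svc->pri = CSCHED_PRI_TS_BOOST;
    }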

In particular, 1) has been reported to cause the following
issue:

 * VM-I/O: 1 vCPU pinned to a pCPU, running netperf
 * VM-CPU: 1 vCPU pinned to the same pCPU, running a busy
           CPU loop
 ==> Only VM-I/O: throughput is 806.64 Mbps
 ==> VM-I/O + VM-CPU: throughput is 166.50 Mbps

This patch solves (for the above scenario) the problem
by checking whether or not it makes sense to try to
migrate away the vCPU currently running on the processor.
In fact, if there are no idle processors where such a vCPU
can execute, attempting the migration is just futile
(harmful, actually!).
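
For context: the only effect of setting _VPF_migrating is to make
cur fail the vcpu_runnable() check at the next invocation of
csched_schedule(), which is what forces the running --> offline -->
runnable round trip described above. The sketch below is modelled
on Xen's vcpu_runnable(); treat the details as illustrative:

    /* Any bit set in pause_flags (e.g., _VPF_migrating) makes this 0. */
    static inline bool_t vcpu_runnable(struct vcpu *v)
    {
        return !(v->pause_flags |
                 atomic_read(&v->pause_count) |
                 atomic_read(&v->domain->pause_count));
    }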

With this patch, in the above configuration, results are:

 ==> Only VM-I/O: throughput is 807.18 Mbps
 ==> VM-I/O + VM-CPU: throughput is 731.66 Mbps

Reported-by: Kun Suo <ksuo@xxxxxxxx>
Signed-off-by: Dario Faggioli <dario.faggioli@xxxxxxxxxx>
Tested-by: Kun Suo <ksuo@xxxxxxxx>
---
Cc: George Dunlap <george.dunlap@xxxxxxxxxx>
Cc: suokun <suokunstar@xxxxxxxxx>
---
Note that, about 1) above, it is IMO _wrong_ that Credit1 treats wakeups
resulting from migration of a vCPU to another pCPU as "regular wakeups",
hence granting BOOST priority to the vCPUs experiencing that. However:
 - fixing that is non-trivial, and requires being done in its own patch;
 - that is orthogonal to the fix being introduced here.  That is to say,
   even when Credit1 is eventually fixed not to boost migrating vCPUs,
   this patch will still be correct and necessary.

Regards,
Dario
---
 xen/common/sched_credit.c |   22 +++++++++++++++-------
 1 file changed, 15 insertions(+), 7 deletions(-)

diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
index b8f28fe..1b30e67 100644
--- a/xen/common/sched_credit.c
+++ b/xen/common/sched_credit.c
@@ -426,9 +426,10 @@ __runq_tickle(unsigned int cpu, struct csched_vcpu *new)
 
             /*
              * If there are no suitable idlers for new, and it's higher
-             * priority than cur, ask the scheduler to migrate cur away.
-             * We have to act like this (instead of just waking some of
-             * the idlers suitable for cur) because cur is running.
+             * priority than cur, check whether we can migrate cur away.
+             * (We have to do it indirectly, via _VPF_migrating, instead
+             * of just tickling any idler suitable for cur, because cur
+             * is running.)
              *
              * If there are suitable idlers for new, no matter priorities,
              * leave cur alone (as it is running and is, likely, cache-hot)
@@ -437,11 +438,18 @@ __runq_tickle(unsigned int cpu, struct csched_vcpu *new)
              */
             if ( new_idlers_empty && new->pri > cur->pri )
             {
+                csched_balance_cpumask(cur->vcpu, balance_step,
+                                       csched_balance_mask(cpu));
+                if ( cpumask_intersects(csched_balance_mask(cpu),
+                                        &idle_mask) )
+                {
+                    SCHED_VCPU_STAT_CRANK(cur, kicked_away);
+                    SCHED_VCPU_STAT_CRANK(cur, migrate_r);
+                    SCHED_STAT_CRANK(migrate_kicked_away);
+                    set_bit(_VPF_migrating, &cur->vcpu->pause_flags);
+                }
+                /* Tickle cpu anyway, to let new preempt cur. */
                 SCHED_STAT_CRANK(tickle_idlers_none);
-                SCHED_VCPU_STAT_CRANK(cur, kicked_away);
-                SCHED_VCPU_STAT_CRANK(cur, migrate_r);
-                SCHED_STAT_CRANK(migrate_kicked_away);
-                set_bit(_VPF_migrating, &cur->vcpu->pause_flags);
                 __cpumask_set_cpu(cpu, &mask);
             }
             else if ( !new_idlers_empty )

