commit 16381936ad320d010c7566c946a3e528f803e78a
Author: Dario Faggioli
Date:   Tue Oct 27 23:22:16 2015 +0100

    xen: credit1: on vCPU wakeup, kick away current only if it makes sense

    When waking up a vCPU, __runq_tickle() is called to allow the new
    vCPU to run on a pCPU (which one depends on the relationship between
    the priority of the new vCPU and the priorities of the vCPUs that
    are already running). If there is no idle processor on which the new
    vCPU can run (e.g., because of pinning/affinity), we try to migrate
    away the vCPU that is currently running on the new vCPU's processor
    (i.e., the processor on which the vCPU is waking up).

    Now, trying to migrate a vCPU has the effect of pushing it through a
    running --> offline --> runnable transition, which, in turn, has the
    following negative effects:

    1) Credit1 counts that as a wakeup, and BOOSTs the vCPU, even if it
       is a CPU-bound one that would not normally have deserved
       boosting. This can prevent legitimate IO-bound vCPUs from getting
       hold of the processor until such spurious boosting expires,
       hurting performance!

    2) Since the vCPU fails the vcpu_runnable() test (within the call to
       csched_schedule() that follows the wakeup, as a consequence of
       tickling), the scheduling rate-limiting mechanism is also fooled,
       i.e., the context switch happens even if less than the minimum
       execution time has passed.

    In particular, 1) has been reported to cause the following issue:

     * VM-I/O: 1 vCPU pinned to a pCPU, running netperf
     * VM-CPU: 1 vCPU pinned to the same pCPU, running a busy CPU loop

     ==> Only VM-I/O: throughput is 806.64 Mbps
     ==> VM-I/O + VM-CPU: throughput is 166.50 Mbps

    This patch solves the problem (for the above scenario) by checking
    whether it actually makes sense to try to migrate away the vCPU
    currently running on the processor: we should not even try if there
    is no idle processor on which such a vCPU can execute. In that case,
    attempting the migration is futile (harmful, actually!).

    With this patch, in the above configuration, the results are:

     ==> Only VM-I/O: throughput is 807.18 Mbps
     ==> VM-I/O + VM-CPU: throughput is 731.66 Mbps

    Note that, still about 1), it is _wrong_ that Credit1 treats wakeups
    resulting from the migration of a vCPU to another pCPU as "regular
    wakeups", hence granting BOOST priority to the vCPUs experiencing
    them. However:
     - fixing that is non-trivial, and deserves to be done in its own
       patch;
     - it is orthogonal to the fix being introduced here. That is to
       say, even when Credit1 is fixed not to boost migrating vCPUs,
       this patch will still be correct and necessary.

    Reported-by: suokun
    Signed-off-by: Dario Faggioli
---
Cc: George Dunlap
Cc: suokun

diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
index b8f28fe..1b30e67 100644
--- a/xen/common/sched_credit.c
+++ b/xen/common/sched_credit.c
@@ -426,9 +426,10 @@ __runq_tickle(unsigned int cpu, struct csched_vcpu *new)
             /*
              * If there are no suitable idlers for new, and it's higher
-             * priority than cur, ask the scheduler to migrate cur away.
-             * We have to act like this (instead of just waking some of
-             * the idlers suitable for cur) because cur is running.
+             * priority than cur, check whether we can migrate cur away.
+             * We have to do it indirectly, via _VPF_migrating (instead
+             * of just tickling any idler suitable for cur), because cur
+             * is running.
             *
             * If there are suitable idlers for new, no matter priorities,
             * leave cur alone (as it is running and is, likely, cache-hot)
@@ -437,11 +438,18 @@ __runq_tickle(unsigned int cpu, struct csched_vcpu *new)
             */
            if ( new_idlers_empty && new->pri > cur->pri )
            {
+               csched_balance_cpumask(cur->vcpu, balance_step,
+                                      csched_balance_mask(cpu));
+               if ( cpumask_intersects(csched_balance_mask(cpu),
+                                       &idle_mask) )
+               {
+                   SCHED_VCPU_STAT_CRANK(cur, kicked_away);
+                   SCHED_VCPU_STAT_CRANK(cur, migrate_r);
+                   SCHED_STAT_CRANK(migrate_kicked_away);
+                   set_bit(_VPF_migrating, &cur->vcpu->pause_flags);
+               }
+               /* Tickle cpu anyway, to let new preempt cur. */
                SCHED_STAT_CRANK(tickle_idlers_none);
-               SCHED_VCPU_STAT_CRANK(cur, kicked_away);
-               SCHED_VCPU_STAT_CRANK(cur, migrate_r);
-               SCHED_STAT_CRANK(migrate_kicked_away);
-               set_bit(_VPF_migrating, &cur->vcpu->pause_flags);
                __cpumask_set_cpu(cpu, &mask);
            }
            else if ( !new_idlers_empty )
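
For readers not familiar with the Xen tree, below is a minimal standalone
sketch (plain C, compilable outside the hypervisor) of the decision the
second hunk adds: only mark cur for migration if the set of idle pCPUs
intersects cur's affinity. The uint64_t bitmasks and the name
should_kick_cur_away() are illustrative stand-ins for Xen's cpumask_t and
the csched_balance_cpumask()/cpumask_intersects() machinery, not actual
hypervisor API.

/*
 * Standalone model (NOT Xen code) of the check added by this patch:
 * kicking the currently running vCPU away is only worthwhile if at
 * least one idle pCPU lies within that vCPU's affinity mask.
 * uint64_t stands in for cpumask_t: bit i set => pCPU i in the set.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef uint64_t cpu_set;

/* Illustrative equivalent of the new cpumask_intersects() check. */
static bool should_kick_cur_away(cpu_set cur_affinity, cpu_set idle_cpus)
{
    return (cur_affinity & idle_cpus) != 0;
}

int main(void)
{
    /*
     * cur is pinned to pCPUs {0,1}; only pCPU 3 is idle. Migrating cur
     * would just be a spurious, BOOST-granting bounce, so don't.
     */
    cpu_set affinity = 0x3, idle = 0x8;
    printf("kick cur away: %s\n",
           should_kick_cur_away(affinity, idle) ? "yes" : "no");

    /* pCPU 1 goes idle: now there is somewhere for cur to run. */
    idle |= 0x2;
    printf("kick cur away: %s\n",
           should_kick_cur_away(affinity, idle) ? "yes" : "no");
    return 0;
}

Without such a check, the old code set _VPF_migrating unconditionally,
triggering the running --> offline --> runnable bounce described above
even when no suitable idler existed for cur.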