Since there have been no concerns or objections yet, here is my implementation of this
optimization. Thanks to Jan for recommending the use of cycle_cpu().
Jimmy
CSCHED: Optimize __runq_tickle to reduce IPIs
Limit the number of idle cpus tickled for vcpu migration to ONLY ONE, to
get rid of a lot of IPI events which may impact the average cpu idle
residency time.
The option is on by default; 'tickle_one_idle_cpu=0' can be used to
disable this optimization if needed.
Signed-off-by: Wei Gang <gang.wei@xxxxxxxxx>
diff -r dbb473bba30b xen/common/sched_credit.c
--- a/xen/common/sched_credit.c Fri Apr 02 10:45:39 2010 +0800
+++ b/xen/common/sched_credit.c Fri Apr 02 11:43:56 2010 +0800
@@ -228,6 +228,11 @@ static void burn_credits(struct csched_v
svc->start_time += (credits * MILLISECS(1)) / CSCHED_CREDITS_PER_MSEC;
}
+static int opt_tickle_one_idle __read_mostly = 1;
+boolean_param("tickle_one_idle_cpu", opt_tickle_one_idle);
+
+DEFINE_PER_CPU(unsigned int, last_tickle_cpu) = 0;
+
static inline void
__runq_tickle(unsigned int cpu, struct csched_vcpu *new)
{
@@ -265,8 +270,21 @@ __runq_tickle(unsigned int cpu, struct c
}
else
{
- CSCHED_STAT_CRANK(tickle_idlers_some);
- cpus_or(mask, mask, csched_priv.idlers);
+ cpumask_t idle_mask;
+
+ cpus_and(idle_mask, csched_priv.idlers, new->vcpu->cpu_affinity);
+ if ( !cpus_empty(idle_mask) )
+ {
+ CSCHED_STAT_CRANK(tickle_idlers_some);
+ if ( opt_tickle_one_idle )
+ {
+ this_cpu(last_tickle_cpu) =
+ cycle_cpu(this_cpu(last_tickle_cpu), idle_mask);
+ cpu_set(this_cpu(last_tickle_cpu), mask);
+ }
+ else
+ cpus_or(mask, mask, idle_mask);
+ }
cpus_and(mask, mask, new->vcpu->cpu_affinity);
}
}
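
For anyone curious how the cycle_cpu()-based pick behaves, below is a
minimal, self-contained userspace sketch of the round-robin selection.
The cycle_next() helper is a hypothetical stand-in that models
cycle_cpu() over a plain bitmask; it is not the hypervisor code.

/* Hypothetical userspace model of the round-robin idle-cpu pick the
 * patch performs with cycle_cpu(); plain bitmasks stand in for
 * cpumask_t. */
#include <stdio.h>

#define NR_CPUS 8

/* Return the next set bit in 'mask' strictly after 'cpu', wrapping
 * around to the lowest set bit; -1 if the mask is empty. This mimics
 * cycle_cpu(cpu, mask). */
static int cycle_next(int cpu, unsigned int mask)
{
    for ( int i = 1; i <= NR_CPUS; i++ )
    {
        int next = (cpu + i) % NR_CPUS;
        if ( mask & (1u << next) )
            return next;
    }
    return -1;
}

int main(void)
{
    unsigned int idle_mask = 0xA4;   /* idle cpus: 2, 5, 7 */
    int last_tickle_cpu = 0;         /* the per-cpu state in the patch */

    /* Successive wakeups tickle cpu 2, 5, 7, 2, 5, ... */
    for ( int i = 0; i < 5; i++ )
    {
        last_tickle_cpu = cycle_next(last_tickle_cpu, idle_mask);
        printf("tickle cpu %d\n", last_tickle_cpu);
    }
    return 0;
}

With idle cpus 2, 5 and 7 the successive picks rotate 2, 5, 7, 2, ...
instead of always hitting the lowest-numbered idle cpu, which is what a
first_cpu() choice would do.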
On Friday, 2010-3-19 4:22 PM, Wei, Gang wrote:
> I have found that, in the case of multiple idle VMs, there are a lot of
> break events coming from the IPIs used to raise SCHEDULE_SOFTIRQ to
> wake up idle cpus for load balancing -- csched_vcpu_wake
> ->__runq_tickle->cpumask_raise_softirq. In __runq_tickle(), if there
> are at least two runnable vcpus, it will try to tickle all idle cpus
> which have affinity with the waking vcpu so that they can pull this
> vcpu away.
>
> I am thinking about an optimization: limiting the number of idle cpus
> tickled for vcpu migration to ONLY ONE, to get rid of a lot of
> IPI events which may impact the average cpu idle residency time.
>
> There are two concerns about this optimization:
> 1. If the single target cpu fails to pull this vcpu (for example
> because it has just been scheduled to run another vcpu), the
> vcpu may stay on its original cpu for a long period, until it
> sleeps and wakes up again, leaving the system's cpus unbalanced.
> 2. If first_cpu() is used to choose the target among all
> eligible idle cpus, will it cause overall unbalanced cpu utilization,
> i.e. cpu 0 > cpu 1 > ... > cpu N?
>
> Do my concerns make sense? Any comments or suggestions are welcome.
>
> Jimmy
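
To put a rough number on the IPI reduction discussed above, here is a
small, self-contained sketch; the masks are made-up examples and plain
bitmasks stand in for cpumask_t. Per wakeup, the old path tickles every
idle cpu in the waking vcpu's affinity, while the patched path (with
tickle_one_idle on) tickles just one.

/* Hypothetical illustration of how many SCHEDULE_SOFTIRQ tickle IPIs a
 * single vcpu wakeup causes before and after the patch. */
#include <stdio.h>

/* Count the set bits in a mask (number of cpus that would be tickled). */
static int count_bits(unsigned int m)
{
    int n = 0;
    for ( ; m; m &= m - 1 )
        n++;
    return n;
}

int main(void)
{
    unsigned int idlers   = 0xFE;  /* cpus 1-7 idle */
    unsigned int affinity = 0xFF;  /* waking vcpu may run anywhere */
    unsigned int mask     = idlers & affinity;

    printf("old path: %d tickle IPIs per wakeup\n", count_bits(mask));
    printf("new path: %d tickle IPI per wakeup\n", mask ? 1 : 0);
    return 0;
}

With many idle cpus in the affinity mask the difference adds up quickly
over a busy wakeup rate, which is where the idle residency improvement
comes from.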