
Re: [Xen-devel] [PATCH] SEDF: avoid gathering vCPU-s on pCPU0


  • To: Jan Beulich <JBeulich@xxxxxxxx>
  • From: Juergen Gross <juergen.gross@xxxxxxxxxxxxxx>
  • Date: Mon, 04 Mar 2013 07:48:47 +0100
  • Cc: xen-devel <xen-devel@xxxxxxxxxxxxx>
  • Delivery-date: Mon, 04 Mar 2013 06:49:32 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xen.org>

On 01.03.2013 16:35, Jan Beulich wrote:
The introduction of vcpu_force_reschedule() in 14320:215b799fa181 was
incompatible with the SEDF scheduler: any vCPU using
VCPUOP_stop_periodic_timer (e.g. any vCPU of halfway modern PV Linux
guests) ends up on pCPU0 after that call, because the forced migration
goes through sedf_pick_cpu(), which always returned the first permitted
pCPU. Obviously, running all PV guests' (and namely Dom0's) vCPU-s on
pCPU0 causes problems for those guests sooner rather than later.

So the main thing that was clearly wrong (and bogus from the beginning)
was the use of cpumask_first() in sedf_pick_cpu(). It is replaced by a
construct that prefers to put the vCPU back on the pCPU it was launched
on.
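
For illustration, the replacement expression picks a pCPU from the online
affinity set round-robin by vCPU ID (modulo the set's weight). Below is a
minimal standalone sketch of just the arithmetic, with a plain 64-bit mask
standing in for cpumask_t; the helper names and GCC builtins are mine, not
Xen's:

#include <stdint.h>
#include <stdio.h>

/* Stand-in for cpumask_weight(): number of set bits. */
static int weight(uint64_t mask)
{
    return __builtin_popcountll(mask);
}

/* Stand-in for cpumask_cycle(): next set bit strictly after n,
 * wrapping around to the first set bit (assumes a non-empty mask). */
static int cycle(int n, uint64_t mask)
{
    int i;

    for ( i = n + 1; i < 64; i++ )
        if ( mask & (1ULL << i) )
            return i;
    return __builtin_ctzll(mask);   /* cpumask_first() equivalent */
}

int main(void)
{
    uint64_t online_affinity = 0x0f;   /* pCPUs {0,1,2,3} online */
    int vcpu_id;

    /* The patched sedf_pick_cpu() computation, per vCPU ID. */
    for ( vcpu_id = 0; vcpu_id < 8; vcpu_id++ )
        printf("vCPU%d -> pCPU%d\n", vcpu_id,
               cycle(vcpu_id % weight(online_affinity) - 1,
                     online_affinity));

    return 0;
}

With four online pCPUs this yields vCPU0 -> pCPU0, vCPU1 -> pCPU1, and so
on, wrapping at vCPU4. Since cpumask_cycle() takes a pCPU number rather
than an index into the set, the spread is uneven on sparse affinity masks,
but it still stops all vCPUs from landing on pCPU0.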

However, there's one more glitch: when temporarily reducing a vCPU's
affinity and then widening it again to a set that includes the pCPU the
vCPU last ran on, the generic scheduler code would not force a migration
of that vCPU, and hence it would stay on that pCPU forever. Since that
can again create a load imbalance, the SEDF scheduler wants a migration
to happen regardless of it appearing unnecessary.
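
A concrete way to run into this, assuming the standard xl vcpu-pin syntax
and a hypothetical domain ID 1:

    xl vcpu-pin 1 0 2     # temporarily restrict vCPU0 of dom1 to pCPU2
    xl vcpu-pin 1 0 all   # widen again; pCPU2 is still in the new mask,
                          # so no migration is forced and vCPU0 stays put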

Of course, an alternative to checking for SEDF explicitly in
vcpu_set_affinity() would be to introduce a flags field in struct
scheduler, and have SEDF set an "always migrate on affinity change"
flag.
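
That flag-based alternative might look roughly like the sketch below; the
flag name and its wiring are made up here for illustration, not taken from
an actual patch:

/* xen/include/xen/sched-if.h: a flags field instead of a new hook */
#define SCHED_FLAG_MIGRATE_ON_AFFINITY (1U << 0)

struct scheduler {
    ...
    unsigned int flags;   /* SCHED_FLAG_* */
};

/* xen/common/sched_sedf.c */
const struct scheduler sched_sedf_def = {
    ...
    .flags = SCHED_FLAG_MIGRATE_ON_AFFINITY,
};

/* xen/common/schedule.c, in vcpu_set_affinity() */
if ( !cpumask_test_cpu(v->processor, v->cpu_affinity) ||
     (VCPU2OP(v)->flags & SCHED_FLAG_MIGRATE_ON_AFFINITY) )
    set_bit(_VPF_migrating, &v->pause_flags);

This would keep schedule.c free of scheduler-specific checks, at the cost
of a flags field that, so far, only SEDF would set.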

Or something like this? I don't like the explicit test for SEDF in
schedule.c.


diff -r 65105a4a8c7a xen/common/sched_sedf.c
--- a/xen/common/sched_sedf.c   Fri Mar 01 16:59:49 2013 +0100
+++ b/xen/common/sched_sedf.c   Mon Mar 04 07:35:53 2013 +0100
@@ -397,7 +397,8 @@ static int sedf_pick_cpu(const struct sc

     online = cpupool_scheduler_cpumask(v->domain->cpupool);
     cpumask_and(&online_affinity, v->cpu_affinity, online);
-    return cpumask_first(&online_affinity);
+    return cpumask_cycle(v->vcpu_id % cpumask_weight(&online_affinity) - 1,
+                         &online_affinity);
 }

 /*
@@ -1503,6 +1504,11 @@ out:
     return rc;
 }

+static void sedf_set_affinity(const struct scheduler *ops, struct vcpu *v)
+{
+    set_bit(_VPF_migrating, &v->pause_flags);
+}
+
 static struct sedf_priv_info _sedf_priv;

 const struct scheduler sched_sedf_def = {
@@ -1532,6 +1538,7 @@ const struct scheduler sched_sedf_def =
     .sleep          = sedf_sleep,
     .wake           = sedf_wake,
     .adjust         = sedf_adjust,
+    .set_affinity   = sedf_set_affinity,
 };

 /*
diff -r 65105a4a8c7a xen/common/schedule.c
--- a/xen/common/schedule.c     Fri Mar 01 16:59:49 2013 +0100
+++ b/xen/common/schedule.c     Mon Mar 04 07:35:53 2013 +0100
@@ -615,6 +615,8 @@ int vcpu_set_affinity(struct vcpu *v, co
     cpumask_copy(v->cpu_affinity, affinity);
     if ( !cpumask_test_cpu(v->processor, v->cpu_affinity) )
         set_bit(_VPF_migrating, &v->pause_flags);
+    if ( VCPU2OP(v)->set_affinity )
+        SCHED_OP(VCPU2OP(v), set_affinity, v);

     vcpu_schedule_unlock_irq(v);

diff -r 65105a4a8c7a xen/include/xen/sched-if.h
--- a/xen/include/xen/sched-if.h        Fri Mar 01 16:59:49 2013 +0100
+++ b/xen/include/xen/sched-if.h        Mon Mar 04 07:35:53 2013 +0100
@@ -180,6 +180,7 @@ struct scheduler {
     int          (*pick_cpu)       (const struct scheduler *, struct vcpu *);
     void         (*migrate)        (const struct scheduler *, struct vcpu *,
                                     unsigned int);
+    void         (*set_affinity)   (const struct scheduler *, struct vcpu *);
     int          (*adjust)         (const struct scheduler *, struct domain *,
                                     struct xen_domctl_scheduler_op *);
     int          (*adjust_global)  (const struct scheduler *,



--
Juergen Gross                 Principal Developer Operating Systems
PBG PDG ES&S SWE OS6                   Telephone: +49 (0) 89 3222 2967
Fujitsu Technology Solutions              e-mail: juergen.gross@xxxxxxxxxxxxxx
Domagkstr. 28                           Internet: ts.fujitsu.com
D-80807 Muenchen                 Company details: ts.fujitsu.com/imprint.html

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel