
Re: [Xen-devel] [PATCH v3 4/4] sched: credit2: consider per-vcpu soft affinity



On Thu, Mar 26, 2015 at 9:48 AM, Justin T. Weaver <jtweaver@xxxxxxxxxx> wrote:
[snip]
> Added a function to determine the number of soft cpus gained (or lost) by a
> given vcpu if it is migrated from a given source run queue to a given
> destination run queue.
>
> Modified algorithm in balance_load and consider...
>  * if the load on lrqd and/or orqd is less than the number of their active
>    cpus, balance_load will look for vcpus that would have their soft affinity
>    improved by being pushed and/or pulled. Load does not need to be considered
>    since a run queue receiving a pushed or pulled vcpu is not being fully
>    utilized. This returns vcpus that may have been migrated away from their
>    soft affinity due to load balancing back to their soft affinity.
>  * in consider, vcpus that might be picked for migration because pushing or
>    pulling them decreases the load delta are not picked if their current run
>    queue's load is less than its active cpu count and if that migration would
>    harm their soft affinity. There's no need to push/pull if the load is under
>    capacity, and the vcpu would lose access to some or all of its soft cpus.
>  * in consider, if a push/pull/swap migration decreases the load delta by a
>    similar amount to another push/pull/swap migration, then use soft cpu gain
>    as a tie breaker. This allows load to continue to balance across run
>    queues, but favors soft affinity gains if the load deltas are close.

OK, I've done a lot of thinking about this over the last few days, and
here's what I've come up with.

As I said before, there are basically two different activities in this
patch.  The first is to try to honor soft affinity *given* where the
vcpu is placed; the other is to try to take soft affinity into account
when *placing* a vcpu.

The first bit is basically correct, and I've already given feedback on it.

The second half certainly represents a lot of hard work and I think
it's not a *bad* solution; but overall I'm not satisfied with it and I
think we can do better.

At a basic level, I don't like adding in the "if the load is less than
the number of pcpus then ignore load and only pull based on soft
affinity" shortcut to balance_load.  I think it's likely that
the two halves of balance_load will have very different ideas about
what "good" looks like, and that as a result vcpus will end up getting
bounced back and forth depending on which particular algorithm ends up
being activated.

Apart from that, the whole thing is rather ad-hoc and makes it much
more complicated to understand what is going on, or predict what would
happen.

My original plan wrt load balancing was to calculate some sort of
"badness" (or perhaps "expected overhead") to the current placement
situation, and try to minimize that.  The idea was that you could then
try to boil all the factors down to a single number to optimize.  For
example, you could add in "badness" for spinning up extra cores or
sockets (or "goodness" for having them entirely empty), and then
automatically balance performance vs power consumption; or, in this
case, to balance the cost of running with remote memory accesses (as
running outside your soft affinity would normally mean) vs the cost of
waiting for the cpu.

Of course, if you're only considering load, then no matter what you
do, minimizing "badness" means minimizing load, so there's no reason
to complicate things, which is why I wrote the load balancing code the
way I did.  But I've been thinking for the last few days and I think I
can probably structure it differently to make it easier to get a
consistent evaluation of the goodness / badness of a given setup.
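
To make that concrete, here's a purely illustrative sketch of the kind
of metric I have in mind.  Everything in it is made up for the sake of
the example (the helper name, the weights, the exact factors); it's
not a proposal for the actual code:

---
/* Illustrative sketch only: all names and weights here are invented. */
#define WAIT_COST          8  /* cost per vcpu waiting for a cpu */
#define REMOTE_ACCESS_COST 1  /* cost per vcpu outside its soft affinity */

static int placement_badness(const struct csched2_runqueue_data *rqd,
                             int vcpus_outside_soft_affinity)
{
    int badness = 0;
    int ncpus = cpumask_weight(&rqd->active);

    /* Load beyond the runqueue's cpu capacity means vcpus waiting. */
    if ( rqd->load > ncpus )
        badness += (rqd->load - ncpus) * WAIT_COST;

    /* Running outside soft affinity (i.e., with remote memory
     * accesses) costs less than waiting, but isn't free. */
    badness += vcpus_outside_soft_affinity * REMOTE_ACCESS_COST;

    return badness;
}
---

The balancer would then simply look for the arrangement that minimizes
the sum of that score across runqueues, instead of special-casing soft
affinity inside balance_load.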

So what about this:

1. You re-submit this series, with the comments I've given, and the
first half of this patch (i.e., leaving out the changes to choose_cpu
and balance_load)

2. I'll rework the load balancing code based on my ideas; i.e.,
calculating the "badness" of the load and deciding based on that.
I'll probably try to do the soft-affinity factoring myself as a proof
of concept.

What do you think?  (Feel free to weigh in here too, Dario.)

---

Also, a few comments on the code (just for feedback):

> @@ -1207,15 +1296,75 @@ typedef struct {
>      /* NB: Modified by consider() */
>      s_time_t load_delta;
>      struct csched2_vcpu * best_push_svc, *best_pull_svc;
> +    int soft_affinity_boost;
> +    bool_t valid_sa_boost;
>      /* NB: Read by consider() */
>      struct csched2_runqueue_data *lrqd;
>      struct csched2_runqueue_data *orqd;
>  } balance_state_t;
>
> +/*
> + * Return the number of pcpus gained in vc's soft affinity mask that vc can
> + * run on if vc is migrated from run queue src_rqd to run queue dst_rqd.
> + */
> +static int get_soft_affinity_gain(const struct vcpu *vc,
> +                                  const struct csched2_runqueue_data *src_rqd,
> +                                  const struct csched2_runqueue_data *dst_rqd)
> +{
> +    /*
> +     * Locks must already be held for src_rqd and dst_rqd.
> +     * Function assumes vc has at least hard affinity with one or more
> +     * pcpus in both the source and destination run queues.
> +     */
> +
> +    /* Does vcpu not have soft affinity? */
> +    if ( !__vcpu_has_soft_affinity(vc, vc->cpu_hard_affinity) )
> +        return 0;
> +
> +    /* Does vcpu have soft affinity with pcpu(s) in the destination runq? */
> +    sched_balance_cpumask(vc, SCHED_BALANCE_SOFT_AFFINITY, csched2_cpumask);
> +    if ( cpumask_intersects(csched2_cpumask, &dst_rqd->active) )
> +    {
> +        int soft_cpus_dst;
> +        cpumask_and(csched2_cpumask, csched2_cpumask, &dst_rqd->active);
> +        soft_cpus_dst = cpumask_weight(csched2_cpumask);
> +
> +        /* ... and soft affinity with the source run queue? */
> +        sched_balance_cpumask(vc, SCHED_BALANCE_SOFT_AFFINITY,
> +            csched2_cpumask);
> +        if ( cpumask_intersects(csched2_cpumask, &src_rqd->active) )
> +        {
> +            int soft_cpus_src;
> +            cpumask_and(csched2_cpumask, csched2_cpumask, &src_rqd->active);
> +            soft_cpus_src = cpumask_weight(csched2_cpumask);
> +
> +            /* Soft affinity to soft affinity migration. */
> +            return soft_cpus_dst - soft_cpus_src;
> +        }
> +        else
> +            /* Hard affinity to soft affinity migration. */
> +            return soft_cpus_dst;
> +    }
> +    else
> +    {
> +        int soft_cpus_src = 0;
> +        cpumask_and(csched2_cpumask, csched2_cpumask, &src_rqd->active);
> +        soft_cpus_src = cpumask_weight(csched2_cpumask);
> +
> +        /*
> +         * Hard affinity to hard affinity migration or soft affinity to hard
> +         * affinity migration.
> +         */
> +        return -soft_cpus_src;
> +    }

Is there a reason you can't just do the following?
---
int soft_cpus_dst=0, soft_cpus_src=0;

sched_balance_cpumask(...)

if(cpumask_intersects(cpumask, dst)) {
  cpumask_and(...)
  soft_cpus_dst = cpumask_weight(...)

  /* Restore soft affinity mask for next check */
  sched_balance_cpumask(...)
}

if(cpumask_intersects(cpumask, src)) {
  cpumask_and(...)
  soft_cpus_src = cpumask_weight(...)
}

return soft_cpus_dst - soft_cpus_src;
---

It seems like that would make it a lot easier to figure out what's going on.
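
Spelled out fully, reusing the calls and the csched2_cpumask scratch
mask from the quoted code above (and keeping your early exit for vcpus
with no soft affinity), that would be something like:

---
static int get_soft_affinity_gain(const struct vcpu *vc,
                                  const struct csched2_runqueue_data *src_rqd,
                                  const struct csched2_runqueue_data *dst_rqd)
{
    int soft_cpus_dst = 0, soft_cpus_src = 0;

    /* No soft affinity: migration can neither gain nor lose soft cpus. */
    if ( !__vcpu_has_soft_affinity(vc, vc->cpu_hard_affinity) )
        return 0;

    sched_balance_cpumask(vc, SCHED_BALANCE_SOFT_AFFINITY, csched2_cpumask);

    if ( cpumask_intersects(csched2_cpumask, &dst_rqd->active) )
    {
        cpumask_and(csched2_cpumask, csched2_cpumask, &dst_rqd->active);
        soft_cpus_dst = cpumask_weight(csched2_cpumask);

        /* Restore the full soft affinity mask for the source check. */
        sched_balance_cpumask(vc, SCHED_BALANCE_SOFT_AFFINITY,
                              csched2_cpumask);
    }

    if ( cpumask_intersects(csched2_cpumask, &src_rqd->active) )
    {
        cpumask_and(csched2_cpumask, csched2_cpumask, &src_rqd->active);
        soft_cpus_src = cpumask_weight(csched2_cpumask);
    }

    /* One expression covers all four hard/soft combinations. */
    return soft_cpus_dst - soft_cpus_src;
}
---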

> @@ -1237,12 +1386,88 @@ static void consider(balance_state_t *st,
>      if ( delta < 0 )
>          delta = -delta;
>
> -    if ( delta < st->load_delta )
> +    /*
> +     * Use soft affinity gain as a tie breaker if at least one migration has
> +     * already been picked and stored in the balance state, and the absolute
> +     * value of the difference between the delta in st and the new delta
> +     * being considered here is less than 1/16th of the load window.
> +     */
> +    delta_diff = delta - st->load_delta;
> +    if ( delta_diff < 0 )
> +        delta_diff = -delta_diff;
> +    if ( (st->best_push_svc != NULL || st->best_pull_svc != NULL)
> +        && delta_diff < 1<<(load_window_shift - CSCHED2_DIVIDE_BY_16) )
>      {
> -        st->load_delta = delta;
> -        st->best_push_svc=push_svc;
> -        st->best_pull_svc=pull_svc;
> +        int st_soft_gain = 0, consider_soft_gain = 0;
> +
> +        /* Find the soft affinity gain for the migration in st. */
> +        if ( !st->valid_sa_boost )
> +            if ( st->best_push_svc )
> +                st_soft_gain += get_soft_affinity_gain(
> +                    st->best_push_svc->vcpu, st->lrqd, st->orqd);
> +            if ( st->best_pull_svc )
> +                st_soft_gain += get_soft_affinity_gain(
> +                    st->best_pull_svc->vcpu, st->orqd, st->lrqd);
> +        else
> +            st_soft_gain = st->soft_affinity_boost;

Looks like you're missing some braces around those two indented if's.
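
Presumably what you meant was:

---
if ( !st->valid_sa_boost )
{
    if ( st->best_push_svc )
        st_soft_gain += get_soft_affinity_gain(
            st->best_push_svc->vcpu, st->lrqd, st->orqd);
    if ( st->best_pull_svc )
        st_soft_gain += get_soft_affinity_gain(
            st->best_pull_svc->vcpu, st->orqd, st->lrqd);
}
else
    st_soft_gain = st->soft_affinity_boost;
---

As written, the else actually binds to the inner
`if ( st->best_pull_svc )`, and that second if isn't under the
`!st->valid_sa_boost` check at all.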

> +    /* Only consider load delta. */
> +    if ( delta < st->load_delta )
> +        st->valid_sa_boost = 0;
> +
> +        /*
> +         * If the migration results in a loss of some or all soft cpus
> +         * and the vcpu's current run queue has less load than physical
> +         * processors, do not use the migration.
> +         */
> +        if ( push_svc &&
> +            (st->lrqd->load < cpumask_weight(&st->lrqd->active) &&
> +             get_soft_affinity_gain(push_svc->vcpu, st->lrqd, st->orqd) < 0) )
> +            return;
> +        if ( pull_svc &&
> +            (st->orqd->load < cpumask_weight(&st->orqd->active) &&
> +             get_soft_affinity_gain(pull_svc->vcpu, st->orqd, st->lrqd) < 0) )
> +            return;
> +    else
> +        return;

Missing braces again.
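
i.e., presumably:

---
/* Only consider load delta. */
if ( delta < st->load_delta )
{
    st->valid_sa_boost = 0;

    /*
     * If the migration results in a loss of some or all soft cpus and
     * the vcpu's current run queue has less load than physical
     * processors, do not use the migration.
     */
    if ( push_svc &&
         (st->lrqd->load < cpumask_weight(&st->lrqd->active) &&
          get_soft_affinity_gain(push_svc->vcpu, st->lrqd, st->orqd) < 0) )
        return;
    if ( pull_svc &&
         (st->orqd->load < cpumask_weight(&st->orqd->active) &&
          get_soft_affinity_gain(pull_svc->vcpu, st->orqd, st->lrqd) < 0) )
        return;
}
else
    return;
---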

 -George
