
Re: [Xen-devel] [PATCH 1/2] sched: credit2: respect per-vcpu hard affinity



So, being finally able to look at this again, I'm going directly to
reviewing v2, since it's available.

However, there was an open question on this email, which I figured I
should answer, so here it is. :-)

On Mon, 2015-01-19 at 21:21 -1000, Justin Weaver wrote:
> On Mon, Jan 12, 2015 at 8:05 AM, Dario Faggioli
> <dario.faggioli@xxxxxxxxxx> wrote:

> >>      /* First check to see if we're here because someone else suggested a place
> >> @@ -1081,13 +1099,17 @@ choose_cpu(const struct scheduler *ops, struct vcpu *vc)
> >>          else
> >>          {
> >>              d2printk("%pv +\n", svc->vcpu);
> >> -            new_cpu = cpumask_cycle(vc->processor, &svc->migrate_rqd->active);
> >> -            goto out_up;
> >> +            cpumask_and(&temp_mask, vc->cpu_hard_affinity,
> >> +                &svc->migrate_rqd->active);
> >> +            if ( !cpumask_empty(&temp_mask) )
> >> +            {
> >> +                new_cpu = cpumask_any(&temp_mask);
> >> +                goto out_up;
> >> +            }
> >> +            /* Fall-through to normal cpu pick */
> >>
> > So, if we've been asked to move to a cpu where we are not allowed to run
> > (i.e., in case temp_mask ends up empty), we just, silently, ignore the
> > request. This has the potential of screwing the work of the load
> > balancer. If we need to keep this as a sort of "safety catch", then
> > fine, but I'd really like to see a lot of efforts made in the load
> > balancing code to make this not trigger!
>
> I'm glad you brought this up because it's not clear to me what the
> relationship between migrate() and choose_cpu() is, if any. This block
> of code in choose_cpu only triggers if __CSFLAG_runq_migrate_request
> is set, and that flag is only set in migrate(), and only if
> __CSFLAG_scheduled is set. 
>
Yes, because, if the vCPU in question is running, we can't just move
it; all we can do is "file a migration request". To do so, in Credit2,
two flags are set (see the sketch below):
 - _VPF_migrating, in svc->vcpu->pause_flags
 - __CSFLAG_runq_migrate_request, in svc->flags
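
Roughly, the relevant bit of sched_credit2.c:migrate() looks something
like this (a simplified sketch from memory, not the literal code):

static void migrate(const struct scheduler *ops,
                    struct csched2_vcpu *svc,
                    struct csched2_runqueue_data *trqd,
                    s_time_t now)
{
    if ( test_bit(__CSFLAG_scheduled, &svc->flags) )
    {
        /* The vCPU is running: we can't just move it, so record
         * where we'd like it to go and file the request. */
        svc->migrate_rqd = trqd;
        set_bit(_VPF_migrating, &svc->vcpu->pause_flags);
        set_bit(__CSFLAG_runq_migrate_request, &svc->flags);
    }
    else
    {
        /* Not running: dequeue it and move it to trqd right
         * away (details omitted). */
    }
}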

> choose_cpu() is only called in response to
> the pick_cpu call in schedule.c in vcpu_migrate() which doesn't have
> anything to do with the load balancer balance_load() in credit 2. 
>
Well, the Credit2 load balancer (sched_credit2.c:balance_load()) calls
sched_credit2.c:migrate(), which, if the vCPU it wants to act on is
running, does as said above: it sets both _VPF_migrating and
__CSFLAG_runq_migrate_request.

Now, as soon as schedule.c:schedule() is invoked on the pCPU where that
vCPU is running, _VPF_migrating is checked (in context_saved(), called
by context_switch()) and, if it's set, schedule.c:vcpu_migrate() is
called.
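
I.e., leaving out memory barriers and other details, context_saved()
does something along these lines (again, just a sketch):

void context_saved(struct vcpu *prev)
{
    /* The context switch is complete: the vCPU is not running
     * anymore, so a pending migration can now be acted upon. */
    prev->is_running = 0;

    /* (the scheduler's own context_saved hook runs here) */

    /* If someone (e.g., Credit2's migrate()) filed a migration
     * request while the vCPU was running, honour it now. */
    if ( test_bit(_VPF_migrating, &prev->pause_flags) )
        vcpu_migrate(prev);
}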

That gets us back to the pick_cpu hook, and hence, in Credit2, to
choose_cpu(), and all because of something initiated by the Credit2 load
balancer... So choose_cpu() and balance_load() seem to have some
relationship, after all. :-D

In fact, if you look at the comment that documents the meaning of the
flag, it says:

/* CSFLAG_runq_migrate_request: This vcpu is being migrated as a result of a
 * credit2-initiated runq migrate request; migrate it to the runqueue indicated
 * in the svc struct.
 */

> It
> seems to me that the hard affinity check is needed at the end of this
> block in case affinity has changed after __CSFLAG_runq_migrate_request
> is set in migrate() and before a call to choose_cpu(). Please let me
> know what you think.
> 
migrate() is only called when all the proper locks are held, so there
is little scope for things changing under its feet (look at
schedule.c:vcpu_set_affinity()). Therefore, I don't think I see the
potential race you're talking about.

That being said, I agree that checking hard affinity is important here.
In fact, no matter whether it was the Credit2 load balancer or
something else that brought us here, we can't return to pick_cpu() a
pCPU which is not in the vCPU's hard affinity mask.
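
In other words, whatever branch of choose_cpu() computed new_cpu, the
result must satisfy something like the following (a hypothetical
check, just to illustrate the invariant; names as in the patch):

/* Whichever path we took, never hand back to pick_cpu's caller a
 * pCPU outside the vCPU's hard affinity. */
ASSERT(cpumask_test_cpu(new_cpu, vc->cpu_hard_affinity));
return new_cpu;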

For more/other comments, see the reply to v2 posting... :-)

Regards,
Dario
