
Re: [Xen-devel] [PATCH 04 of 11 v3] xen: sched_credit: let the scheduler know about node-affinity



Ok, here we are again! :-)

On mer, 2013-02-27 at 19:00 +0000, George Dunlap wrote:
> On Fri, Feb 1, 2013 at 11:01 AM, Dario Faggioli 

>         +
>         +/*
>         + * vcpu-affinity balancing is always necessary and must never be skipped.
>         + * OTOH, if a domain has affinity with all the nodes, we can tell the caller
>         + * that he can safely skip the node-affinity balancing step.
>         + */
>         +#define __vcpu_has_valuable_node_affinity(vc) \
>         +    ( !cpumask_full(CSCHED_DOM(vc->domain)->node_affinity_cpumask) )
> 
> 
> Something about the name of this one just doesn't strike me right.  I
> might be tempted just to go with "__vcpu_has_node_affinity", and let
> the comment above it explain what it means for the curious.
> 
Yes, I deeply hate it too, but I wasn't sure whether leaving the
"_valuable_" part out would have made things clear enough. Glad you
think it does: I'll drop it and, at the same time, improve the comment,
as you suggest.
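
Just to make sure we're on the same page, here's roughly what I have in
mind for v4 (only a sketch; the final comment wording may well differ):

  /*
   * vcpu-affinity balancing is always necessary and must never be skipped.
   * OTOH, if the domain's node-affinity spans all the nodes (i.e., its
   * node_affinity_cpumask is full), the node-affinity balancing step can't
   * restrict things any further, so the caller can safely skip it.
   */
  #define __vcpu_has_node_affinity(vc) \
      ( !cpumask_full(CSCHED_DOM((vc)->domain)->node_affinity_cpumask) )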

>         +
>         +static inline int csched_balance_step_skippable(int step, struct vcpu *vc)
>         +{
>         +    if ( step == CSCHED_BALANCE_NODE_AFFINITY
>         +         && !__vcpu_has_valuable_node_affinity(vc) )
>         +        return 1;
>         +    return 0;
>         +}
> 
> 
> I'm not a fan of this kind of encapsulation, but I'll let it be I
> guess.  You missed a place to use it however -- in csched_runq_steal()
> you have the if() statement.
> 
Well, that was indeed on purpose: in one of the places I'm skipping the
balancing step as a whole, while in csched_runq_steal() I'm only
skipping the various vCPUs, one after the other.

Anyway, the fact that you've seen it like this clearly demonstrates that
this attempt of mine at being, let's say, precise did not work... I'm
happy to kill the function above and put the if() there.
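
I.e., something like the below, open-coded right where the helper used
to be called (just a sketch, assuming the for_each_csched_balance_step()
macro and the surrounding code from earlier in the series):

  for_each_csched_balance_step( balance_step )
  {
      /*
       * The node-affinity step would give the very same result as the
       * vcpu-affinity one if the domain spans all the nodes, so skip it.
       */
      if ( balance_step == CSCHED_BALANCE_NODE_AFFINITY
           && !__vcpu_has_node_affinity(vc) )
          continue;

      /* ... the actual balancing logic for this step ... */
  }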


> Just an observation -- I think this will penalize systems that do not
> have node affinity enabled, in that if no one is using node
> affinities, then what will happen is the load balancing code will go
> through and check each vcpu on each processor, determine that there
> are no node affinities, and then go through again looking at vcpu
> affinities.  Going through twice for a single vcpu when doing
> placement is probably OK; but going all the way through all vcpus once
> could be pretty expensive.
> 
Well, yes, that is one of the downsides of scanning the runqueue in the
node-wise fashion we agreed upon during the review of one of the
previous rounds. Anyway...

> I think we should take this patch now (with the other minor changes
> mentioned above) so we can get it tested properly.  But we should
> probably try to do something to address this issue before the release
> -- maybe something like keeping a bitmask for each runqueue, saying
> whether any vcpus running on them have node affinity?  That way the
> first round we'll only check runqueues where we might conceivably
> steal something.
> 
...This sounds reasonable to me. I'm releasing v4 _without_ it, but with
a comment in the code. I'm already working on a solution along the lines
of what you suggest, and I'm confident I can post the outcome later as a
separate patch, once I've rerun all my usual benchmarks (which is
something I can't do right now), assuming the series is in by then (as I
hope! :-D).
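
Just to give an idea of the direction (all the names below are made up,
and I'm using a per-runq counter rather than a bitmask, which I think
achieves the same thing): keep track, per pcpu/runq, of how many vcpus
there have an actual (i.e., non-full) node-affinity, maintain that at
runq insertion/removal time, and have csched_runq_steal() bail out of
the node-affinity pass early when the counter is zero.

  /* Hypothetical sketch: field and hook names are illustrative only. */
  struct csched_pcpu {
      /* ... existing fields ... */
      unsigned int nr_node_aff_vcpus;  /* vcpus here with a "real" node-affinity */
  };

  /* On runq insertion (and, symmetrically, decrement on removal): */
  if ( __vcpu_has_node_affinity(svc->vcpu) )
      CSCHED_PCPU(cpu)->nr_node_aff_vcpus++;

  /* Then, in csched_runq_steal(), before scanning peer_cpu's runq
   * during the node-affinity step: */
  if ( balance_step == CSCHED_BALANCE_NODE_AFFINITY
       && CSCHED_PCPU(peer_cpu)->nr_node_aff_vcpus == 0 )
      return NULL;  /* nothing to gain from this pass on this runq */

Of course, that needs proper benchmarking, hence the separate patch.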

> Other than that, looks good -- thanks for the good work.
> 
Thanks to you for bearing with it (and with me :-)).
Dario


-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
