
Re: [Xen-devel] schedulers' use of cpumask_scratch



On Thu, 2016-12-08 at 07:33 -0700, Jan Beulich wrote:
> Dario,
> 
Hi,

> While it's already in use in credit1's __runq_tickle(), I then got
> puzzled by the different ways credit1 and credit2 use the variable:
> the former always uses cpumask_scratch_cpu() with the subject CPU as
> argument, while the latter always uses plain cpumask_scratch. What
> guarantees that these two uses never conflict, if both schedulers
> are active for some part of the system?
> 
So, what's important, when using the scratch mask of a CPU, is that
(see the sketch right after this list):
 - the CPU belongs to your cpupool and scheduler,
 - you hold the runqueue lock (the one you take via 
   {v,p}cpu_schedule_lock()) for that CPU.
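
To make that discipline concrete, here's a minimal sketch (not actual
Xen code, and use_scratch_of() is a made up helper), assuming the
cpumask_scratch definitions we have in xen/include/xen/sched-if.h:

  /* From sched-if.h:
   *   #define cpumask_scratch        (&this_cpu(cpumask_scratch))
   *   #define cpumask_scratch_cpu(c) (&per_cpu(cpumask_scratch, c))
   */
  static void use_scratch_of(struct vcpu *v)
  {
      /* Runqueue lock of v->processor: this is what serializes all
       * the users of that CPU's scratch mask. */
      spinlock_t *lock = vcpu_schedule_lock_irq(v);
      cpumask_t *mask = cpumask_scratch_cpu(v->processor);

      /* E.g., hard affinity restricted to the vcpu's cpupool. */
      cpumask_and(mask, v->cpu_hard_affinity,
                  cpupool_domain_cpumask(v->domain));

      /* ... consume mask here, while still holding the lock ... */

      vcpu_schedule_unlock_irq(lock, v);
  }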

So, in Credit1:

__runq_tickle(): cpu is what's in new's processor field. new comes
from v in vcpu_wake(); we take v's runqueue lock, but we may not be
running on v->processor, so we need cpumask_scratch_cpu(cpu).
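
Condensed in code, that situation looks like this (a hypothetical
sketch, with a made up tickle_sketch(); not the actual __runq_tickle()
body):

  static void tickle_sketch(unsigned int cpu /* == new's processor */)
  {
      cpumask_t *mask;

      /* We hold cpu's runqueue lock... */
      ASSERT(spin_is_locked(per_cpu(schedule_data, cpu).schedule_lock));

      /* ...but we may well be running on another CPU (the waker's),
       * so the mask must be keyed on cpu: */
      mask = cpumask_scratch_cpu(cpu);

      /* Plain cpumask_scratch would be smp_processor_id()'s mask,
       * which we do not own here. E.g.: */
      cpumask_andnot(mask, &cpu_online_map, cpumask_of(cpu));
  }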

csched_runq_steal(): cpu comes from cpu in csched_load_balance(), which
comes from cpu in csched_schedule(), which is smp_processor_id(). So,
in this case, I could have used cpumask_scratch (and probably should
make a patch to that effect).
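
I.e., on that path, the two forms name the very same mask; a trivial
sketch of the invariant (steal_path_invariant() is, of course, made
up):

  static void steal_path_invariant(unsigned int cpu)
  {
      /* cpu came all the way down from csched_schedule(): */
      ASSERT(cpu == smp_processor_id());
      /* &this_cpu(cpumask_scratch) == &per_cpu(cpumask_scratch, cpu) */
      ASSERT(cpumask_scratch == cpumask_scratch_cpu(cpu));
  }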

In Credit2:

get_fallback_cpu(): it's called by csched2_cpu_pick(), which is
reached from either vcpu_migrate() or csched2_vcpu_insert(), and in
both cases we can't be sure that we are running on, or hold the
runqueue lock of, smp_processor_id(). So, yes, this indeed must be
converted to cpumask_scratch_cpu(cpu), with cpu being one of the CPUs
for which we hold the runqueue lock (which, in the case of Credit2, is
per-runqueue, not per-CPU, but that does not make the current code any
less wrong, I think)!
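
Something like the following is what I have in mind (a sketch only,
with a made up fallback_cpu_sketch(); not the actual patch):

  /* Key the scratch mask on a CPU whose runqueue lock the caller
   * holds, i.e., v->processor: */
  static int fallback_cpu_sketch(struct vcpu *v)
  {
      unsigned int cpu = v->processor;

      cpumask_and(cpumask_scratch_cpu(cpu), v->cpu_hard_affinity,
                  cpupool_domain_cpumask(v->domain));
      /* was (wrong): cpumask_and(cpumask_scratch, ...), i.e., keyed
       * on smp_processor_id(), whose lock we may not hold. */

      return cpumask_first(cpumask_scratch_cpu(cpu));
  }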

csched2_cpu_pick(): should be fixed to use cpumask_scratch_cpu() as
well, for the same reasons listed above.

migrate(): when it comes from balance_load(), it's ok, because that
call comes from csched2_schedule(), so we hold the lock of
smp_processor_id()'s runqueue and can use this_cpu(), which is what
using cpumask_scratch directly amounts to. When it comes from
csched2_vcpu_migrate(), however, it's not, because vcpu_migrate() can
be called from other places. So, this needs fixing too. :-(
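
And the shape of the fix would be the same as above (again, just a
sketch, with migrate_sketch() made up):

  /* svc->vcpu->processor is a CPU whose runqueue lock we hold on
   * *both* call paths, while smp_processor_id() is only safe on the
   * balance_load() one: */
  static void migrate_sketch(struct csched2_vcpu *svc)
  {
      unsigned int cpu = svc->vcpu->processor;

      cpumask_and(cpumask_scratch_cpu(cpu),
                  svc->vcpu->cpu_hard_affinity,
                  cpupool_domain_cpumask(svc->vcpu->domain));
      /* ... pick the destination out of cpumask_scratch_cpu(cpu) ... */
  }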

csched2_schedule(): here, it's obviously ok to just use
cpumask_scratch.

So, indeed, it looks like we are in need of at least one patch. What
must have happened is that I changed the logic of how the scratch
space is used when implementing it in schedule.c, for all schedulers,
but did not update the Credit2 patches accordingly.

I'll send a fix ASAP (not sure if by today, as today I'm not working...
well :-P), and I guess that would be a 4.8.1 candidate. :-/

Thanks a lot for reporting this,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

