[PATCH v2 0/7] xen: credit2: limit the number of CPUs per runqueue


Here's v2 of this series... a bit late, but technically still in time
for code-freeze, although I understand this is quite tight! :-P

Anyway, in Credit2, the CPUs are assigned to runqueues according to the
host topology. For instance, if we want per-socket runqueues (which is
the default), the CPUs that are in the same socket will end up in the
same runqueue.

This is generally good for scalability, at least as long as the number
of CPUs that end up in the same runqueue is not too high. In fact, all
these CPUs will compete for the same spinlock, for making scheduling
decisions and manipulating the scheduler data structures. Therefore, if
there are too many of them, that can become a bottleneck.

This has not been an issue so far, but architectures with 128 CPUs per
socket are now available, and it is certainly not ideal to have so many
CPUs in the same runqueue, competing for the same locks, etc.

Let's therefore put a cap on the total number of CPUs that can share a
runqueue. The value is set to 16, by default, but can be changed with
a boot command line parameter.
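
Just to give an idea of the arithmetic, here is a minimal sketch of how
many runqueues a socket would need for a given cap. The helper name and
its handling of a zero cap are only assumptions made for this example,
they are not the code from the patches:

```c
/*
 * Hypothetical helper, for illustration only: minimum number of
 * runqueues needed so that none of them holds more than
 * 'max_cpus_runq' CPUs.
 */
static unsigned int nr_runqueues_for(unsigned int nr_cpus,
                                     unsigned int max_cpus_runq)
{
    if ( nr_cpus == 0 )
        return 0;

    /* Assumption of this sketch: a cap of 0 means "no cap". */
    if ( max_cpus_runq == 0 )
        return 1;

    /* Ceiling division: a partially filled runqueue still counts. */
    return (nr_cpus + max_cpus_runq - 1) / max_cpus_runq;
}
```

With the default cap of 16, a 128 CPUs socket would then be split in 8
runqueues, instead of having just one.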

Note also that, if the host has hyperthreading (or equivalent) and we
are not using core-scheduling, additional care is taken to avoid
splitting CPUs that are hyperthread siblings among different runqueues.
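
One way to picture the sibling handling is as a check that a runqueue
has room for all the threads of a core, before any of them is put
there. The following is only a sketch under that assumption, with a
made up function name. The actual logic in the patches may well differ:

```c
#include <stdbool.h>

/*
 * Hypothetical check, for illustration only: a runqueue that already
 * holds 'cpus_in_runq' CPUs accepts a core only if all of its
 * 'nr_threads' hyperthreads fit below the cap, so that siblings are
 * never split among different runqueues.
 */
static bool runq_can_take_core(unsigned int cpus_in_runq,
                               unsigned int nr_threads,
                               unsigned int max_cpus_runq)
{
    return cpus_in_runq + nr_threads <= max_cpus_runq;
}
```

E.g., with a cap of 16 and 2 threads per core, a runqueue with 15 CPUs
in it would refuse a new core, rather than taking just one of its
threads.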

In v2, in addition to trying to address the review comments, I've added
the logic for doing a full rebalance of the scheduler runqueues, while
the system is running. This is actually something that itself came up
during review of v1, when we realized that we did not only want a cap,
we also wanted some balancing, and if you want real balancing, you have
to be able to re-arrange the runqueue layout, dynamically.
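
To show why a cap alone is not enough, consider 20 CPUs and a cap of 16:
filling runqueues up to the cap gives 16 + 4, while a balanced layout
gives 10 + 10. Here is a minimal sketch of such an even split. The
function name and the exact policy are assumptions made for this
example, not what the patches implement:

```c
/*
 * Hypothetical helper, for illustration only: number of CPUs that
 * runqueue 'idx' gets when 'nr_cpus' CPUs are spread as evenly as
 * possible over the minimum number of runqueues respecting the cap.
 */
static unsigned int balanced_runq_size(unsigned int nr_cpus,
                                       unsigned int max_cpus_runq,
                                       unsigned int idx)
{
    unsigned int nr_rqs, base, extra;

    /* This sketch assumes a non-zero cap. */
    if ( nr_cpus == 0 || max_cpus_runq == 0 )
        return 0;

    /* Minimum number of runqueues that respects the cap. */
    nr_rqs = (nr_cpus + max_cpus_runq - 1) / max_cpus_runq;
    if ( idx >= nr_rqs )
        return 0;

    /* The first 'extra' runqueues get one CPU more than the others. */
    base = nr_cpus / nr_rqs;
    extra = nr_cpus % nr_rqs;

    return base + (idx < extra ? 1 : 0);
}
```

Since the number of CPUs changes when they are added to or removed from
a pool, the per-runqueue sizes change too, which is why the layout needs
to be re-arranged at runtime.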

It took a while because, although I had something that looked a lot
like the final solution implemented in this patch, I could not see how
to cleanly and effectively solve the issue of having the vCPUs in the
runqueues while trying to re-balance them. It was while talking with
Juergen that we figured out that we can actually pause the domains,
which I had not thought of... So, once again, Juergen, thanks! :-)

I have done most of the stress testing with core-scheduling disabled,
and it has survived anything I threw at it, but of course the more
testing the better, and I will be able to actually do more of it in
the coming days.

In any case, I have verified that at least a few core-scheduling
enabled configs also work.

There is a git branch here:
 git://xenbits.xen.org/people/dariof/xen.git  sched/credit2-max-cpus-runqueue-v2


While v1 is at the following link:

Thanks and Regards
Dario Faggioli (7):
      xen: credit2: factor cpu to runqueue matching in a function
      xen: credit2: factor runqueue initialization in its own function.
      xen: cpupool: add a back-pointer from a scheduler to its pool
      xen: credit2: limit the max number of CPUs in a runqueue
      xen: credit2: compute cpus per-runqueue more dynamically.
      cpupool: create an the 'cpupool sync' infrastructure
      xen: credit2: rebalance the number of CPUs in the scheduler runqueues

 docs/misc/xen-command-line.pandoc |   14 +
 xen/common/sched/cpupool.c        |   53 ++++
 xen/common/sched/credit2.c        |  437 ++++++++++++++++++++++++++++++++++---
 xen/common/sched/private.h        |    7 +
 xen/include/asm-arm/cpufeature.h  |    5 
 xen/include/asm-x86/processor.h   |    5 
 xen/include/xen/sched.h           |    1 
 7 files changed, 491 insertions(+), 31 deletions(-)

Dario Faggioli, Ph.D
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)



