Xen project Mailing List

[PATCH v2 0/7] xen: credit2: limit the number of CPUs per runqueue

From: Dario Faggioli <dfaggioli@xxxxxxxx>

Date: Thu, 28 May 2020 23:29:17 +0200

Cc: Juergen Gross <jgross@xxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>

Delivery-date: Thu, 28 May 2020 21:29:46 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Hello!

Here's v2 of this series... a bit late, but technically still in time for code-freeze, although I understand this is quite tight! :-P

Anyway, in Credit2, the CPUs are assigned to runqueues according to the host topology. For instance, if we want per-socket runqueues (which is the default), the CPUs that are in the same socket will end up in the same runqueue.

This is generally good for scalability, as long as the number of CPUs that end up in the same runqueue is not too high. In fact, all these CPUs will compete for the same spinlock when making scheduling decisions and manipulating the scheduler data structures. Therefore, if there are too many of them, that can become a bottleneck.

This has not been an issue so far, but architectures with 128 CPUs per socket are now available, and it is certainly not ideal to have so many CPUs in the same runqueue, competing for the same locks, etc.

Let's therefore set a cap on the total number of CPUs that can share a runqueue. The value is set to 16 by default, but can be changed with a boot command line parameter.

Note also that, if the host has hyperthreading (or equivalent) and we are not using core-scheduling, additional care is taken to avoid splitting CPUs that are hyperthread siblings among different runqueues.

In v2, in addition to trying to address the review comments, I've added the logic for doing a full rebalance of the scheduler runqueues while the system is running. This is actually something that itself came up during review of v1, when we realized that we did not only want a cap, we also wanted some balancing, and if you want real balancing, you have to be able to re-arrange the runqueue layout dynamically.

It took a while because, although I had something that looked a lot like the final solution implemented in this patch, I could not see how to cleanly and effectively solve the issue of having the vCPUs in the runqueues while trying to re-balance them.
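
As a rough illustration of the cap and balancing idea described above (this is not code from the series), the arithmetic could look like the sketch below. The names split_socket() and cpu_to_rq() are hypothetical, and the sketch assumes that every core has the same number of threads and that hyperthread siblings have adjacent CPU ids:

```c
/*
 * Illustrative sketch only, NOT the code from this series.
 * Assumptions: all cores have threads_per_core threads, and the
 * siblings of a core have consecutive CPU ids.
 */

/*
 * Hypothetical helper: split the nr_cpus CPUs of one socket into
 * runqueues of at most max_cpus CPUs each, without ever separating
 * the siblings of a core. Returns the number of runqueues and stores
 * the size of the fullest runqueue in *cpus_per_rq.
 */
static unsigned int split_socket(unsigned int nr_cpus,
                                 unsigned int threads_per_core,
                                 unsigned int max_cpus,
                                 unsigned int *cpus_per_rq)
{
    unsigned int cores, max_cores_per_rq, nr_rqs, cores_per_rq;

    *cpus_per_rq = 0;
    if ( threads_per_core == 0 || nr_cpus < threads_per_core )
        return 0;

    cores = nr_cpus / threads_per_core;

    /* The cap is rounded down to whole cores; a core is never split. */
    max_cores_per_rq = max_cpus / threads_per_core;
    if ( max_cores_per_rq == 0 )
        max_cores_per_rq = 1;

    /* Fewest runqueues that respect the cap (ceiling division)... */
    nr_rqs = (cores + max_cores_per_rq - 1) / max_cores_per_rq;
    /* ...then spread the cores evenly among them (ceiling division). */
    cores_per_rq = (cores + nr_rqs - 1) / nr_rqs;

    *cpus_per_rq = cores_per_rq * threads_per_core;
    return nr_rqs;
}

/*
 * Hypothetical helper: index of the runqueue a CPU lands in. Since
 * cpus_per_rq is a multiple of threads_per_core and siblings are
 * adjacent, siblings always get the same index.
 */
static unsigned int cpu_to_rq(unsigned int cpu, unsigned int cpus_per_rq)
{
    return cpus_per_rq ? cpu / cpus_per_rq : 0;
}
```

With 128 CPUs in a socket, 2 threads per core and a cap of 16, this yields 8 runqueues of 16 CPUs each. With 24 CPUs it yields 2 runqueues of 12 CPUs, rather than one of 16 and one of 8, which is the kind of balancing the v1 review asked for.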

It was while talking with Juergen that we figured out that we can actually pause the domains, which I had not thought of... So, once again, Juergen, thanks! :-)

I have done most of the stress testing with core-scheduling disabled, and it has survived anything I threw at it, but of course the more testing the better, and I will be able to actually do more of it in the coming days. In any case, I have also verified that at least a few core-scheduling enabled configs also work.

There are git branches here:

 git://xenbits.xen.org/people/dariof/xen.git sched/credit2-max-cpus-runqueue-v2
 http://xenbits.xen.org/gitweb/?p=people/dariof/xen.git;a=shortlog;h=refs/heads/sched/credit2-max-cpus-runqueue-v2
 https://github.com/dfaggioli/xen/tree/sched/credit2-max-cpus-runqueue-v2

While v1 is at the following link:

 https://lore.kernel.org/xen-devel/158818022727.24327.14309662489731832234.stgit@Palanthas/T/#m1e885a0f0a1feef83790ac179ab66512201cb770

Thanks and Regards
---
Dario Faggioli (7):
      xen: credit2: factor cpu to runqueue matching in a function
      xen: credit2: factor runqueue initialization in its own function.
      xen: cpupool: add a back-pointer from a scheduler to its pool
      xen: credit2: limit the max number of CPUs in a runqueue
      xen: credit2: compute cpus per-runqueue more dynamically.
      cpupool: create an the 'cpupool sync' infrastructure
      xen: credit2: rebalance the number of CPUs in the scheduler runqueues

 docs/misc/xen-command-line.pandoc |   14 +
 xen/common/sched/cpupool.c        |   53 ++++
 xen/common/sched/credit2.c        |  437 ++++++++++++++++++++++++++++++++++---
 xen/common/sched/private.h        |    7 +
 xen/include/asm-arm/cpufeature.h  |    5
 xen/include/asm-x86/processor.h   |    5
 xen/include/xen/sched.h           |    1
 7 files changed, 491 insertions(+), 31 deletions(-)

--
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>>
(Raistlin Majere)
