
Re: [PATCH] x86: make "dom0_nodes=" work with credit2


  • To: Jan Beulich <JBeulich@xxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Dario Faggioli <dfaggioli@xxxxxxxx>
  • Date: Fri, 8 Apr 2022 11:20:07 +0000
  • Cc: "ohering@xxxxxxx" <ohering@xxxxxxx>, "george.dunlap@xxxxxxxxxx" <george.dunlap@xxxxxxxxxx>
  • Delivery-date: Fri, 08 Apr 2022 11:20:20 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-topic: [PATCH] x86: make "dom0_nodes=" work with credit2

On Thu, 2022-04-07 at 15:27 +0200, Jan Beulich wrote:
> ---
> The Fixes: tag isn't very precise - it's rather the commit exposing
> the
> issue by default. I haven't been able to identify the actual commit
> which did introduce the problem; it may well be that it has always
> been
> there since the introduction of credit2.
> 
> Credit2 moving the vCPU-s off of their initially assigned ones right
> away of course renders sched_select_initial_cpu()'s use of
> cpu_cycle()
> pretty useless.
>
Mmm... do you mean that Credit2 is moving the vCPUs off of their
initially assigned ones _right now_, or that it will do so with this
patch?

If you mean the former, I'm not sure it is. In fact, when
sched_select_initial_cpu() is called for dom0, dom0's node affinity is
just all nodes, isn't it? So we can pick any CPU in the cpupool, and we
cycle through them to try to spread the vCPUs somewhat evenly.
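(For reference, this is, roughly, the logic I'm referring to. It's a
simplified sketch of sched_select_initial_cpu(), from memory, not a
verbatim quote, so don't hold me to the exact details:

    /* Intersect the domain's node affinity with the pool's CPUs... */
    cpumask_clear(cpus);
    for_each_node_mask ( node, d->node_affinity )
        cpumask_or(cpus, cpus, &node_to_cpumask(node));
    cpumask_and(cpus, cpus, d->cpupool->cpu_valid);
    if ( cpumask_empty(cpus) )
        cpumask_copy(cpus, d->cpupool->cpu_valid);

    /* ...and cycle from the previous vCPU's CPU, to spread them out. */
    cpu = v->vcpu_id ? cpumask_cycle(d->vcpu[v->vcpu_id - 1]->processor, cpus)
                     : cpumask_first(cpus);

For dom0, d->node_affinity is still all nodes at this point, so the
intersection is just the whole cpupool.)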

If you mean after this patch, well, sure, but that's normal. Again,
when we pick the initial CPU, we don't yet know that the vCPUs will
have an affinity. Or, actually, we do know that in
sched_setup_dom0_vcpus(), but there's no direct way to pass that
information to sched_init_vcpu() (and hence to
sched_select_initial_cpu()).

That's because, by default, affinity is just "all CPUs" when we create
the vCPUs, and we only change it later, if an affinity was actually
requested (either in the config file, or via the dom0_nodes=
parameter).
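IIRC, the relevant bit in sched_init_vcpu() looks more or less like
this (again a simplified sketch from memory, not a verbatim quote):

    /* Idle vCPUs, and dom0's if opt_dom0_vcpus_pin, get pinned... */
    if ( is_idle_domain(d) || (is_hardware_domain(d) && opt_dom0_vcpus_pin) )
        sched_set_affinity(unit, cpumask_of(processor), &cpumask_all);
    else
        /* ...everyone else starts with "all CPUs" for both affinities. */
        sched_set_affinity(unit, &cpumask_all, &cpumask_all);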

Something that *maybe* we can try, since we're handling dom0 vCPUs
specially anyway, is to directly set dom0's node affinity to the nodes
of the CPUs we have in dom0_cpus, before calling vcpu_create() (in
sched_setup_dom0_vcpus(), of course).

This should make sched_select_initial_cpu() pick one of the "correct"
CPUs in the first place. But I don't know whether it's worth it, nor
whether we'd still need this patch anyway (I have to check more
thoroughly).
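Something like this is what I have in mind (just a rough and untested
sketch of the idea, to be done in sched_setup_dom0_vcpus() before the
vCPU creation loop, which stays as it is):

    nodemask_t nodes = NODE_MASK_NONE;
    unsigned int cpu;

    /* Turn dom0_cpus into the set of nodes those CPUs belong to... */
    for_each_cpu ( cpu, &dom0_cpus )
        node_set(cpu_to_node(cpu), nodes);

    /* ...and make that dom0's node affinity before any vCPU exists. */
    domain_set_node_affinity(d, &nodes);

    for ( i = 0; i < d->max_vcpus; i++ )
        vcpu_create(d, i);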

>  But I guess that's still useful for other schedulers.
> I wonder though whether sched_init_vcpu() shouldn't use the CPU map
> calculated by sched_select_initial_cpu() for its call to
> sched_set_affinity() in the non-pinned case, rather than setting
> "all".
>
If we do that, and there's no affinity configured for the guest, nor
any "dom0_nodes=", when would we reset the affinity to all, which is
what it should be in such a case?

Also, if I'm right in my reasoning above, when we come from
sched_setup_dom0_vcpus(), the mask calculated by
sched_select_initial_cpu() is basically cpupool0's cpus_valid, so this
wouldn't really change anything for the problem we're trying to solve
here.

> (I guess doing so might mask the issue at hand, but I think the
> change
> here would still be applicable at least from an abstract pov.)
> 
I don't think it would mask it, but I do think that, yes, the change
you're making would still be applicable.

And, about it, one thing...

> --- a/xen/common/sched/core.c
> +++ b/xen/common/sched/core.c
> @@ -3403,9 +3403,15 @@ void __init sched_setup_dom0_vcpus(struc
>      {
>          for_each_sched_unit ( d, unit )
>          {
> +            spinlock_t *lock = unit_schedule_lock_irq(unit);
> +
>              if ( !opt_dom0_vcpus_pin && !dom0_affinity_relaxed )
>                  sched_set_affinity(unit, &dom0_cpus, NULL);
>              sched_set_affinity(unit, NULL, &dom0_cpus);
> +
> +            sched_unit_migrate_start(unit);
> +            unit_schedule_unlock_irq(lock, unit);
> +            sched_unit_migrate_finish(unit);
>          }
>      }
>  
... The result of this (also considering the call to
domain_update_node_affinity()) ends up looking very similar to what
we'd get if, instead of calling sched_set_affinity(), we called
vcpu_set_affinity().
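I.e., if I remember the structure of vcpu_set_affinity() correctly,
what it does per vCPU is roughly:

    lock = unit_schedule_lock_irq(unit);

    sched_set_affinity(unit, ...);            /* hard or soft, as requested */
    sched_unit_migrate_start(unit);

    unit_schedule_unlock_irq(lock, unit);

    domain_update_node_affinity(v->domain);   /* <-- once per vCPU */

    sched_unit_migrate_finish(unit);

which is pretty much what the hunk above does, modulo where
domain_update_node_affinity() is called.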

I'm therefore wondering if we should try to just do that... But I'm not
sure, mostly because that would mean calling
domain_update_node_affinity() for each of dom0's vCPUs, which is
clearly less efficient than just calling it once at the end.

So I'm thinking that we can indeed do it like this, and add a comment.

Does anyone else have any thoughts?

Thanks and Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)
