This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


Re: [Xen-devel] RFC: automatic NUMA placement

To: Andre Przywara <andre.przywara@xxxxxxx>
Subject: Re: [Xen-devel] RFC: automatic NUMA placement
From: Juergen Gross <juergen.gross@xxxxxxxxxxxxxx>
Date: Tue, 28 Sep 2010 06:48:45 +0200
Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Mon, 27 Sep 2010 21:49:29 -0700
In-reply-to: <4CA110D3.5050000@xxxxxxx>
Organization: Fujitsu Technology Solutions
References: <4C921DDF.6020809@xxxxxxxxxxxxxx> <4CA110D3.5050000@xxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv: Gecko/20100913 Iceowl/1.0b1 Icedove/3.0.7
Hi Andre,

thanks for your thoughts.

On 09/27/10 23:46, Andre Przywara wrote:
> Juergen Gross wrote:
>
>> I just stumbled upon the automatic pinning of vcpus on domain creation in
>> case of NUMA.
>> This behaviour is questionable IMO, as it breaks correct handling of
>> scheduling weights on NUMA machines.
>> I would suggest switching this feature off by default and making it a
>> configuration option of xend. It would make sense, however, to change
>> cpu pool processor allocation to be NUMA-aware.
>> Switching NUMA off via boot option would remove NUMA-optimized memory
>> allocation, which would be sub-optimal :-)
>
> Hi Jürgen,
>
> stumbled over your mail just now, so sorry for the delay.
> First: Don't turn off automatic NUMA placement ;-)
> In my tests it helped a lot to preserve performance on NUMA machines.
>
> I was just browsing through the ML archive to find your original CPU
> pools description from April, and it seems to fit the requirements of
> NUMA machines quite well.
> I haven't done any experiments with Cpupools, nor have I looked at the
> code yet, but just a quick idea:
> What if we marry static NUMA placement and Cpupools?
>
> I'd suggest introducing static NUMA pools, one for each node. The CPUs
> assigned to each pool are fixed; none can be removed or added (because
> the NUMA topology is fixed).
> Is that possible? Can we assign one physical CPU to multiple pools (to
> Pool-0 and to NUMA-0)? Or are they exclusive or hierarchical like
> Linux cpusets?

A cpu is always a member of exactly one pool.

> We could introduce magic names for each NUMA pool, so that people just
> say cpupool="NUMA-2" and get their domain pinned to that pool. Without
> any explicit assignment the system would pick a NUMA node (like it does
> today) and would just use the respective Cpupool. I think that is very
> similar to what it does today, only that the pinning nature is more
> evident to the user (as it uses the Cpupool name space).
> It would also allow users to override the pinning by specifying a
> different Cpupool explicitly (like Pool-0).
>
> Just tell me what you think about this and whether I am wrong with my
> thinking ;-)

With your proposal it isn't possible to start a domU with more vcpus than
cpus in a node without changing cpu pools.

I would suggest doing it the following way:
- use automatic NUMA placement only in Pool-0 (this won't change anything
  for users not using cpu pools), perhaps with an option to switch it off
- change the cpu allocation for pools to be NUMA-aware
- optionally add an xl and/or xm command to create one cpu pool per NUMA node
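
To make the second and third points concrete, here is a rough Python sketch
of what NUMA-aware cpu allocation for a pool could look like. It is only an
illustration, not existing xend or libxl code: the function names, the
dict-based topology and the "smallest fitting node first" policy are all
assumptions, and the "NUMA-<n>" pool names are borrowed from the proposal
quoted above.

```python
from typing import Dict, List, Set


def allocate_pool_cpus(free: Dict[int, Set[int]], count: int) -> List[int]:
    """Pick `count` free cpus while spanning as few NUMA nodes as possible.

    `free` maps a node id to the set of cpu ids that are still unassigned
    (hypothetical representation; it is not modified by this function).
    """
    if count > sum(len(cpus) for cpus in free.values()):
        raise ValueError("not enough free cpus")

    # Prefer the smallest single node that can hold the whole request,
    # so that larger nodes stay available for larger pools.
    fitting = [node for node, cpus in free.items() if len(cpus) >= count]
    if fitting:
        node = min(fitting, key=lambda n: (len(free[n]), n))
        return sorted(free[node])[:count]

    # Otherwise fill up from the nodes with the most free cpus first.
    picked: List[int] = []
    for node in sorted(free, key=lambda n: (-len(free[n]), n)):
        need = count - len(picked)
        if need == 0:
            break
        picked.extend(sorted(free[node])[:need])
    return picked


def pools_per_node(topology: Dict[int, Set[int]]) -> Dict[str, List[int]]:
    """Build one pool description per NUMA node, named "NUMA-<node>"."""
    return {f"NUMA-{node}": sorted(cpus)
            for node, cpus in sorted(topology.items())}


if __name__ == "__main__":
    # Assumed example machine: 2 nodes with 4 cpus each.
    topology = {0: {0, 1, 2, 3}, 1: {4, 5, 6, 7}}
    print(allocate_pool_cpus(topology, 2))
    print(allocate_pool_cpus(topology, 6))
    print(pools_per_node(topology))
```

A request that fits into one node stays inside that node; only a larger
request spills over to a second node, which is the case the one-pool-per-node
scheme cannot cover without changing the pools.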


Juergen Gross                 Principal Developer Operating Systems
TSP ES&S SWE OS6                       Telephone: +49 (0) 89 3222 2967
Fujitsu Technology Solutions              e-mail: juergen.gross@xxxxxxxxxxxxxx
Domagkstr. 28                           Internet: ts.fujitsu.com
D-80807 Muenchen                 Company details: ts.fujitsu.com/imprint.html
