To: Dulloor <dulloor@xxxxxxxxx>, Ian Pratt <Ian.Pratt@xxxxxxxxxxxxx>
Subject: RE: [Xen-devel] [RFC] pv guest numa [RE: Host Numa informtion in dom0]
From: Dan Magenheimer <dan.magenheimer@xxxxxxxxxx>
Date: Mon, 15 Feb 2010 17:15:53 -0800 (PST)
Cc: Andre Przywara <andre.przywara@xxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxx, Keir Fraser <keir.fraser@xxxxxxxxxxxxx>, "Nakajima, Jun" <jun.nakajima@xxxxxxxxx>, tmem-devel@xxxxxxxxxxxxxx
Delivery-date: Mon, 15 Feb 2010 17:17:11 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <940bcfd21002122225i2c79aad2q91ff6432faef3192@xxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <940bcfd21002122225i2c79aad2q91ff6432faef3192@xxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Hi Dulloor --

> I am in the process of making other places of dynamic memory
> mgmt/operations numa-aware - tmem, memory exchange operations, etc.

I'd be interested in your thoughts on numa-aware tmem
as well as the other dynamic memory mechanisms in Xen 4.0.

Tmem is special in that it primarily uses full-page copies
from tmem-space to guest-space and back so, assuming the
interconnect can pipeline/stream a memcpy, the overhead of
off-node vs. on-node memory should be less noticeable.
However, tmem uses large data structures (rbtrees and
radix-trees), and the lookup process might benefit from
being NUMA-aware.
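
As a rough illustration of what NUMA-awareness could mean on the
allocation side, here is a minimal sketch in C of a node-preferring
page allocation for tmem puts.  All of the helper names below are
hypothetical placeholders, not actual Xen symbols:

    struct page_info;

    /* Hypothetical helpers -- not actual Xen symbols. */
    struct page_info *alloc_page_on_node(unsigned int node);
    struct page_info *alloc_page_any_node(void);
    unsigned int current_vcpu_node(void);

    /*
     * Prefer backing a tmem put with a page from the node the
     * requesting vcpu runs on, so the full-page memcpy (and a later
     * get) stays node-local; fall back to any node rather than
     * failing the put.
     */
    static struct page_info *tmem_alloc_page_numa(void)
    {
        struct page_info *pg = alloc_page_on_node(current_vcpu_node());
        return pg ? pg : alloc_page_any_node();
    }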

Also, I will be looking into adding some page-sharing
techniques into tmem in the near future.  This (and the
existing page sharing feature just added to 4.0) may
create some other interesting challenges for NUMA-awareness.
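
For instance (purely illustrative, not an actual Xen interface), if
the sharing key includes the home node of a page, deduplication can
never silently turn a guest's local page into a remote one:

    /* Illustrative only -- not an actual Xen interface. */
    struct sharing_key {
        unsigned int  node;          /* home node of the candidate page */
        unsigned long content_hash;  /* hash of the page contents       */
    };

    /*
     * Keying the sharing lookup on (node, content hash) means
     * identical pages on different nodes are never merged, so no
     * guest is transparently redirected to remote memory by sharing.
     */
    static int pages_may_share(const struct sharing_key *a,
                               const struct sharing_key *b)
    {
        return a->node == b->node && a->content_hash == b->content_hash;
    }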

Dan

> -----Original Message-----
> From: Dulloor [mailto:dulloor@xxxxxxxxx]
> Sent: Friday, February 12, 2010 11:25 PM
> To: Ian Pratt
> Cc: Andre Przywara; xen-devel@xxxxxxxxxxxxxxxxxxx; Nakajima, Jun; Keir
> Fraser
> Subject: [Xen-devel] [RFC] pv guest numa [RE: Host Numa informtion in
> dom0]
> 
> I am attaching (RFC) patches for NUMA-aware pv guests.
> 
> * The patch adds hypervisor interfaces to export minimal numa-related
> information about the memory of a pv domain, which can then be used
> to set up the node ranges, virtual cpu<->node maps, and virtual SLIT
> tables in the pv domain.
> * The guest domain also maintains a mapping between its vnodes and
> mnodes (actual machine nodes). These mappings can be used in memory
> operations, such as those in ballooning.
> * In the patch, dom0 is made numa-aware using these interfaces. Other
> domains should be simpler. I am in the process of adding python
> interfaces for this. And this would work with any node-selection
> policy.
> * The patch is tested only for 64-on-64 (on x86_64)
> 
> * Along with the following other patches, this could provide a good
> solution for numa-aware guests -
> - numa-aware ballooning  (previously posted by me on xen-devel)
> - Andre's patch for HVM domains (posted by Andre recently)
> 
> I am in the process of making other places of dynamic memory
> mgmt/operations numa-aware - tmem, memory exchange operations, etc.
> 
> Please let me know your comments.
> 
> -dulloor
> 
> On Thu, Feb 11, 2010 at 10:21 AM, Ian Pratt <Ian.Pratt@xxxxxxxxxxxxx>
> wrote:
> >> > If guest NUMA is disabled, we just use a single node mask which is
> >> > the union of the per-VCPU node masks.
> >> >
> >> > Where allowed node masks span more than one physical node, we should
> >> > allocate memory to the guest's virtual node by pseudo randomly
> >> > striping memory allocations (in 2MB chunks) from across the specified
> >> > physical nodes. [pseudo random is probably better than round robin]
> >>
> >> Do we really want to support this? I don't think the allowed node
> >> masks should span more than one physical NUMA node. We also need to
> >> look at I/O devices as well.
> >
> > Given that we definitely need this striping code in the case where
> > the guest is non-NUMA, I'd be inclined to still allow it to be used
> > even if the guest has multiple NUMA nodes. It could come in handy
> > where there is a hierarchy between physical NUMA nodes, enabling for
> > example striping to be used between a pair of 'close' nodes, while
> > the higher-level topology of the sets of paired nodes is still
> > exposed to the guest.
> >
> > Ian
> >
> >
> >
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@xxxxxxxxxxxxxxxxxxx
> > http://lists.xensource.com/xen-devel
> >
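
To make the striping scheme discussed in the quoted thread concrete,
here is a minimal sketch of pseudo-randomly spreading 2MB chunk
allocations across an allowed set of physical nodes.  The helpers and
the xorshift generator are assumptions for illustration, not code from
the posted patches:

    struct domain;
    struct page_info;

    /* Hypothetical helpers -- not symbols from the posted patches. */
    struct page_info *alloc_2mb_chunk_on_node(unsigned int node);
    void assign_chunk_to_guest(struct domain *d, struct page_info *chunk);

    /* Small xorshift PRNG; any cheap generator would do here. */
    static unsigned int stripe_rand(unsigned int *state)
    {
        *state ^= *state << 13;
        *state ^= *state >> 17;
        *state ^= *state << 5;
        return *state;
    }

    /*
     * Back one of a guest's virtual nodes with 2MB chunks taken
     * pseudo-randomly from the allowed set of physical nodes.
     */
    static int stripe_guest_memory(struct domain *d,
                                   const unsigned int *allowed_nodes,
                                   unsigned int nr_allowed,
                                   unsigned long nr_chunks)
    {
        unsigned int seed = 0x9e3779b9;
        unsigned long i;

        for ( i = 0; i < nr_chunks; i++ )
        {
            unsigned int node =
                allowed_nodes[stripe_rand(&seed) % nr_allowed];
            struct page_info *chunk = alloc_2mb_chunk_on_node(node);

            if ( chunk == NULL )
                return -1;   /* caller may retry or fall back to any node */
            assign_chunk_to_guest(d, chunk);
        }
        return 0;
    }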

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel