
Re: [Xen-devel] [PATCH v6 02/23] xen: move NUMA_NO_NODE to public memory.h as XEN_NUMA_NO_NODE



On Mon, 2015-03-02 at 18:19 +0000, Andrew Cooper wrote:
> On 02/03/15 17:52, Ian Campbell wrote:
> >> On 02/03/15 17:43, Andrew Cooper wrote:
> >>> On 02/03/15 17:34, David Vrabel wrote:
> >>>> A guest that previously had 2 vNUMA nodes is migrated to a host with
> >>>> only 1 pNUMA node.  It should still have 2 vNUMA nodes.
> >>> A natural consequence of vNUMA is that the guest must expect the vNUMA
> >>> layout to change across suspend/resume.  The toolstack cannot guarantee
> >>> that it can construct a similar vNUMA layout after a migration.  This
> >>> includes the toolstack indicating that it was unable to make any useful
> >>> NUMA affinity with the memory ranges.

> > In the case you mention above I would expect the 2 vnuma nodes to just
> > point to the same single pnuma node.
> >
Right, that's still doable. But what if we have 4 pnodes and none of them
can, alone, accommodate the memory of even one of the guest's vnodes?
(E.g., 1GB free on each pnode, and a guest with 2 vnodes of 1.2GB each;
see the sketch below.)
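
Just to make the numbers concrete, here is a trivial standalone sketch
(values and names are made up for this example, it is not toolstack code):
total free memory is plenty (4GB for a 2.4GB guest), and yet no single
pnode can back a whole vnode, so no clean 1:1 vnode-to-pnode mapping
exists.

#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define NR_PNODES 4
#define NR_VNODES 2

int main(void)
{
    /* Made-up numbers from the example above: 1GB free on each pnode... */
    uint64_t pnode_free_mb[NR_PNODES] = { 1024, 1024, 1024, 1024 };
    /* ...and a guest with 2 vnodes of ~1.2GB each. */
    uint64_t vnode_size_mb[NR_VNODES] = { 1228, 1228 };

    for (unsigned int v = 0; v < NR_VNODES; v++) {
        bool fits = false;

        for (unsigned int p = 0; p < NR_PNODES; p++)
            if (pnode_free_mb[p] >= vnode_size_mb[v])
                fits = true;

        if (!fits)
            printf("vnode %u (%" PRIu64 " MB) does not fit on any single pnode\n",
                   v, vnode_size_mb[v]);
    }

    return 0;
}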

The point being that it would be quite complicated to have to deal with
all the possible variations of such situations in the toolstack,
especially considering that Xen already handles them smoothly (at the
cost of performance, of course).

> > As such I think it's probably not relevant to the need for
> > XEN_NUMA_NO_NODE?
> >
> > Or is that not what would be expected?
> 
> If we were to go down that route, the toolstack would need a way of
> signalling "this vNUMA node does not contain memory on a single pNUMA
> node" if there was insufficient free space to make the allocation.
> 
Exactly.

BTW, about the use cases: wanting to test vNUMA without NUMA hardware,
as Jan said. Also, wanting to test NUMA support in the guest OS, or in
an application inside the guest, without having NUMA hardware.

But much more important are the situations that Andrew and David
described.

> In this case, a pnode of XEN_NUMA_NO_NODE seems like precisely the
> correct value to report.
> 
Indeed. It tells Xen: <<hey Xen, toolstack here: we either don't care or
could not come up with any sane vnode-to-pnode mapping, please figure
that out yourself>>.
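
Just to illustrate what I mean (purely hypothetical names here, not the
actual libxl or hypercall structures, and the sentinel value is only an
assumption; the real XEN_NUMA_NO_NODE is the one introduced in the public
memory.h by this series):

#include <stdint.h>

#ifndef XEN_NUMA_NO_NODE
#define XEN_NUMA_NO_NODE (~0U)   /* assumed "no node" sentinel, for this sketch only */
#endif

struct vnode_cfg {               /* hypothetical toolstack-side struct */
    uint64_t size_mb;            /* memory of this vnode */
    uint32_t pnode;              /* backing pnode, or XEN_NUMA_NO_NODE */
};

/* Toolstack could not (or did not want to) compute a sane mapping:
 * mark every vnode as "no pnode" and let Xen figure it out (striping). */
static void leave_placement_to_xen(struct vnode_cfg *vnodes, unsigned int nr)
{
    for (unsigned int i = 0; i < nr; i++)
        vnodes[i].pnode = XEN_NUMA_NO_NODE;
}

int main(void)
{
    struct vnode_cfg vnodes[2] = { { 1228, 0 }, { 1228, 0 } };

    leave_placement_to_xen(vnodes, 2);
    return 0;
}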

That makes the code, IMO, simpler at every level. In fact, at the Xen
level, there is already a default way of dealing with the situation (the
striping). At the toolstack level, we only need to care about trying to
come up with a good (for performance) configuration, and just give up if
anything like what David and Andrew described occurs.

It's exactly what we're doing right now, BTW, without vNUMA: we try to
place a domain on one NUMA node (or on the smallest possible number of
nodes) but, if we fail, we inform the user that performance may suffer
(with a WARNING), and let Xen do what it wants with the guest's memory.
Something along the lines of the sketch below.
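
Rough sketch of that existing flow (the function names are invented for
illustration and do not correspond to the real libxl API; the stub just
pretends placement failed, as in the scenarios discussed above):

#include <stdbool.h>
#include <stdio.h>

/* Stand-in for the real best-effort placement logic: pretend it failed. */
static bool try_numa_placement(unsigned int domid)
{
    (void)domid;
    return false;
}

static void build_domain_memory(unsigned int domid)
{
    if (!try_numa_placement(domid)) {
        /* Same spirit as the existing WARNING: performance may suffer,
         * but we go ahead and let Xen allocate wherever it likes. */
        fprintf(stderr,
                "WARNING: domain %u could not be confined to a (small set of) "
                "NUMA node(s); performance may suffer\n", domid);
    }

    /* Memory allocation proceeds either way; Xen handles the rest. */
}

int main(void)
{
    build_domain_memory(1);
    return 0;
}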

Regards,
Dario



 

