This patchset adds basic NUMA support to Xen (hypervisor only). We
borrowed from Linux the support for NUMA SRAT table parsing, discontiguous
memory tracking (mem chunks), and cpu support (node_to_cpumask etc.).
The hypervisor parses the SRAT tables and constructs per-node mappings,
such as node-to-cpu mappings and memory-range-to-node mappings.
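
As a rough illustration of the two kinds of mappings described above (this
is not the code from the patches), the sketch below records cpu and memory
affinity entries and answers "which node owns this address or cpu?". Apart
from node_to_cpumask, every name, limit and layout here is a hypothetical
stand-in chosen for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_NODES  8   /* illustrative limit, not Xen's */
#define MAX_CHUNKS 16  /* illustrative limit, not Xen's */

/* One contiguous physical memory range owned by a node (a "mem chunk"). */
struct mem_chunk {
    uint64_t start;  /* first byte of the range */
    uint64_t end;    /* one past the last byte of the range */
    int      nid;    /* owning node */
};

struct numa_map {
    struct mem_chunk chunks[MAX_CHUNKS];
    size_t           nr_chunks;
    uint64_t         node_to_cpumask[MAX_NODES]; /* bit n set => cpu n on node */
};

/* Record a memory affinity entry; returns 0 on success, -1 if invalid or full. */
int numa_add_chunk(struct numa_map *m, uint64_t start, uint64_t end, int nid)
{
    if (m->nr_chunks >= MAX_CHUNKS || nid < 0 || nid >= MAX_NODES || start >= end)
        return -1;
    m->chunks[m->nr_chunks].start = start;
    m->chunks[m->nr_chunks].end = end;
    m->chunks[m->nr_chunks].nid = nid;
    m->nr_chunks++;
    return 0;
}

/* Record a processor affinity entry; this sketch supports cpus 0..63 only. */
int numa_add_cpu(struct numa_map *m, unsigned int cpu, int nid)
{
    if (cpu >= 64 || nid < 0 || nid >= MAX_NODES)
        return -1;
    m->node_to_cpumask[nid] |= (uint64_t)1 << cpu;
    return 0;
}

/* Memory range to node: linear scan of the chunks; -1 if no chunk matches. */
int numa_paddr_to_nid(const struct numa_map *m, uint64_t paddr)
{
    for (size_t i = 0; i < m->nr_chunks; i++)
        if (paddr >= m->chunks[i].start && paddr < m->chunks[i].end)
            return m->chunks[i].nid;
    return -1;
}

/* Cpu to node: find the node whose cpumask contains the cpu; -1 if none. */
int numa_cpu_to_nid(const struct numa_map *m, unsigned int cpu)
{
    if (cpu >= 64)
        return -1;
    for (int nid = 0; nid < MAX_NODES; nid++)
        if (m->node_to_cpumask[nid] & ((uint64_t)1 << cpu))
            return nid;
    return -1;
}
```

A linear scan is used only because the number of chunks is small in this
example; it says nothing about how the patches perform the lookup.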

Using this information, we also modified the page allocator to provide a
simple NUMA-aware API. The modified allocator attempts to find pages
local to the requesting cpu where possible. When no local block of the
requested size is free, it falls back to non-local memory of that size
rather than fragmenting larger contiguous chunks to obtain local pages.
We expect to tune this algorithm in the future after further study.
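
To make the fallback order concrete, here is a minimal sketch of that policy
over per-node, per-order free counts. It is a model for discussion, not the
allocator from the patches: the names (struct heap, alloc_pages_on), the
node and order limits, and the final "split a larger block" step are all
assumptions made for the example.

```c
#define NR_NODES  4   /* illustrative */
#define MAX_ORDER 11  /* illustrative: orders 0..10 */

/* Number of free blocks per node and per order (block size = 2^order pages). */
struct heap {
    unsigned int nr_free[NR_NODES][MAX_ORDER];
};

/* Split the smallest larger block on `node` down to `order`.
 * Splitting an order-k block leaves one free buddy at each order in
 * [order, k-1]. Returns 0 on success, -1 if the node has no larger block. */
static int split_from(struct heap *h, int node, unsigned int order)
{
    for (unsigned int k = order + 1; k < MAX_ORDER; k++) {
        if (h->nr_free[node][k] == 0)
            continue;
        h->nr_free[node][k]--;
        for (unsigned int j = order; j < k; j++)
            h->nr_free[node][j]++;
        return 0;
    }
    return -1;
}

/* Returns the node the block was taken from, or -1 if nothing is available. */
int alloc_pages_on(struct heap *h, int local, unsigned int order)
{
    if (local < 0 || local >= NR_NODES || order >= MAX_ORDER)
        return -1;

    /* 1. A block of exactly the requested size on the local node. */
    if (h->nr_free[local][order] > 0) {
        h->nr_free[local][order]--;
        return local;
    }

    /* 2. A block of exactly the requested size on any other node. */
    for (int i = 1; i < NR_NODES; i++) {
        int n = (local + i) % NR_NODES;
        if (h->nr_free[n][order] > 0) {
            h->nr_free[n][order]--;
            return n;
        }
    }

    /* 3. Assumed last resort: fragment a larger block, local node first. */
    for (int i = 0; i < NR_NODES; i++) {
        int n = (local + i) % NR_NODES;
        if (split_from(h, n, order) == 0)
            return n;
    }
    return -1;
}
```

Step 2 coming before step 3 is the trade-off described above: locality is
given up before a larger contiguous chunk is broken apart.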

We also modified Xen's increase_reservation memory op to balance memory
distribution across the vcpus in use by a domain. Relying on previous
patches which have already been committed to xen-unstable, a guest can be
constructed such that its entire memory is contained within a specific
NUMA node.
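
One way to picture the balancing, offered as an assumption rather than a
description of the patch: spread the requested extents evenly over the
domain's vcpus and charge each share to the node that vcpu runs on. The
function name, the even split and the node limit below are hypothetical.

```c
#include <stddef.h>

#define BAL_NR_NODES 4  /* illustrative */

/* vcpu_node[v] is the node of the cpu that vcpu v runs on.
 * Fills per_node[] with the number of extents to take from each node.
 * Returns 0 on success, -1 on invalid input. */
int spread_extents(const int *vcpu_node, size_t nr_vcpus,
                   unsigned long nr_extents,
                   unsigned long per_node[BAL_NR_NODES])
{
    if (nr_vcpus == 0)
        return -1;

    for (int n = 0; n < BAL_NR_NODES; n++)
        per_node[n] = 0;

    unsigned long base = nr_extents / nr_vcpus;  /* every vcpu gets this many */
    unsigned long rem  = nr_extents % nr_vcpus;  /* first `rem` vcpus get one more */

    for (size_t v = 0; v < nr_vcpus; v++) {
        int nid = vcpu_node[v];
        if (nid < 0 || nid >= BAL_NR_NODES)
            return -1;
        per_node[nid] += base + (v < rem ? 1 : 0);
    }
    return 0;
}
```

With this scheme, a domain whose vcpus all sit on one node ends up with all
of its extents charged to that node, which is the single-node guest case
mentioned above.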

We've added a keyhandler for exposing some of the NUMA-related
information and statistics that pertain to the hypervisor.

We export NUMA system information via the physinfo hypercall. This
information provides cpu/memory topology and configuration information
gleaned from the SRAT tables to userspace applications. Currently, xend
doesn't leverage any of the information automatically but we intend to
do so in the future.

We've integrated NUMA information into xentrace so we can track various
points such as page allocator hits and misses as well as other
information. In the process of implementing the trace, we also fixed
some incorrect assumptions about the symmetry of NUMA systems w.r.t. the
sockets_per_node value. Details are available in a later email with the
patch.
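
The actual fix is in the patch itself; the sketch below only illustrates,
under our own assumptions, why a single system-wide per-node value can
mislead on an asymmetric machine. Both function names are hypothetical, and
the cpumask is modelled as a plain 64-bit word.

```c
#include <stdint.h>

/* Count the cpus on one node from its cpumask (bit n set => cpu n on node). */
unsigned int cpus_on_node(uint64_t cpumask)
{
    unsigned int n = 0;
    while (cpumask) {
        cpumask &= cpumask - 1;  /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* The symmetric shortcut: assumes every node holds the same number of cpus. */
unsigned int cpus_per_node_symmetric(unsigned int nr_cpus, unsigned int nr_nodes)
{
    return nr_nodes ? nr_cpus / nr_nodes : 0;
}
```

For a two-node system with four cpus on one node and two on the other, the
symmetric shortcut reports three cpus per node, which matches neither node;
counting from each node's own cpumask does not have that problem.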

These patches have been tested on several IBM NUMA and non-NUMA systems:

NUMA-aware systems:
    IBM Dual Opteron: 2 Node,  2 CPU,  4GB
    IBM x445        : 4 Node, 32 CPU, 32GB
    IBM x460        : 1 Node,  8 CPU, 16GB
    IBM x460        : 2 Node, 32 CPU, 32GB

Non NUMA-aware systems (i.e., no SRAT tables):
    IBM Dual Xeon   : 1 Node,  2 CPU,  2GB
    IBM P4          : 1 Node,  1 CPU,  1GB

We look forward to your review of the patches for acceptance.

--
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
(512) 838-9253 T/L: 678-9253
ryanh@xxxxxxxxxx