[Xen-devel] [RFC] Xen NUMA strategy
Hi,
Anthony Xu and I have had some fruitful discussions about the future
direction of NUMA support in Xen. I want to share the results with
the Xen community and start a discussion.
We came up with two different approaches for better NUMA support in Xen:
1.) Guest NUMA support: spread a guest's resources (CPUs and memory)
over several nodes and propagate the appropriate topology to the guest.
The first part of this is in the patches I recently sent to the list (PV
support is on its way, and bells and whistles like automatic placement
will follow, too).
***Advantages***:
- The guest OS has better means to deal with the NUMA setup: it can
easily migrate _processes_ among the nodes (Xen-HV can only migrate
whole domains).
- Changes to Xen are relatively small.
- Guest resources are not limited to a single node, since a guest can
use more resources than one node provides.
- If guests are well spread over the nodes, the system is more balanced
even if guests are destroyed and created later.
***Disadvantages***:
- The guest has to support NUMA. This is not true for older guests
(Win2K, older Linux).
- The guest's workload has to fit NUMA. If the guest's tasks are hardly
parallelizable or use a lot of shared memory, they cannot take advantage
of NUMA and will degrade in performance. This includes all single-task
problems.
In general this approach seems to fit better with smaller NUMA nodes and
larger guests.
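To make the idea of spreading a guest a bit more concrete, here is a
minimal sketch in Python. It is not taken from the patches; the function
name spread_guest and the simple even split are assumptions for
illustration only. It shows the kind of per-node layout (vCPUs and
memory) that would have to be propagated to a NUMA-aware guest.

```python
def spread_guest(vcpus, mem_mb, nodes):
    """Split a guest's vCPUs and memory evenly over `nodes` NUMA nodes.

    Hypothetical helper, not part of Xen. Returns one (vcpu_ids, mem_mb)
    tuple per node, i.e. the virtual topology the guest would be told about.
    """
    if nodes < 1 or vcpus < nodes:
        raise ValueError("need at least one vCPU per node")
    topology = []
    next_vcpu = 0
    for node in range(nodes):
        # Lower-numbered nodes absorb the remainder so the totals add up.
        n_cpus = vcpus // nodes + (1 if node < vcpus % nodes else 0)
        n_mem = mem_mb // nodes + (1 if node < mem_mb % nodes else 0)
        topology.append((list(range(next_vcpu, next_vcpu + n_cpus)), n_mem))
        next_vcpu += n_cpus
    return topology
```

For example, spread_guest(6, 4096, 2) puts vCPUs 0-2 with 2048 MB on the
first node and vCPUs 3-5 with 2048 MB on the second. A real
implementation would also have to respect what is actually free on each
node.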
2.) Dynamic load balancing and page migration: create guests within one
NUMA node and distribute all guests across the nodes. If the system
becomes imbalanced, migrate guests to other nodes and copy (at least
part of) their memory pages to the other node's local memory.
***Advantages***:
- No guest NUMA support necessary. Older as well as recent guests should
run fine.
- Smaller guests don't have to cope with NUMA and will have 'flat'
memory available.
- Guests running on separate nodes usually don't disturb each other and
can benefit from the higher distributed memory bandwidth.
***Disadvantages***:
- Guests are limited to the resources available on one node. This
applies to both the number of CPUs and the amount of memory.
- Costly migration of guests. In a simple implementation we'd use live
migration, which requires the guest's whole memory to be copied before
the guest starts to run on the other node. If the move proves to be
unnecessary a few minutes later, all of this was in vain. A more
advanced implementation would migrate pages in the background and could
avoid this problem by migrating the hot pages first.
- Integration into Xen seems to be more complicated (at least for the
less gifted hackers among us).
This approach seems to be more reasonable if you have larger nodes (for
instance 16 cores) and smaller guests (the more usual case nowadays?).
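A rough Python sketch of the two decisions in this second approach:
which node a new guest goes to, and in which order pages would be copied
during a background migration. The names pick_node and migration_order,
the "most free memory" policy and the per-page access counters are all
assumptions made for illustration; nothing like this exists in Xen in
this form.

```python
def pick_node(free_mem_mb, guest_mem_mb):
    """Return the index of the node with the most free memory that can
    hold the whole guest, or None if no single node is big enough.

    Hypothetical placement policy; ties go to the lowest node index.
    """
    candidates = [
        (free, idx) for idx, free in enumerate(free_mem_mb)
        if free >= guest_mem_mb
    ]
    if not candidates:
        return None  # the guest does not fit on any one node
    _, best_idx = max(candidates, key=lambda c: (c[0], -c[1]))
    return best_idx


def migration_order(access_counts):
    """Return page numbers ordered hottest first.

    `access_counts` maps page number -> access count (an assumed metric).
    Pages with equal counts are ordered by page number.
    """
    return sorted(access_counts, key=lambda page: (-access_counts[page], page))
```

The None result of pick_node is exactly the "limited to one node"
disadvantage listed above: a guest larger than every node's free memory
cannot be placed at all with this approach.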
After some discussion we came to the conclusion that both approaches
should be implemented. I want to put this to the list and am looking
forward to any feedback.
Regards,
Andre.
--
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany
Tel: +49 351 277-84917
----to satisfy European Law for business letters:
AMD Saxony Limited Liability Company & Co. KG
Registered office (business address): Wilschdorfer Landstr. 101, 01109 Dresden,
Germany
Register court Dresden: HRA 4896
General partner authorized to represent: AMD Saxony LLC (registered office Wilmington,
Delaware, USA)
Managing directors of AMD Saxony LLC: Dr. Hans-R. Deppe, Thomas McCoy
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel