Xen project Mailing List

Re: [Xen-devel] [PATCH 08 of 10 v3] libxl: enable automatic placement of guests on NUMA nodes

On Wed, 2012-07-04 at 18:18 +0200, Dario Faggioli wrote: > # HG changeset patch > # User Dario Faggioli <raistlin@xxxxxxxx> > # Date 1341416324 -7200 > # Node ID 7087d3622ee2051654c9e78fe4829da10c2d46f1 > # Parent 6fd693e7f3bc8b4d9bd20befff2c13de5591a7c5 > libxl: enable automatic placement of guests on NUMA nodes > George, Ian (Jackson), > Once we know which ones, among all the possible combinations, represents valid > placement candidates for a domain, use some heuistics for deciding which is > the > best. For instance, smaller candidates are considered to be better, both from > the domain's point of view (fewer memory spreading among nodes) and from the > system as a whole point of view (fewer memoy fragmentation). In case of > candidates of equal sizes (i.e., with the same number of nodes), the amount of > free memory and the number of domain already assigned to their nodes are > considered. Very often, candidates with greater amount of memory are the one > we wants, as this is also good for keeping memory fragmentation under control. > However, if the difference in how much free memory two candidates have, the > number of assigned domains might be what decides which candidate wins. > > [...] > > --- > Changes from v2: > > [...] > > * Comparison function for candidates changed so that it now provides > total ordering, as requested during review. It is still using FP > arithmetic, though. Also I think that just putting the difference > between the amount of free memory and between the number of assigned > domains of two candidates in a single formula (after normalizing and > weighting them) is both clear and effective enough. > > [...] > I thought at what a sensible comparison function should look like, and I also plotted some graphs while randomly generating both the amount of free memory and number of domains. The outcome of all this is I managed in convincing myself the solution below is both clear and understandable as it is effective (confirmed by the test I was able to perform up to now). Basically, the idea is to consider both things (freemem and nr_domains), but with freemem being 3 times more "important" than nr_domains. That is very similar to one of the log()-based solutions proposed by Ian, but I really think just normalizing and weighting is easier to understand, even if quickly looking at the formula/code. Regarding the percent penalty per each domain proposed by George on IRC, I liked the idea a lot, but the figured out that here will always be a number of domains (e.g., 20, if the penalty is set to 5%) starting from which nr_domains counts more than freemem, which is the opposite of what I want to achieve. So, now, comparison between placement candidates happens like this: return a.nr_nodes < b.nr_nodes ? a : b.nr_nodes < a.nr_nodes ? b : 3*norm_diff(b.freemem, a.freemem) - norm_diff(a.nr_domains, b.nr_domains) where: norm_diff(x, y) := (x - y)/max(x, y) Which removes the nasty effects of having that 10% range as in v2 of the series. If that '3' looks too much of a magic number, I can of course enum/#define it, or even make it configurable (although, the latter, not for 4.2, I guess :-) ). > +/* Subtract two values and translate the result in [0, 1] */ > +static double normalized_diff(double a, double b) > +{ > +#define max(a, b) (a > b ? a : b) > + if (!a && a == b) > + return 0.0; > + return (a - b) / max(a, b); > +} > + > +/* > + * The NUMA placement candidates are reordered according to the following > + * heuristics: > + * - candidates involving fewer nodes come first. In case two (or > + * more) candidates span the same number of nodes, > + * - the amount of free memory and the number of domains assigned to the > + * candidates are considered. In doing that, candidates with greater > + * amount of free memory and fewer domains assigned to them are preferred, > + * with free memory "weighting" three times as much as number of domains. > + */ > +static int numa_cmpf(const void *v1, const void *v2) > +{ > + const libxl__numa_candidate *c1 = v1; > + const libxl__numa_candidate *c2 = v2; > +#define sign(a) a > 0 ? 1 : a < 0 ? -1 : 0 > + double freememkb_diff = normalized_diff(c2->free_memkb, c1->free_memkb); > + double nrdomains_diff = normalized_diff(c1->nr_domains, c2->nr_domains); > + > + if (c1->nr_nodes != c2->nr_nodes) > + return c1->nr_nodes - c2->nr_nodes; > + > + return sign(3*freememkb_diff + nrdomains_diff); > +} > + What do you think? Thanks and Regards, Dario -- <<This happens because I choose it to happen!>> (Raistlin Majere) ----------------------------------------------------------------- Dario Faggioli, Ph.D, http://retis.sssup.it/people/faggioli Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

Attachment: signature.asc
Description: This is a digitally signed message part

_______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.