
[Xen-devel] Re: NUMA guest: best-fit-nodes algorithm (was Re: [PATCH 00/11] PV NUMA Guests)


  • To: Andre Przywara <andre.przywara@xxxxxxx>
  • From: Dulloor <dulloor@xxxxxxxxx>
  • Date: Sat, 24 Apr 2010 02:51:45 -0400
  • Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>, "Nakajima, Jun" <jun.nakajima@xxxxxxxxx>, "Cui, Dexuan" <dexuan.cui@xxxxxxxxx>
  • Delivery-date: Fri, 23 Apr 2010 23:52:31 -0700
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

On Fri, Apr 23, 2010 at 8:45 AM, Andre Przywara <andre.przywara@xxxxxxx> wrote:
> Dulloor wrote:
>> Cui, Dexuan <dexuan.cui@xxxxxxxxx> wrote:
>>> xc_select_best_fit_nodes() decides the "min-set" of host nodes that
>>> will be used for the guest. It only considers the current memory
>>> usage of the system. Maybe we should also consider the cpu load? And
>>> the number of the nodes must be 2^n? And how to handle the case
>>> #vcpu is < #vnode?
>>> And it looks like your patches only consider the guest's memory requirement
>>> -- guest's vcpu requirement is neglected? e.g., a guest may not need
>>> a very large amount of memory while it needs many vcpus.
>>> xc_select_best_fit_nodes() should consider this when
>>> determining the number of vnode.
>> I agree with you. I was planning to consider vcpu load as the next
>> step. Also, I am looking for a good heuristic. I looked at the
>> nodeload heuristic (currently in xen), but found it too naive.
>> But, if you/Andre think it is a good heuristic, I will add the
>> support. Actually, I think in future we should do away with strict
>> vcpu-affinities and rely more on a scheduler with necessary NUMA
>> support to complement our placement strategies.
>>
>> As of now, we don't SPLIT, if #vcpu < #vnode. We use STRIPING in that
>> case.
> Determining the current load of a node is quite a hard thing to do currently
> in Xen. If guests are pinned to nodes (which I'd consider necessary with the
> current credit scheduler), then using this affinity is a good heuristic to
> find good nodes, at least the best I can think of. So until we have a NUMA
> aware scheduler, we should go with this solution. Of course it only measures
> the theoretical load of a node and doesn't distinguish between idle and
> loaded guests. One would need something like a permanently running xm top to
> gather statistics about the guest's load, but that is something for a future
> patch.
> (Or is there a guest load metric already measured in Xen?)
Yeah, with the current credit scheduler, it looks like affinity is the
only thing we can use as a load heuristic. I will add that to the node
selection algorithm, similar to what you do in calculating nodeload.
Gathering guest load statistics over a period of time could be useful
too, but it is unclear how any temporal behaviour could aid permanent
memory placement.
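
A rough sketch of what combining the two criteria could look like, for
illustration only. This is not the code of xc_select_best_fit_nodes();
the function names, the even memory split across the chosen nodes, and
the rule that a vcpu's load is shared equally by every node its affinity
mask touches are all assumptions made for the example.

```python
from itertools import combinations


def node_loads(cpu_to_node, vcpu_affinities, num_nodes):
    """Estimate per-node load from vcpu affinity masks alone.

    cpu_to_node: list mapping physical cpu index -> node index.
    vcpu_affinities: one set of physical cpus per existing vcpu.
    Assumption: a vcpu contributes a total load of 1, shared equally
    by all nodes its affinity mask touches. Idle and busy guests are
    not distinguished, so this is only the theoretical load.
    """
    loads = [0.0] * num_nodes
    for affinity in vcpu_affinities:
        nodes = {cpu_to_node[cpu] for cpu in affinity}
        for node in nodes:
            loads[node] += 1.0 / len(nodes)
    return loads


def select_nodes(free_mem, loads, guest_mem):
    """Pick the smallest node set that fits the guest, least loaded first.

    Assumption: guest memory is split evenly, so every chosen node must
    have at least guest_mem / k free. Among all sets of the minimal
    size k, the one with the lowest summed load wins. Exhaustive search
    is acceptable here because hosts have only a handful of nodes.
    Returns a tuple of node indices, or None if nothing fits.
    """
    n = len(free_mem)
    for k in range(1, n + 1):
        share = guest_mem / k
        candidates = [
            combo
            for combo in combinations(range(n), k)
            if all(free_mem[i] >= share for i in combo)
        ]
        if candidates:
            return min(candidates, key=lambda c: sum(loads[i] for i in c))
    return None
```

Memory still decides how many nodes are needed; the affinity-derived
load is only used to break ties between equally small node sets.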

I have started looking into load balancing and NUMA-related work for
credit2. I hope to send out something in the coming weeks.

>
> Regards,
> Andre.
>
>
> --
> Andre Przywara
> AMD-Operating System Research Center (OSRC), Dresden, Germany
> Tel: +49 351 448-3567-12
>
>
-dulloor

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

