[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 1 of 3] libxl: take node distances into account during NUMA placement



On Fri, 2012-10-19 at 15:57 +0100, Ian Jackson wrote:
> Dario Faggioli writes ("Re: [Xen-devel] [PATCH 1 of 3] libxl: take node 
> distances into account during NUMA placement"):
> > On Thu, 2012-10-18 at 16:17 +0100, George Dunlap wrote:
> > > And now we're doing N^2 for each
> > > candidate? 
> > 
> > Again, yes, but that is turning it from Ndoms*Nvcpus to
> > Ndoms*Nvcpus+Nnodes^2, which is still dominated by the first term. IIRC,
> > Andre tried to start >50 domains with 2 vCPUs on a 8 nodes system, which
> > means 50*2 vs 8*8.
> 
> Are you sure about this calculation ?  
>
I am. In fact, I think they answer George's question which was "by how
much we are increasing the complexity of each step?" and not "by how
much we are increasing the overall complexity?".

That of course doesn't mean I don't care about the overall complexity,
just that I don't think this is making things worse than what we have
(which is, btw, the result of the long discussion we had pre 4.2
release).

> It seems to me that 
> doing Nnodes^2 for each candidate multiplies the cost by the number of
> candidates, rather than adding Nnodes^2.
> 
That is definitely true. Again the key is the difference between talking
about "each candidate", as we were, and total.

> There are in the worst case nearly 2^Nnodes candidates (combinations
> of nodes).  So the cost is
>   <= 2^Nnodes * Nnodes^2
> 
> Your algorithm runs with up to 16 nodes.  Your change here increases
> the worst-case cost from
>   2^16 = 65556
> to
>   2^16 * 16^2 = 16777216
> 
Ok, I'm fine with the "let's throw numbers" game, so let me throw
mine[*] ones before proposing a couple of possible strategies. :-D

So, the current exact calculation for the number of candidates that are
generated (in the worst case, i.e., when the only suitable candidate for
the domain being created is the very last one evaluated):

combs=0;
N=16;
for k=1:N
  combs=combs+bincoeff(N,k);
endfor;
combs
combs =  65535

Now, the change about distances adds to each step something that can be
upper bounded by this:

steps=0;
N=16;
for i=0:N
  for j=1:N-i
    steps++;
  endfor;
endfor;
steps
steps =  136

Hence:

combs*steps
ans =  8912760

But this is of course not tight, and a more accurate calculation of the
worst case overall required number of "steps" you're interested in
should look more like this:

combs=0;
N=16;
for k=1:N
  steps=0;
  for i=0:k
    for j=1:k-i
      steps++;
    endfor;
  endfor;
  combs=combs+bincoeff(n,k)*steps;
endfor;
combs
combs =  2490368

So not exactly 16777216, although, don't get me wrong, I agree it is
huge.

So, now that the math has been taken care of, here it comes the time for
the proposals I was talking about before.

1. For this specific issue, I can try to change the way I use the 
  distances matrix and the way I define and calculate a measure of how
  distant the nodes in a candidate are from each others, trying to make
  it at least linear in computation time. It's not immediate, mainly
  because it's a matrix after all, but maybe I can save some
  intermediate results during the process and amortize the complexity
  (I've ideas, but nothing precise enough to share yet). Does that make
  sense to you?

2. Independently from this specific issue, I think we need something to
  determine where a reasonable limit lie, and whether what we have and
  the changes we will make result or not in a decent domain creation  
  time.
  That's something very hard to do, but I was thinking about writing
  some sort of 'simulator', i.e., a standalone program that calls the
  placement function and intercept the libxl_ calls it does, to create
  and let it run in an artificial scenario of choice.
  Something like that has already been suggested during one of the last
  rounds of review (although for different purposes), and I thought it
  was a real good idea... Now I think is something we definitely
  need. :-)
  I know it's not going to reach usec-level precision (hypercall being
  simulated by function calls, and stuff like that), but I think it
  could be useful... You?

Thanks again for looking into this and Regards,
Dario

[*] copying and pasting the lines in octave should work.

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://retis.sssup.it/people/faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

Attachment: signature.asc
Description: This is a digitally signed message part

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.