
Re: [Xen-devel] [PATCH] permute with 2MB chunk



Ian Pratt wrote:
We also tested building an HVM guest with the permuted ordering of pages, versus reverse ordering, versus normal ordering. Only the permuted ordering showed the problem. We assume that the permute() function has an unfortunate interaction with the memory allocator in certain HVM guest OSes, causing poor cache utilisation.

It's still very odd that the permutation fn only seems to affect Linux
running as an HVM guest and not as a PV guest. I still think there's
something we're not quite understanding.

Jean: have you definitely verified that building a domain with the
permute function does not affect Linux PV guests?


Here is a new version of the permute patch; it should be applied instead of the previous one. It now works with PV guests. Sorry for the delay.

Signed-off-by: Jean Guyader <jean.guyader@xxxxxxxxxxxxx>

--
Jean Guyader
diff -r 76c9cf11ce23 tools/libxc/xc_domain_save.c
--- a/tools/libxc/xc_domain_save.c      Fri Mar 21 09:45:34 2008 +0000
+++ b/tools/libxc/xc_domain_save.c      Tue Mar 25 12:31:42 2008 +0000
@@ -123,6 +123,32 @@ static inline int count_bits ( int nr, v
     for ( i = 0; i < (nr / (sizeof(unsigned long)*8)); i++, p++ )
         count += hweight32(*p);
     return count;
+}
+
+static inline int permute(unsigned long i, unsigned long nr,
+                          unsigned long order_nr)
+{
+    /* Need a simple permutation function so that we scan pages in a
+       pseudo-random order, enabling us to get a better estimate of
+       the domain's page dirtying rate as we go (there are often
+       contiguous ranges of pfns that have similar behaviour, and we
+       want to mix them up). */
+
+    unsigned char keep = 9; /* chunk of 2MB: 2^9 pages of 4kB */
+    unsigned char shift_high = (order_nr - keep) / 2;
+    unsigned char shift_low = order_nr - keep - shift_high;
+
+    /* Re-apply the permutation until the result is in range. */
+    do
+    {
+        unsigned long high = i >> (keep + shift_low);
+        unsigned long low = (i >> keep) & ((1UL << shift_low) - 1);
+        i = (i & ((1UL << keep) - 1)) |
+            (low << (shift_high + keep)) | (high << keep);
+    } while ( i >= nr );
+
+    return i;
 }
 
 static uint64_t tv_to_us(struct timeval *new)
@@ -735,6 +761,7 @@ static xen_pfn_t *map_and_save_p2m_table
         p2m_frame_list[i/FPP] = mfn_to_pfn(p2m_frame_list[i/FPP]);
     }
 
+    memset(&ctxt, 0, sizeof (ctxt));
     if ( xc_vcpu_getcontext(xc_handle, dom, 0, &ctxt.c) )
     {
         ERROR("Could not get vcpu context");
@@ -828,6 +855,8 @@ int xc_domain_save(int xc_handle, int io
 
     /* base of the region in which domain memory is mapped */
     unsigned char *region_base = NULL;
+
+    int order_nr = 0;
 
     /* bitmap of pages:
        - that should be sent this iteration (unless later marked as skip);
@@ -937,6 +966,11 @@ int xc_domain_save(int xc_handle, int io
 
     /* pretend we sent all the pages last iteration */
     sent_last_iter = p2m_size;
+
+    /* calculate the power of 2 order of p2m_size, e.g.
+       15->4 16->4 17->5 */
+    for ( i = p2m_size-1, order_nr = 0; i ; i >>= 1, order_nr++ )
+        continue;
 
     /* Setup to_send / to_fix and to_skip bitmaps */
     to_send = malloc(BITMAP_SIZE);
@@ -1088,7 +1122,7 @@ int xc_domain_save(int xc_handle, int io
                    (batch < MAX_BATCH_SIZE) && (N < p2m_size);
                    N++ )
             {
-                int n = N;
+                int n = permute(N, p2m_size, order_nr);
 
                 if ( debug )
                 {
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel

 

