
[Xen-devel] [PATCH] scrub pages on guest termination



This patch solves the following problem.  When a large VS terminates, the node
locks up: the page_scrub_kick routine sends a softirq to every online processor
instructing it to run the page scrub code, and the processors then interfere
with one another as they all serialize behind the page_scrub_lock.
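
For context, the pre-patch path looks roughly like this (condensed from
xen/include/xen/mm.h and xen/common/page_alloc.c for illustration only; the
scrub-loop body is elided):

/* Pre-patch kick: wake every online cpu whenever pages are queued. */
#define page_scrub_kick()                                               \
    do {                                                                \
        if ( !list_empty(&page_scrub_list) )                            \
            cpumask_raise_softirq(cpu_online_map, PAGE_SCRUB_SOFTIRQ);  \
    } while ( 0 )

/* Every cpu that takes the softirq then contends for the same lock. */
static void page_scrub_softirq(void)
{
    s_time_t start = NOW();

    do {
        spin_lock(&page_scrub_lock);    /* all cpus serialize here */
        /* ... peel up to 16 pages off page_scrub_list ... */
        spin_unlock(&page_scrub_lock);
        /* ... scrub and free the peeled pages ... */
    } while ( (NOW() - start) < MILLISECS(1) );

    set_timer(&this_cpu(page_scrub_timer), NOW() + MILLISECS(10));
}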

The patch does two things:

(1) In page_scrub_kick, only a single cpu is interrupted.  Some cpu other than
the calling cpu is chosen (if available), on the assumption that the calling cpu
has other higher-priority work to do.

(2) In page_scrub_softirq, if more than one cpu is online, the first cpu
to start scrubbing designates itself as the primary_scrubber.  As such, it is
dedicated to scrubbing pages until the list is empty.  Other cpus may still
call page_scrub_softirq, but they spend only 1 msec scrubbing before
returning to check for other higher-priority work.  With multiple cpus
online, the node can afford to dedicate one cpu to scrubbing while that
work needs to be done.  A condensed sketch of both changes follows.
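
The sketch below condenses both changes for illustration only (the scrub-loop
body is elided; the diff further down is the authoritative change):

/* (1) page_scrub_kick: wake a single cpu, preferring one other than the caller. */
#define page_scrub_kick()                                       \
    do {                                                        \
        if ( !list_empty(&page_scrub_list) ) {                  \
            int cpu;                                            \
            for_each_online_cpu(cpu) {                          \
                if ( cpu != smp_processor_id() ) {              \
                    cpu_raise_softirq(cpu, PAGE_SCRUB_SOFTIRQ); \
                    break;                                      \
                }                                               \
            }                                                   \
            if ( cpu >= NR_CPUS )  /* no other cpu online */    \
                raise_softirq(PAGE_SCRUB_SOFTIRQ);              \
        }                                                       \
    } while ( 0 )

/* (2) page_scrub_softirq: on SMP, the first scrubber becomes the
 * primary_scrubber and loops until the list is empty; any other cpu
 * still limits itself to roughly 1 msec per pass. */
static void page_scrub_softirq(void)
{
    static int primary_scrubber = -1;
    s_time_t   start = NOW();

    do {
        spin_lock(&page_scrub_lock);

        if ( list_empty(&page_scrub_list) )
        {
            if ( primary_scrubber == smp_processor_id() )
                primary_scrubber = -1;      /* work done; stand down */
            spin_unlock(&page_scrub_lock);
            return;
        }

        if ( (primary_scrubber == -1) && (num_online_cpus() > 1) )
            primary_scrubber = smp_processor_id();

        /* ... peel up to 16 pages off page_scrub_list ... */
        spin_unlock(&page_scrub_lock);
        /* ... scrub and free the peeled pages ... */
    } while ( (primary_scrubber == smp_processor_id()) ||
              ((NOW() - start) < MILLISECS(1)) );

    set_timer(&this_cpu(page_scrub_timer), NOW() + MILLISECS(10));
}

Note that primary_scrubber is assigned and cleared while page_scrub_lock is
held, so the dedicated role is released automatically once the list drains.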

Signed-off-by: Robert Phillips <rphillips@xxxxxxxxxxxxxxx>
Signed-off-by: Ben Guthro <bguthro@xxxxxxxxxxxxxxx>
diff -r 29dc52031954 xen/common/page_alloc.c
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -984,16 +984,23 @@
     void             *p;
     int               i;
     s_time_t          start = NOW();
+    static int        primary_scrubber = -1;
 
-    /* Aim to do 1ms of work every 10ms. */
+    /* Unless SMP, aim to do 1ms of work every 10ms. */
     do {
         spin_lock(&page_scrub_lock);
 
         if ( unlikely((ent = page_scrub_list.next) == &page_scrub_list) )
         {
+            if (primary_scrubber == smp_processor_id())
+                primary_scrubber = -1;
             spin_unlock(&page_scrub_lock);
             return;
         }
+        
+        /* If SMP, dedicate a cpu to scrubbing til the job is done */
+        if (primary_scrubber == -1 && num_online_cpus() > 1)
+            primary_scrubber = smp_processor_id();
         
         /* Peel up to 16 pages from the list. */
         for ( i = 0; i < 16; i++ )
@@ -1020,7 +1027,7 @@
             unmap_domain_page(p);
             free_heap_pages(pfn_dom_zone_type(page_to_mfn(pg)), pg, 0);
         }
-    } while ( (NOW() - start) < MILLISECS(1) );
+    } while ( primary_scrubber == smp_processor_id() || (NOW() - start) < MILLISECS(1) );
 
     set_timer(&this_cpu(page_scrub_timer), NOW() + MILLISECS(10));
 }
diff -r 29dc52031954 xen/include/xen/mm.h
--- a/xen/include/xen/mm.h
+++ b/xen/include/xen/mm.h
@@ -90,10 +90,21 @@
         if ( !list_empty(&page_scrub_list) )    \
             raise_softirq(PAGE_SCRUB_SOFTIRQ);  \
     } while ( 0 )
-#define page_scrub_kick()                                               \
-    do {                                                                \
-        if ( !list_empty(&page_scrub_list) )                            \
-            cpumask_raise_softirq(cpu_online_map, PAGE_SCRUB_SOFTIRQ);  \
+
+#define page_scrub_kick()                                       \
+    do {                                                        \
+        if ( !list_empty(&page_scrub_list) ) {                  \
+            int cpu;                                            \
+            /* Try to use some other cpu. */                    \
+            for_each_online_cpu(cpu) {                          \
+                if (cpu != smp_processor_id()) {                \
+                    cpu_raise_softirq(cpu, PAGE_SCRUB_SOFTIRQ); \
+                    break;                                      \
+                }                                               \
+            }                                                   \
+            if (cpu >= NR_CPUS)                                 \
+                raise_softirq(PAGE_SCRUB_SOFTIRQ);              \
+        }                                                       \
     } while ( 0 )
 unsigned long avail_scrub_pages(void);
 