
[Xen-devel] [PATCH]: Allow Xen to boot/run on large memory (>64G) machines



All,
     I've been tracking down a problem where dom0 refuses to boot on very large
memory x86_64 machines.  Here's what happens:

The hypervisor starts up with 1GB in the DMA zone.  Two large allocations come
out of the DMA zone: the frame table (in init_frametable()) and the memory for
dom0 (in construct_dom0()).  With a lot of memory in the box, init_frametable()
takes most of the DMA zone; so much, in fact, that there is no room left for
the allocation in construct_dom0(), and dom0 fails to boot with:

(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) Not enough RAM for domain 0 allocation.
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...
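
For a sense of scale: the frame table holds one struct page_info per page of
RAM, so its size grows linearly with installed memory.  The following is a
rough sketch of that arithmetic.  The 40-byte entry size is an assumption for
illustration, not a figure taken from the 3.0.3 source, and frametable_bytes()
and dma_zone_left() are hypothetical helpers, not Xen functions.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12  /* 4KB pages on x86_64 */

/* Bytes of frame table needed to describe ram_bytes of RAM, given an
 * assumed per-page entry size (hypothetical parameter). */
static uint64_t frametable_bytes(uint64_t ram_bytes, uint64_t page_info_bytes)
{
    assert(page_info_bytes > 0);
    uint64_t nr_pages = ram_bytes >> PAGE_SHIFT;
    return nr_pages * page_info_bytes;
}

/* Bytes left in a zone of zone_bytes once the frame table has been carved
 * out of it; 0 if the frame table does not fit at all. */
static uint64_t dma_zone_left(uint64_t ram_bytes, uint64_t page_info_bytes,
                              uint64_t zone_bytes)
{
    uint64_t ft = frametable_bytes(ram_bytes, page_info_bytes);
    return ft >= zone_bytes ? 0 : zone_bytes - ft;
}
```

Under these assumptions, frametable_bytes(96ULL << 30, 40) comes to 960MB,
which would leave only 64MB of a 1GB zone for everything else, including the
dom0 allocation.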

The solution (suggested by Keir) is to allocate the frame table out of high
memory instead of the DMA zone.  The attached patch (against 3.0.3, but the
problem is the same in unstable) does this.  I tested it on a 96GB machine:
without the patch, the machine reboots as described above; with the patch, I
was able to boot dom0 and create a PV guest with 92GB of memory.
     On ia64 I have only compile-tested this, but I don't see anything in it
that should cause problems there.
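
The top-down search behind alloc_boot_pages_reverse() can be sketched in
isolation as below.  This is a toy model, not the patch: the bitmap, its
64-page size and every toy_* name are hypothetical.  It also differs from the
patch in three deliberate ways: it rounds the starting page down to the
alignment, it stops before the unsigned page number can wrap below zero, and
it reports failure separately from the page number so that page 0 is not
ambiguous.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_PAGES 64UL  /* hypothetical toy memory size, in pages */

static bool toy_allocated[TOY_MAX_PAGES];  /* stand-in for the boot bitmap */

/* True if every page in [pg, pg + nr) is still free. */
static bool toy_range_free(unsigned long pg, unsigned long nr)
{
    for (unsigned long i = 0; i < nr; i++)
        if (toy_allocated[pg + i])
            return false;
    return true;
}

/* Mark every page in [pg, pg + nr) as allocated. */
static void toy_mark(unsigned long pg, unsigned long nr)
{
    for (unsigned long i = 0; i < nr; i++)
        toy_allocated[pg + i] = true;
}

/* Search from the top of memory downwards for nr free pages starting on a
 * multiple of align.  On success, store the first page in *out and return
 * true; return false if nothing fits. */
static bool toy_alloc_reverse(unsigned long nr, unsigned long align,
                              unsigned long *out)
{
    assert(align > 0);
    if (nr == 0 || nr > TOY_MAX_PAGES)
        return false;

    /* Highest aligned start at which nr pages still fit. */
    unsigned long pg = ((TOY_MAX_PAGES - nr) / align) * align;

    for (;;) {
        if (toy_range_free(pg, nr)) {
            toy_mark(pg, nr);
            *out = pg;
            return true;
        }
        if (pg < align)  /* the next step down would wrap around */
            return false;
        pg -= align;
    }
}
```

On an empty toy bitmap, two successive calls toy_alloc_reverse(8, 8, &pg)
hand back pages 56 and then 48, so the low pages stay free for later callers.
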
     Note that this is not the end of the story, however.  On even larger
machines, the allocation in construct_dom0() can *still* fail; in particular,
if the order goes above 17, it fails in the same way.  One way to fix that
would be to allocate that memory out of the normal zone on x86_64 as well;
however, I'm not sure whether this would break anything else.  Any comments?
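
To put the order-17 limit in numbers: an order-n allocation is 2^n contiguous
pages, so with 4KB pages order 17 is 512MB and order 18 is a full 1GB, that
is, the entire DMA zone.  A minimal sketch of that arithmetic follows;
order_from_pages() and order_bytes() are hypothetical helpers written for
this illustration.

```c
#include <assert.h>

#define PAGE_SHIFT 12  /* 4KB pages */

/* Smallest order such that 2^order pages cover nr_pages. */
static unsigned int order_from_pages(unsigned long long nr_pages)
{
    unsigned int order = 0;

    assert(nr_pages > 0);
    while ((1ULL << order) < nr_pages)
        order++;
    return order;
}

/* Size in bytes of one allocation of the given order. */
static unsigned long long order_bytes(unsigned int order)
{
    return (1ULL << order) << PAGE_SHIFT;
}
```

For example, order_from_pages(131072) is 17 and order_bytes(17) is 512MB; a
single page more pushes the request to order 18, which a 1GB zone that already
holds other allocations presumably cannot satisfy.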

Signed-off-by: Chris Lalancette <clalance@xxxxxxxxxx>
diff -urp xen.orig/arch/x86/mm.c xen/arch/x86/mm.c
--- xen.orig/arch/x86/mm.c      2007-02-21 14:45:38.000000000 -0500
+++ xen/arch/x86/mm.c   2007-02-21 16:11:34.000000000 -0500
@@ -179,7 +179,15 @@ void __init init_frametable(void)
 
     for ( i = 0; i < nr_pages; i += page_step )
     {
+#ifdef __x86_64__
+        /* for x86_64 we want to allocate the frame table from the top
+         * of memory rather than the bottom; otherwise, on large memory
+         * machines (> 64G), we exhaust DMA memory, and dom0 cannot boot
+         */
+        mfn = alloc_boot_pages_reverse(min(nr_pages - i, page_step), page_step);
+#else
         mfn = alloc_boot_pages(min(nr_pages - i, page_step), page_step);
+#endif
         if ( mfn == 0 )
             panic("Not enough memory for frame table\n");
         map_pages_to_xen(
diff -urp xen.orig/common/page_alloc.c xen/common/page_alloc.c
--- xen.orig/common/page_alloc.c        2006-10-16 16:07:17.000000000 -0400
+++ xen/common/page_alloc.c     2007-02-21 16:09:38.000000000 -0500
@@ -213,26 +213,44 @@ void init_boot_pages(paddr_t ps, paddr_t
     }
 }
 
-unsigned long alloc_boot_pages(unsigned long nr_pfns, unsigned long pfn_align)
+static unsigned long check_and_map_page(unsigned long pg, unsigned long nr_pfns)
 {
-    unsigned long pg, i;
+    unsigned long i;
 
-    for ( pg = 0; (pg + nr_pfns) < max_page; pg += pfn_align )
+    for ( i = 0; i < nr_pfns; i++ )
+        if ( allocated_in_map(pg + i) )
+             break;
+
+    if ( i == nr_pfns )
     {
-        for ( i = 0; i < nr_pfns; i++ )
-            if ( allocated_in_map(pg + i) )
-                 break;
+        map_alloc(pg, nr_pfns);
+        return pg;
+    }
 
-        if ( i == nr_pfns )
-        {
-            map_alloc(pg, nr_pfns);
+    return 0;
+}
+
+unsigned long alloc_boot_pages(unsigned long nr_pfns, unsigned long pfn_align)
+{
+    unsigned long pg;
+
+    for ( pg = 0; (pg + nr_pfns) < max_page; pg += pfn_align )
+        if (check_and_map_page(pg, nr_pfns))
             return pg;
-        }
-    }
 
     return 0;
 }
 
+unsigned long alloc_boot_pages_reverse(unsigned long nr_pfns, unsigned long pfn_align)
+{
+    unsigned long pg;
+
+    for ( pg = (max_page - nr_pfns); pg > 0; pg -= pfn_align )
+        if (check_and_map_page(pg, nr_pfns))
+            return pg;
+
+    return 0;
+}
 
 
 /*************************
diff -urp xen.orig/include/xen/mm.h xen/include/xen/mm.h
--- xen.orig/include/xen/mm.h   2006-10-16 16:07:18.000000000 -0400
+++ xen/include/xen/mm.h        2007-02-21 15:57:46.000000000 -0500
@@ -39,6 +39,7 @@ struct page_info;
 /* Boot-time allocator. Turns into generic allocator after bootstrap. */
 paddr_t init_boot_allocator(paddr_t bitmap_start);
 void init_boot_pages(paddr_t ps, paddr_t pe);
+unsigned long alloc_boot_pages_reverse(unsigned long nr_pfns, unsigned long pfn_align);
 unsigned long alloc_boot_pages(unsigned long nr_pfns, unsigned long pfn_align);
 void end_boot_allocator(void);
 

 

