
To: Jeremy Fitzhardinge <jeremy@xxxxxxxx>
Subject: Re: [Xen-devel] Re: [PATCH] xen: correctly rebuild mfn list list after migration.
From: Ian Campbell <Ian.Campbell@xxxxxxxxxxxxx>
Date: Thu, 21 Oct 2010 11:10:41 +0100
Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Thu, 21 Oct 2010 03:11:52 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <4CB74B9F.2060608@xxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Organization: Citrix Systems, Inc.
References: <1286878488-9027-1-git-send-email-ian.campbell@xxxxxxxxxx> <4CB650CE.7090701@xxxxxxxx> <1287038979.5663.180.camel@xxxxxxxxxxxxxxxxxxxxx> <1287046280.2003.5564.camel@xxxxxxxxxxxxxxxxxxxxxx> <4CB74B9F.2060608@xxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
On Thu, 2010-10-14 at 19:27 +0100, Jeremy Fitzhardinge wrote:
> On 10/14/2010 01:51 AM, Ian Campbell wrote:
> > On Thu, 2010-10-14 at 07:49 +0100, Ian Campbell wrote:
> >> On Thu, 2010-10-14 at 01:37 +0100, Jeremy Fitzhardinge wrote:
> >>>
> >>> I'm getting this triggering at boot:
> >>>
> >>> PM: Adding info for No Bus:xen!gntdev
> >>> ------------[ cut here ]------------
> >>> kernel BUG at /home/jeremy/git/linux/arch/x86/xen/mmu.c:480! 
> >> Probably a consequence of the bogus attempt to skip over completely
> >> empty mid levels which you pointed out in your next mail.
> >>
> >> I guess I should test 64 bit guests and not just PAE ones and I guess
> >> pre-ballooning is likely to interact as well.
> > I'm only able to reproduce this by booting ballooned and then ballooning
> > up, are you doing that or were you seeing it just by booting?
> >
> > What are your memory and maxmem settings?
> 
> memory=512, no maxmem.

The BUG_ON I added to alloc_p2m was wrong (it converted an mfn to an
address in the direct map and then compared it with a virtual address
in the kernel map, IIRC). Here's an updated patch which fixes all your
previous comments, this issue, and a few other bits and bobs:

* s/p2m_mid_mfn_p/p2m_top_mfn_p/g
* Fix BUG_ON in alloc_p2m
* Skip correct number of pfns when mid == p2m_mid_missing
* Use correct value p2m_top_mfn[topidx] when mid == p2m_mid_missing
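
A minimal sketch of the alloc_p2m fix, for illustration (the "wrong"
form is reconstructed from the description above rather than quoted
from the old code; only the fixed form appears in the patch):

    /* mid_mfn comes from extend_brk(), i.e. it is a kernel image
     * address, while mfn_to_virt() on the MFN stored in *top_mfn_p
     * returns the direct-map alias of the same page. On 64-bit those
     * two mappings are distinct, so the addresses never compare
     * equal:
     */
    BUG_ON(mfn_to_virt(*top_mfn_p) != mid_mfn);    /* wrong */

    /* Comparing in MFN space instead is alias-free: */
    BUG_ON(virt_to_mfn(mid_mfn) != *top_mfn_p);    /* fixed */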

Ian.

8<--------
From 0a771251cf7e31ff056e24feb3507b236be2a827 Mon Sep 17 00:00:00 2001
From: Ian Campbell <ian.campbell@xxxxxxxxxx>
Date: Thu, 21 Oct 2010 11:00:46 +0100
Subject: [PATCH] xen: correctly rebuild mfn list list after migration.

Otherwise the second migration attempt fails because the mfn_list_list
still refers to all the old mfns.

We need to update the entries in both p2m_top_mfn and the mid_mfn
pages which p2m_top_mfn refers to.

In order to do this we need to keep track of the virtual addresses
mapping the p2m_mid_mfn pages: we cannot rely on
mfn_to_virt(p2m_top_mfn[idx]), because p2m_top_mfn[idx] will still
contain the old MFN after a migration, and that MFN may now belong to
another domain and hence have a different mapping in the m2p.

Therefore add and maintain a third top level page, p2m_top_mfn_p[],
which tracks the virtual addresses of the mfns contained in
p2m_top_mfn[].

We also need to update the content of the p2m_mid_missing_mfn page on
resume to refer to the page's new mfn.

p2m_missing does not need updating since the migration process takes
care of the leaf p2m pages for us.
---
 arch/x86/xen/mmu.c |   50 +++++++++++++++++++++++++++++++++++++-------------
 1 files changed, 37 insertions(+), 13 deletions(-)
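
A minimal sketch of the idea, using the identifiers from the diff below
(illustration only, not part of the patch itself):

    /* Old lookup: stale across a migration, because
     * p2m_top_mfn[topidx] still holds the pre-migration MFN, whose
     * m2p entry may now belong to another domain:
     */
    mid_mfn_p = mfn_to_virt(p2m_top_mfn[topidx]);

    /* New lookup: the virtual address recorded when the mid_mfn page
     * was allocated is unaffected by migration:
     */
    mid_mfn_p = p2m_top_mfn_p[topidx];

    /* Skipping a wholly missing mid level: together with the loop's
     * own pfn += P2M_PER_PAGE, this advances pfn by a full
     * P2M_MID_PER_PAGE * P2M_PER_PAGE pages:
     */
    pfn += (P2M_MID_PER_PAGE - 1) * P2M_PER_PAGE;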

diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index 15bbccd..ebb74ec 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -186,6 +186,8 @@ DEFINE_PER_CPU(unsigned long, xen_current_cr3);      /* actual vcpu cr3 */
  *    / \      / \         /           /
  *  p2m p2m p2m p2m p2m p2m p2m ...
  *
+ * The p2m_mid_mfn pages are mapped by p2m_top_mfn_p.
+ *
  * The p2m_top and p2m_top_mfn levels are limited to 1 page, so the
  * maximum representable pseudo-physical address space is:
  *  P2M_TOP_PER_PAGE * P2M_MID_PER_PAGE * P2M_PER_PAGE pages
@@ -210,6 +212,7 @@ static RESERVE_BRK_ARRAY(unsigned long, p2m_mid_missing_mfn, P2M_MID_PER_PAGE);
 
 static RESERVE_BRK_ARRAY(unsigned long **, p2m_top, P2M_TOP_PER_PAGE);
 static RESERVE_BRK_ARRAY(unsigned long, p2m_top_mfn, P2M_TOP_PER_PAGE);
+static RESERVE_BRK_ARRAY(unsigned long *, p2m_top_mfn_p, P2M_TOP_PER_PAGE);
 
 RESERVE_BRK(p2m_mid, PAGE_SIZE * (MAX_DOMAIN_PAGES / (P2M_PER_PAGE * P2M_MID_PER_PAGE)));
 RESERVE_BRK(p2m_mid_mfn, PAGE_SIZE * (MAX_DOMAIN_PAGES / (P2M_PER_PAGE * P2M_MID_PER_PAGE)));
@@ -246,6 +249,14 @@ static void p2m_top_mfn_init(unsigned long *top)
                top[i] = virt_to_mfn(p2m_mid_missing_mfn);
 }
 
+static void p2m_top_mfn_p_init(unsigned long **top)
+{
+       unsigned i;
+
+       for (i = 0; i < P2M_TOP_PER_PAGE; i++)
+               top[i] = p2m_mid_missing_mfn;
+}
+
 static void p2m_mid_init(unsigned long **mid)
 {
        unsigned i;
@@ -302,33 +313,43 @@ EXPORT_SYMBOL(create_lookup_pte_addr);
  */
 void xen_build_mfn_list_list(void)
 {
-       unsigned pfn;
+       unsigned long pfn;
 
        /* Pre-initialize p2m_top_mfn to be completely missing */
        if (p2m_top_mfn == NULL) {
                p2m_mid_missing_mfn = extend_brk(PAGE_SIZE, PAGE_SIZE);
                p2m_mid_mfn_init(p2m_mid_missing_mfn);
 
+               p2m_top_mfn_p = extend_brk(PAGE_SIZE, PAGE_SIZE);
+               p2m_top_mfn_p_init(p2m_top_mfn_p);
+
                p2m_top_mfn = extend_brk(PAGE_SIZE, PAGE_SIZE);
                p2m_top_mfn_init(p2m_top_mfn);
+       } else {
+               /* Reinitialise, mfn's all change after migration */
+               p2m_mid_mfn_init(p2m_mid_missing_mfn);
        }
 
        for (pfn = 0; pfn < xen_max_p2m_pfn; pfn += P2M_PER_PAGE) {
                unsigned topidx = p2m_top_index(pfn);
                unsigned mididx = p2m_mid_index(pfn);
                unsigned long **mid;
-               unsigned long mid_mfn;
                unsigned long *mid_mfn_p;
 
                mid = p2m_top[topidx];
+               mid_mfn_p = p2m_top_mfn_p[topidx];
 
                /* Don't bother allocating any mfn mid levels if
-                  they're just missing */
-               if (mid[mididx] == p2m_missing)
+                * they're just missing, just update the stored mfn,
+                * since all could have changed over a migrate.
+                */
+               if (mid == p2m_mid_missing) {
+                       BUG_ON(mididx);
+                       BUG_ON(mid_mfn_p != p2m_mid_missing_mfn);
+                       p2m_top_mfn[topidx] = virt_to_mfn(p2m_mid_missing_mfn);
+                       pfn += (P2M_MID_PER_PAGE - 1) * P2M_PER_PAGE;
                        continue;
-
-               mid_mfn = p2m_top_mfn[topidx];
-               mid_mfn_p = mfn_to_virt(mid_mfn);
+               }
 
                if (mid_mfn_p == p2m_mid_missing_mfn) {
                        /*
@@ -340,11 +361,10 @@ void xen_build_mfn_list_list(void)
                        mid_mfn_p = extend_brk(PAGE_SIZE, PAGE_SIZE);
                        p2m_mid_mfn_init(mid_mfn_p);
 
-                       mid_mfn = virt_to_mfn(mid_mfn_p);
-                       
-                       p2m_top_mfn[topidx] = mid_mfn;
+                       p2m_top_mfn_p[topidx] = mid_mfn_p;
                }
 
+               p2m_top_mfn[topidx] = virt_to_mfn(mid_mfn_p);
                mid_mfn_p[mididx] = virt_to_mfn(mid[mididx]);
        }
 }
@@ -363,7 +383,7 @@ void __init xen_build_dynamic_phys_to_machine(void)
 {
        unsigned long *mfn_list = (unsigned long *)xen_start_info->mfn_list;
        unsigned long max_pfn = min(MAX_DOMAIN_PAGES, xen_start_info->nr_pages);
-       unsigned pfn;
+       unsigned long pfn;
 
        xen_max_p2m_pfn = max_pfn;
 
@@ -453,7 +473,9 @@ static bool alloc_p2m(unsigned long pfn)
        }
 
        top_mfn_p = &p2m_top_mfn[topidx];
-       mid_mfn = mfn_to_virt(*top_mfn_p);
+       mid_mfn = p2m_top_mfn_p[topidx];
+
+       BUG_ON(virt_to_mfn(mid_mfn) != *top_mfn_p);
 
        if (mid_mfn == p2m_mid_missing_mfn) {
                /* Separately check the mid mfn level */
@@ -465,11 +487,13 @@ static bool alloc_p2m(unsigned long pfn)
                        return false;
 
                p2m_mid_mfn_init(mid_mfn);
-               
+
                missing_mfn = virt_to_mfn(p2m_mid_missing_mfn);
                mid_mfn_mfn = virt_to_mfn(mid_mfn);
                if (cmpxchg(top_mfn_p, missing_mfn, mid_mfn_mfn) != missing_mfn)
                        free_p2m_page(mid_mfn);
+               else
+                       p2m_top_mfn_p[topidx] = mid_mfn;
        }
 
        if (p2m_top[topidx][mididx] == p2m_missing) {
-- 
1.5.6.5
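
As an aside on the sizing in the comment the first hunk touches, a
worked example assuming 64-bit, a PAGE_SIZE of 4096, and 8-byte
entries (the arithmetic is illustrative, not from the patch):

    /* Each level fits one page of 8-byte entries:
     *   P2M_PER_PAGE = P2M_MID_PER_PAGE = P2M_TOP_PER_PAGE = 512
     * Maximum representable pseudo-physical space:
     *   512 * 512 * 512 pages = 134217728 pages
     *   134217728 * 4 KiB     = 512 GiB
     * so the one extra page added for p2m_top_mfn_p[] is cheap
     * relative to the address space it indexes.
     */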




_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel