   
 
 

[Xen-devel] [PATCH] x86 shadow: fix race when domain is dying

To: xen-devel@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-devel] [PATCH] x86 shadow: fix race when domain is dying
From: Kouya Shimura <kouya@xxxxxxxxxxxxxx>
Date: Thu, 26 Nov 2009 17:17:46 +0900
There are cases where shadow_write_p2m_entry() is called after the
domain has been killed, which causes Xen to crash. Two ways to hit this:

- A race between xc_map_foreign_batch() from qemu-dm and the "xm destroy" command.
The actual console log:
(XEN) Xen call trace:
(XEN)    [<ffff82c4801c012e>] hash_foreach+0x87/0x17e
(XEN)    [<ffff82c4801c0362>] sh_remove_all_mappings+0x13d/0x22f
(XEN)    [<ffff82c4801c913d>] shadow_write_p2m_entry+0x14f/0x390
(XEN)    [<ffff82c4801bc73a>] p2m_set_entry+0x23f/0x472
(XEN)    [<ffff82c4801ba213>] set_p2m_entry+0x7d/0xb1
(XEN)    [<ffff82c4801ba3c9>] p2m_remove_page+0x158/0x167
(XEN)    [<ffff82c4801ba5d8>] guest_physmap_remove_page+0xd9/0x13c
(XEN)    [<ffff82c48015e0e4>] arch_memory_op+0x608/0xb3c
(XEN)    [<ffff82c4801138f3>] do_memory_op+0x1944/0x19a1
(XEN)    [<ffff82c480113b98>] do_multicall+0x248/0x390
(XEN)    [<ffff82c4801ec1bf>] syscall_enter+0xef/0x149

- The hypervisor calls domain_crash() when populate-on-demand (PoD) fails.
The actual console log:
(XEN) p2m_pod_demand_populate: Out of populate-on-demand memory! tot_pages 65751 pod_entries 197408
(XEN) domain_crash called from p2m.c:1062
(XEN) Domain 1 reported crashed by domain 0 on cpu#3:
(XEN) ----[ Xen-3.5-unstable  x86_64  debug=y  Tainted:    C ]----
...[snip]
(XEN) Xen call trace:
(XEN)    [<ffff82c4801c152e>] hash_foreach+0x87/0x17e
(XEN)    [<ffff82c4801c1762>] sh_remove_all_mappings+0x13d/0x22f
(XEN)    [<ffff82c4801ca491>] shadow_write_p2m_entry+0x14f/0x390
(XEN)    [<ffff82c4801bdaf6>] p2m_set_entry+0x23f/0x472
(XEN)    [<ffff82c4801bb5b3>] set_p2m_entry+0x7d/0xb1
(XEN)    [<ffff82c4801bdf9f>] p2m_pod_zero_check+0x276/0x3d8
(XEN)    [<ffff82c4801be71f>] p2m_pod_demand_populate+0x61e/0x8dc
(XEN)    [<ffff82c4801beb5c>] p2m_pod_check_and_populate+0x17f/0x1fa
(XEN)    [<ffff82c4801bf228>] p2m_gfn_to_mfn+0x34a/0x3f3
(XEN)    [<ffff82c480166528>] mod_l1_entry+0x1aa/0x7ee
(XEN)    [<ffff82c48016774f>] do_mmu_update+0x56a/0x144b
(XEN)    [<ffff82c4801ed1bf>] syscall_enter+0xef/0x149
(XEN)
(XEN) Pagetable walk from 0000000000000000:
(XEN)  L4[0x000] = 000000011e7c4067 00000000000d9933
(XEN)  L3[0x000] = 000000011e7c3067 00000000000d9934
(XEN)  L2[0x000] = 0000000000000000 ffffffffffffffff
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 3:
(XEN) FATAL PAGE FAULT
(XEN) [error_code=0000]
(XEN) Faulting linear address: 0000000000000000
(XEN) ****************************************
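
For readers unfamiliar with the pattern, the guard this patch adds can be sketched outside Xen roughly as follows. This is only a toy model under stated assumptions: struct toy_domain, the toy_* functions, NBUCKETS and the counter used as a lock are hypothetical stand-ins, not Xen's real data structures or locking. It assumes that teardown leaves the hash table pointer NULL, which is what the fault at linear address 0 in hash_foreach() and the new ASSERT suggest.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define NBUCKETS 4

/* Hypothetical stand-in for struct domain; not Xen's real layout. */
struct toy_domain {
    bool is_dying;    /* set once teardown has started */
    int lock_depth;   /* counter standing in for the shadow lock */
    int *hash_table;  /* freed and set to NULL during teardown */
};

void toy_lock(struct toy_domain *d)   { d->lock_depth++; }
void toy_unlock(struct toy_domain *d) { d->lock_depth--; }

/* Walk every bucket and return how many were visited.
 * The assert mirrors the ASSERT on hash_table added by the patch. */
int toy_hash_foreach(struct toy_domain *d)
{
    int visited = 0;

    assert(d->hash_table != NULL);
    for (size_t i = 0; i < NBUCKETS; i++) {
        d->hash_table[i] = 0;
        visited++;
    }
    return visited;
}

/* Return the number of buckets visited, or -1 if the update was
 * skipped because the domain is already dying. */
int toy_write_entry(struct toy_domain *d)
{
    int visited;

    toy_lock(d);
    if (d->is_dying) {
        /* Same shape as the patch: drop the lock and bail out
         * before touching state that teardown may have freed. */
        toy_unlock(d);
        return -1;
    }
    visited = toy_hash_foreach(d);
    toy_unlock(d);
    return visited;
}

/* Mark the domain as dying and free the table under the lock. */
void toy_teardown(struct toy_domain *d)
{
    toy_lock(d);
    d->is_dying = true;
    free(d->hash_table);
    d->hash_table = NULL;
    toy_unlock(d);
}
```

In this model, calling toy_write_entry() on a live domain walks all NBUCKETS buckets, while calling it after toy_teardown() returns -1 without reaching toy_hash_foreach(). The check is made after taking the lock so that it cannot interleave with the teardown path in the model.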


Signed-off-by: Kouya Shimura <kouya@xxxxxxxxxxxxxx>

diff -r c0e32941ee69 xen/arch/x86/mm/p2m.c
--- a/xen/arch/x86/mm/p2m.c     Wed Nov 25 14:19:50 2009 +0000
+++ b/xen/arch/x86/mm/p2m.c     Thu Nov 26 15:56:01 2009 +0900
@@ -1221,6 +1221,12 @@ p2m_gfn_to_mfn(struct domain *d, unsigne
 
     ASSERT(paging_mode_translate(d));
 
+    if ( unlikely(d->is_dying) )
+    {
+        *t = p2m_invalid;
+        return _mfn(INVALID_MFN);
+    }
+
     /* XXX This is for compatibility with the old model, where anything not 
      * XXX marked as RAM was considered to be emulated MMIO space.
      * XXX Once we start explicitly registering MMIO regions in the p2m 
diff -r c0e32941ee69 xen/arch/x86/mm/shadow/common.c
--- a/xen/arch/x86/mm/shadow/common.c   Wed Nov 25 14:19:50 2009 +0000
+++ b/xen/arch/x86/mm/shadow/common.c   Thu Nov 26 15:56:01 2009 +0900
@@ -2171,6 +2171,7 @@ static void hash_foreach(struct vcpu *v,
 
     /* Say we're here, to stop hash-lookups reordering the chains */
     ASSERT(shadow_locked_by_me(d));
+    ASSERT(d->arch.paging.shadow.hash_table);
     ASSERT(d->arch.paging.shadow.hash_walking == 0);
     d->arch.paging.shadow.hash_walking = 1;
 
@@ -3449,6 +3450,12 @@ shadow_write_p2m_entry(struct vcpu *v, u
     
     shadow_lock(d);
 
+    if ( unlikely(d->is_dying) )
+    {
+        shadow_unlock(d);
+        return;
+    }
+
     /* If we're removing an MFN from the p2m, remove it from the shadows too */
     if ( level == 1 )
     {