Xen project Mailing List

Re: [Xen-devel] Bug on shadow page mode

To: Tim Deegan <tim@xxxxxxx>, Jan Beulich <JBeulich@xxxxxxxx>

From: "Hao, Xudong" <xudong.hao@xxxxxxxxx>

Date: Sun, 7 Apr 2013 09:25:39 +0000

Accept-language: en-US

Cc: "xen-devel \(xen-devel@xxxxxxxxxxxxx\)" <xen-devel@xxxxxxxxxxxxx>

Delivery-date: Sun, 07 Apr 2013 09:26:46 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

Thread-topic: [Xen-devel] Bug on shadow page mode

> -----Original Message-----
> From: Tim Deegan [mailto:tim@xxxxxxx]
> Sent: Thursday, April 04, 2013 6:35 PM
> To: Jan Beulich
> Cc: Hao, Xudong; xen-devel (xen-devel@xxxxxxxxxxxxx)
> Subject: Re: [Xen-devel] Bug on shadow page mode
>
> At 11:18 +0100 on 04 Apr (1365074288), Tim Deegan wrote:
> > Hi,
> >
> > At 12:50 +0100 on 02 Apr (1364907054), Jan Beulich wrote:
> > > > (XEN) Xen call trace:
> > > > (XEN)    [<ffff82c4c01e637f>] guest_walk_tables_4_levels+0x135/0x6a6
> > > > (XEN)    [<ffff82c4c020d8cc>] sh_page_fault__guest_4+0x505/0x2015
> > > > (XEN)    [<ffff82c4c01d2135>] vmx_vmexit_handler+0x86c/0x1748
> > > > (XEN)
> > > > (XEN) Pagetable walk from ffff82c406a00000:
> > > > (XEN)  L4[0x105] = 000000007f26e063 ffffffffffffffff
> > > > (XEN)  L3[0x110] = 000000005ce30063 ffffffffffffffff
> > > > (XEN)  L2[0x035] = 0000000014aab063 ffffffffffffffff
> > > > (XEN)  L1[0x000] = 0000000000000000 ffffffffffffffff
> > >
> > > Tim,
> > >
> > > I'm afraid this is something for you. From what I can tell, despite
> > > sh_walk_guest_tables() being called from sh_page_fault() without
> > > the paging lock held, there doesn't appear to be a way for this to
> > > race sh_update_cr3(). And with the way the latter updates
> > > guest_vtable, the only way for a page fault to happen upon use
> > > of that cached mapping would be between the call to
> > > sh_unmap_domain_page_global() and the immediately following
> > > one to sh_map_domain_page_global() (i.e. while the pointer is
> > > stale).
> >
> > Hmmm.  So the only way I can see that happening is if some foreign agent
> > resets the vcpu's state while it's actually running, which AFAICT
> > shouldn't happen.
>
> OTOH, looking at map_domain_page_global, there doesn't seem to be any
> locking preventing two CPUs from populating a page of global-map l1es at
> the same time.  So, here's a different patch to test -- it would be good
> to know if this patch by itself fixes the crash.
>

Holding lock during l1e populating fixes the crash on my side.

Thanks
-Xudong

> Tim.
>
> diff --git a/xen/arch/x86/domain_page.c b/xen/arch/x86/domain_page.c
> index 7421e03..efda6af 100644
> --- a/xen/arch/x86/domain_page.c
> +++ b/xen/arch/x86/domain_page.c
> @@ -354,9 +354,10 @@ void *map_domain_page_global(unsigned long mfn)
>      set_bit(idx, inuse);
>      inuse_cursor = idx + 1;
>
> +    pl1e = virt_to_xen_l1e(va);
> +
>      spin_unlock(&globalmap_lock);
>
> -    pl1e = virt_to_xen_l1e(va);
>      if ( !pl1e )
>          return NULL;
>      l1e_write(pl1e, l1e_from_pfn(mfn, __PAGE_HYPERVISOR));

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
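
For readers following the thread, the locking pattern the patch relies on can be sketched outside Xen roughly as below. This is only an illustration, not the Xen code: it assumes, as Tim's description suggests, that the l1e lookup may populate a missing table on first use, so the lookup has to happen while the lock is still held. All names here (map_global, slot_to_entry, struct l1_table, map_lock, tables_allocated) are hypothetical, and a pthread mutex stands in for globalmap_lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

#define SLOTS_PER_TABLE 512
#define NUM_TABLES      4

/* Hypothetical stand-in for one page of l1 entries. */
struct l1_table {
    unsigned long entry[SLOTS_PER_TABLE];
};

static struct l1_table *tables[NUM_TABLES];  /* populated lazily */
static pthread_mutex_t map_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned int next_slot;               /* next unused slot index */
static unsigned int tables_allocated;        /* for inspection only */

/*
 * Return a pointer to the entry for 'slot', allocating the containing
 * table on first use. The caller must hold map_lock: two threads that
 * both saw tables[t] == NULL without the lock could each allocate a
 * table, and one of them would end up writing into a table that is
 * never installed.
 */
static unsigned long *slot_to_entry(unsigned int slot)
{
    unsigned int t = slot / SLOTS_PER_TABLE;

    if (t >= NUM_TABLES)
        return NULL;
    if (tables[t] == NULL) {
        tables[t] = calloc(1, sizeof(*tables[t]));
        if (tables[t] == NULL)
            return NULL;
        tables_allocated++;
    }
    return &tables[t]->entry[slot % SLOTS_PER_TABLE];
}

/* Claim a slot and store 'value' in it; returns NULL on failure. */
unsigned long *map_global(unsigned long value)
{
    unsigned long *entry;

    pthread_mutex_lock(&map_lock);
    /* Lookup (and possible allocation) happens before the unlock. */
    entry = slot_to_entry(next_slot);
    if (entry != NULL)
        next_slot++;
    pthread_mutex_unlock(&map_lock);

    if (entry == NULL)
        return NULL;

    /* The slot is owned exclusively by this caller, so the write
     * itself does not need the lock. */
    *entry = value;
    return entry;
}
```

Only the lookup moves inside the critical section; the final write stays outside it, as in the patch, because each caller writes to a slot no one else has been handed.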
