
[Xen-devel] RE: [PATCH] mem_sharing: fix race condition of nominate and unshare



Hi George:
 
       I am working on Xen mem_sharing, and I think the bug below is related to PoD
(populate-on-demand). Tests show that when PoD is enabled the bug is hit easily; when it is
disabled, the bug does not occur.
 
As I understand it, when a domU starts with PoD it gets its memory from the PoD cache, and in
some situations PoD will scan for zero pages to reclaim (linking the page back into the PoD
cache page list). From the page_info definition, the list field and the shr_handle occupy the
same position. I think that when reclaiming a page, PoD does not check the page type, so a
shared page can still be put into the PoD cache, and its handle is then overwritten.
      
      So maybe we need to check the page type before putting a page into the PoD cache. What is your opinion?
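 
      A rough sketch of the kind of check I mean (only an illustration, assuming the zero-page
reclaim path in p2m_pod_zero_check() and the PGT_shared_page type used by mem_sharing; the
exact placement would have to follow the real PoD code):

    /* candidate page that PoD is about to reclaim as a zero page */
    struct page_info *pg = mfn_to_page(mfn);

    /* In struct page_info the sharing handle overlays the list field,  */
    /* so putting a shared page back into the PoD cache would clobber   */
    /* its shr_handle.  Skip pages that carry the shared page type.     */
    if ( (pg->u.inuse.type_info & PGT_type_mask) == PGT_shared_page )
        continue;   /* leave shared pages out of the PoD cache */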
      thanks.
 
>--------------------------------------------------------------------------------
>From: tinnycloud@xxxxxxxxxxx
>To: juihaochiang@xxxxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxx
>CC: tim.deegan@xxxxxxxxxx
>Subject: RE: [PATCH] mem_sharing: fix race condition of nominate and unshare
>Date: Tue, 18 Jan 2011 20:05:16 +0800
>
>Hi:
>
> It is later found that this is caused by the patch code below, and I am using blktap2.
>The handle returned from here later becomes ch in mem_sharing_share_pages(), and
>mem_sharing_share_pages() then ends up with ch == sh, which causes the problem.
>
>+    /* Return the handle if the page is already shared */
>+    page = mfn_to_page(mfn);
>+    if (p2m_is_shared(p2mt)) {
>+        *phandle = page->shr_handle;
>+        ret = 0;
>+        goto out;
>+    }
>+
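>
>A minimal guard along these lines (only a sketch; it assumes the handle-based
>mem_sharing_share_pages(sh, ch) interface in Xen 4.0 mem_sharing.c) would refuse the
>self-share case instead of tearing down the page's only hash entry:
>
>    /* sh and ch refer to the same shared page: there is nothing to merge, */
>    /* and "sharing" the page with itself would delete its only handle.    */
>    if ( sh == ch )
>        return 0;
>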
>
>But after I removed that code, the tests still failed, and this handle's value does not make sense.
>
>
>(XEN) ===>total handles 17834 total gfns 255853
>(XEN) handle 13856642536914634
>(XEN) Debug for domain=1, gfn=19fed, Debug page: MFN=349c0a is ci=8000000000000008, ti=8400000000000007, owner_id=32755
>(XEN) ----[ Xen-4.0.0  x86_64  debug=n  Not tainted ]----
>(XEN) CPU:    15
>(XEN) RIP:    e008:[<ffff82c4801bff4b>] mem_sharing_unshare_page+0x19b/0x720
>(XEN) RFLAGS: 0000000000010246   CONTEXT: hypervisor
>(XEN) rax: 0000000000000000   rbx: ffff83063fc67f28  ;  rcx: 0000000000000092
>(XEN) rdx: 000000000000000a   rsi: 000000000000000a   rdi: ffff82c48021e9c4
>(XEN) rbp: ffff830440000000   rsp: ffff83063fc67c48   r8:  0000000000000001
>(XEN) r9:  0000000000000000   r10: 00000000fffffff8   r11: 0000000000000005
>(XEN) r12: 0000000000019fed   r13: 0000000000000000   r14: 0000000000000000
>(XEN) r15: ffff82f606938140   cr0: 000000008005003b   cr4: 00000000000026f0
>(XEN) cr3: 000000055513c000   cr2: 0000000000000018
>(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
>(XEN) Xen stack trace from rsp=ffff83063fc67c48:
>(XEN)    02c5f6c8b70fed66 39ef64058b487674 ffff82c4801a6082 0000000000000000
>(XEN)    00313a8b00313eca 0000000000000001 0000000000000009 ffff830440000000
>(XEN)    ffff83063fc67cb8 ffff82c4801df6f9 0000000000000040 ffff83063fc67d04
>(XEN)    0000000000019fed 0000000d000001ed ffff83055458d000 ffff83063fc67f28
>(XEN)    0000000000019fed 0000000000349c0a 0000000000000030 ffff83063fc67f28
>(XEN)    0000000000000030 ffff82c48019baa6 ffff82c4802519c0 0000000d8016838e
>(XEN)    0000000000000000 00000000000001aa ffff8300bf554000 ffff82c4801b3864
>(XEN)    ffff830440000348 ffff8300bf554000 ffff8300bf5557f0 ffff8300bf5557e8
>(XEN)    00000032027b81f2 ffff82c48026f080 ffff82c4801a9337 ffff8300bf448000
>(XEN)    ffff8300bf554000 ffff830000000000 0000000019fed000 ffff8300bf2f2000
>(XEN)    ffff82c48019985d 0000000000000080 ffff8300bf554000 0000000000019fed
>(XEN)    ffff82c4801b08ba 000000000001e000 ffff82c48014931f ffff8305570c6d50
>(XEN)    ffff82c480251080 00000032027b81f2 ffff8305570c6d50 ffff83052f3e2200
>(XEN)    0000000f027b7de0 ffff82c48011e07a 000000000000000f ffff82c48026f0a0
>(XEN)    0000000000000082 0000000000000000 0000000000000000 0000000000009e44
>(XEN)    ffff8300bf554000 ffff8300bf2f2000 ffff82c48011e07a 000000000000000f
>(XEN)    ffff8300bf555760 0000000000000292 ffff82c48011afca 00000032028a8fc0
>(XEN)    0000000000000292 ffff82c4801a93c3 00000000000000ef ffff8300bf554000
>(XEN)    ffff8300bf554000 ffff8300bf5557e8 ffff82c4801a6082 ffff8300bf554000
>(XEN)    0000000000000000 ffff82c4801a0cc8 ffff8300bf554000 ffff8300bf554000
>(XEN) Xen call trace:
>(XEN)    [<ffff82c4801bff4b>] mem_sharing_unshare_page+0x19b/0x720
>(XEN)    [<ffff82c4801a6082>] vlapic_has_pending_irq+0x42/0x70
>(XEN)    [<ffff82c4801df6f9>] ept_get_entry+0xa9/0x1c0
>(XEN)    [<ffff82c48019baa6>] hvm_hap_nested_page_fault+0xd6/0x190
>(XEN)    [<ffff82c4801b3864>] vmx_vmexit_handler+0x304/0x1a90
>(XEN)    [<ffff82c4801a9337>] pt_restore_timer+0x57/0xb0
>(XEN)    [<ffff82c48019985d>] hvm_do_resume+0x1d/0x130
>(XEN)    [<ffff82c4801b08ba>] vmx_do_resume+0x11a/0x1c0
>(XEN)    [<ffff82c48014931f>] context_switch+0x76f/0xf00
>(XEN)    [<ffff82c48011e07a>] add_entry+0x3a/0xb0
>(XEN)    [<ffff82c48011e07a>] add_entry+0x3a/0xb0
>(XEN)    [<ffff82c48011afca>] schedule+0x1ea/0x500
>(XEN)    [<ffff82c4801a93c3>] pt_update_irq+0x33/0x1e0
>(XEN)    [<ffff82c4801a6082>] vlapic_has_pending_irq+0x42/0x70
>(XEN)    [<ffff82c4801a0cc8>] hvm_vcpu_has_pending_irq+0x88/0xa0
>(XEN)    [<ffff82c4801b267b>] vmx_vmenter_helper+0x5b/0x150
>(XEN)    [<ffff82c4801adaa3>] vmx_asm_do_vmentry+0x0/0xdd
>(XEN)   
>(XEN) Pagetable walk from 0000000000000018:
>(XEN)  L4[0x000] = 0000000000000000 ffffffffffffffff
>(XEN)
>(XEN) ****************************************
>(XEN) Panic on CPU 15:
>(XEN) FATAL PAGE FAULT
>(XEN) [error_code=0000]
>(XEN) Faulting linear address: 0000000000000018
>(XEN) ****************************************
>(XEN)
>(XEN) Manual reset required ('noreboot' specified)
>
>
>
>
> ---------------------------------------------------------------------------------------------------
>>From: tinnycloud@xxxxxxxxxxx
>>To: juihaochiang@xxxxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxx
>>CC: tim.deegan@xxxxxxxxxx
>>Subject: RE: [PATCH] mem_sharing: fix race condition of nominate and unshare
>>Date: Tue, 18 Jan 2011 17:42:32 +0800
>
>>Hi Tim & Jui-Hao:
>
>>     When I use a Linux HVM guest instead of a Windows HVM guest, more bugs show up.
>
>>      I only started one VM, and when I destroyed it, Xen crashed in mem_sharing_unshare_page():
>>at line 709, hash_entry is NULL. Later I found that the handle had already been removed in
>>mem_sharing_share_pages(); please refer to the logs below.
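>>
>>A defensive check along these lines (a sketch only, assuming the mem_sharing_hash_lookup()
>>helper in mem_sharing.c) would at least turn the NULL dereference into a clean failure:
>>
>>    hash_entry = mem_sharing_hash_lookup(handle);
>>    if ( hash_entry == NULL )
>>    {
>>        /* The handle was already removed, e.g. by mem_sharing_share_pages(); */
>>        /* fail cleanly (after dropping any locks taken above) rather than    */
>>        /* dereferencing a NULL entry.                                         */
>>        return -ESRCH;
>>    }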
>

 

