WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 

xen-devel

Re: [Xen-devel] Need help with fixing the Xen waitqueue feature

To: Keir Fraser <keir.xen@xxxxxxxxx>
Subject: Re: [Xen-devel] Need help with fixing the Xen waitqueue feature
From: Olaf Hering <olaf@xxxxxxxxx>
Date: Fri, 11 Nov 2011 23:56:46 +0100
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
Delivery-date: Fri, 11 Nov 2011 14:58:52 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <CADF63BC.2460E%keir.xen@xxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <20111108222011.GA23969@xxxxxxxxx> <CADF63BC.2460E%keir.xen@xxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mutt/1.5.21.rev5535 (2011-07-01)
Keir,

just to dump my findings to the list:

On Tue, Nov 08, Keir Fraser wrote:

> Tbh I wonder anyway whether stale hypercall context would be likely to cause
> a silent machine reboot. Booting with max_cpus=1 would eliminate moving
> between CPUs as a cause of inconsistencies, or pin the guest under test.
> Another problem could be sleeping with locks held, but we do test for that
> (in debug builds at least) and I'd expect crash/hang rather than silent
> reboot. Another problem could be if the vcpu has its own state in an
> inconsistent/invalid state temporarily (e.g., its pagetable base pointers)
> which then is attempted to be restored during a waitqueue wakeup. That could
> certainly cause a reboot, but I don't know of an example where this might
> happen.

The crashes also happen with maxcpus=1 and a single guest cpu.
Today I added wait_event to ept_get_entry and this works.

But at some point the code path below is executed, and after that wake_up
the host hangs hard. I will trace it further next week; maybe the backtrace
gives a clue as to what the cause could be.

Also, the 3K stack size is still too small; this path uses 3096 bytes.
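
For reference, the arithmetic behind that number can be checked with a few lines of Python. This is only a sketch: it assumes "3K" means 3 KiB (3072 bytes) and that the "max stacksize" value in the log below is printed in hex; the variable names are made up for illustration.

```python
# Compare the reported stack usage against a 3 KiB budget.
STACK_BUDGET = 3 * 1024        # assumption: "3K" means 3 KiB = 3072 bytes
reported_hex = "c18"           # taken from the "max stacksize c18" log line

used = int(reported_hex, 16)   # parse the hex value from the log
overrun = used - STACK_BUDGET  # bytes needed beyond the budget

print(f"used={used} budget={STACK_BUDGET} overrun={overrun}")
```

Under these assumptions the path needs 24 bytes more than the budget allows.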

(XEN) prep 127a 30 0
(XEN) wake 127a 30
(XEN) prep 1cf71 30 0
(XEN) wake 1cf71 30
(XEN) prep 1cf72 30 0
(XEN) wake 1cf72 30
(XEN) prep 1cee9 30 0
(XEN) wake 1cee9 30
(XEN) prep 121a 30 0
(XEN) wake 121a 30

(The fields are 'gfn  (p2m_unshare << 4)  in_atomic'.)
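
To make longer traces easier to read, the prep/wake lines can be decoded and paired with a small script. This is a rough sketch, not part of the debug patch: it assumes all fields are hex, that the second field is to be shifted right by four bits as the note above suggests, and that a wake line carries no third field. The function names `parse` and `unmatched_preps` are hypothetical.

```python
import re

# Matches "(XEN) prep <gfn> <field> <in_atomic>" and "(XEN) wake <gfn> <field>".
LINE = re.compile(r"\(XEN\) (prep|wake) ([0-9a-f]+) ([0-9a-f]+)(?: ([0-9a-f]+))?$")

def parse(line):
    """Return a dict for a prep/wake line, or None for any other line."""
    m = LINE.match(line.strip())
    if m is None:
        return None
    kind, gfn, field, in_atomic = m.groups()
    return {
        "kind": kind,
        "gfn": int(gfn, 16),
        # Assumption: undo the "<< 4" mentioned in the note above.
        "shifted": int(field, 16) >> 4,
        "in_atomic": None if in_atomic is None else int(in_atomic, 16),
    }

def unmatched_preps(lines):
    """Return the gfns that were prepared but never woken, in log order."""
    pending = []
    for line in lines:
        rec = parse(line)
        if rec is None:
            continue
        if rec["kind"] == "prep":
            pending.append(rec["gfn"])
        elif rec["gfn"] in pending:
            pending.remove(rec["gfn"])
    return pending

sample = [
    "(XEN) prep 127a 30 0",
    "(XEN) wake 127a 30",
    "(XEN) prep 1ee61 20 0",
]
print(parse(sample[0]))
print([hex(g) for g in unmatched_preps(sample)])
```

In the sample, the last prep has no matching wake, so its gfn is the one reported as pending.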

(XEN) prep 1ee61 20 0
(XEN) max stacksize c18
(XEN) Xen WARN at wait.c:126
(XEN) ----[ Xen-4.2.24114-20111111.221356  x86_64  debug=y  Tainted:    C ]----
(XEN) CPU:    0
(XEN) RIP:    e008:[<ffff82c48012b85e>] prepare_to_wait+0x178/0x1b2
(XEN) RFLAGS: 0000000000010286   CONTEXT: hypervisor
(XEN) rax: 0000000000000000   rbx: ffff830201f76000   rcx: 0000000000000000
(XEN) rdx: ffff82c4802b7f18   rsi: 000000000000000a   rdi: ffff82c4802673f0
(XEN) rbp: ffff82c4802b73a8   rsp: ffff82c4802b7378   r8:  0000000000000000
(XEN) r9:  ffff82c480221da0   r10: 00000000fffffffa   r11: 0000000000000003
(XEN) r12: ffff82c4802b7f18   r13: ffff830201f76000   r14: ffff83003ea5c000
(XEN) r15: 000000000001ee61   cr0: 000000008005003b   cr4: 00000000000026f0
(XEN) cr3: 000000020336d000   cr2: 00007fa88ac42000
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
(XEN) Xen stack trace from rsp=ffff82c4802b7378:
(XEN)    0000000000000020 000000000001ee61 0000000000000002 ffff830201aa9e90
(XEN)    ffff830201aa9f60 0000000000000020 ffff82c4802b7428 ffff82c4801e02f9
(XEN)    ffff830000000002 0000000000000000 ffff82c4802b73f8 ffff82c4802b73f4
(XEN)    0000000000000000 ffff82c4802b74e0 ffff82c4802b74e4 0000000101aa9e90
(XEN)    000000ffffffffff ffff830201aa9e90 000000000001ee61 ffff82c4802b74e4
(XEN)    0000000000000002 0000000000000000 ffff82c4802b7468 ffff82c4801d810f
(XEN)    ffff82c4802b74e0 000000000001ee61 ffff830201aa9e90 ffff82c4802b75bc
(XEN)    00000000002167f5 ffff88001ee61900 ffff82c4802b7518 ffff82c480211b80
(XEN)    ffff8302167f5000 ffff82c4801c168c 0000000000000000 ffff83003ea5c000
(XEN)    ffff88001ee61900 0000000001805063 0000000001809063 000000001ee001e3
(XEN)    000000001ee61067 00000000002167f5 000000000022ee70 000000000022ed10
(XEN)    ffffffffffffffff 0000000a00000007 0000000000000004 ffff82c48025db80
(XEN)    ffff83003ea5c000 ffff82c4802b75bc ffff88001ee61900 ffff830201aa9e90
(XEN)    ffff82c4802b7528 ffff82c480211cb1 ffff82c4802b7568 ffff82c4801da97f
(XEN)    ffff82c4801be053 0000000000000008 ffff82c4802b7b58 ffff88001ee61900
(XEN)    0000000000000000 ffff82c4802b78b0 ffff82c4802b75f8 ffff82c4801aaec8
(XEN)    0000000000000003 ffff88001ee61900 ffff82c4802b78b0 ffff82c4802b7640
(XEN)    ffff83003ea5c000 00000000000000a0 0000000000000900 0000000000000008
(XEN)    00000003802b7650 0000000000000004 00000003802b7668 0000000000000000
(XEN)    ffff82c4802b7b58 0000000000000001 0000000000000003 ffff82c4802b78b0
(XEN) Xen call trace:
(XEN)    [<ffff82c48012b85e>] prepare_to_wait+0x178/0x1b2
(XEN)    [<ffff82c4801e02f9>] ept_get_entry+0x81/0xd8
(XEN)    [<ffff82c4801d810f>] gfn_to_mfn_type_p2m+0x55/0x114
(XEN)    [<ffff82c480211b80>] hap_p2m_ga_to_gfn_4_levels+0x1c4/0x2d6
(XEN)    [<ffff82c480211cb1>] hap_gva_to_gfn_4_levels+0x1f/0x2e
(XEN)    [<ffff82c4801da97f>] paging_gva_to_gfn+0xae/0xc4
(XEN)    [<ffff82c4801aaec8>] hvmemul_linear_to_phys+0xf1/0x25c
(XEN)    [<ffff82c4801ab762>] hvmemul_rep_movs+0xe8/0x31a
(XEN)    [<ffff82c48018de07>] x86_emulate+0x4e01/0x10fde
(XEN)    [<ffff82c4801aab3c>] hvm_emulate_one+0x12d/0x1c5
(XEN)    [<ffff82c4801b68a9>] handle_mmio+0x4e/0x1d8
(XEN)    [<ffff82c4801b3a1e>] hvm_hap_nested_page_fault+0x1e7/0x302
(XEN)    [<ffff82c4801d1ff6>] vmx_vmexit_handler+0x12cf/0x1594
(XEN)
(XEN) wake 1ee61 20



