
Re: [Xen-devel] live migration fails (assert in shadow_hash_delete)


  • To: Tim Deegan <Tim.Deegan@xxxxxxxxxx>
  • From: Devdutt Patnaik <xendevid@xxxxxxxxx>
  • Date: Tue, 23 Feb 2010 02:51:09 -0800
  • Cc: Ashish Bijlani <ashish.bijlani@xxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Tue, 23 Feb 2010 02:51:43 -0800
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

Tim,

It's just the stock xen-unstable (Xen 4.0-rc3), unmodified. Has this feature been evaluated/tested on the latest xen-unstable?

Alright, we will give Xen 3.4.x a shot.

Thanks,
Devdutt.

On Tue, Feb 23, 2010 at 2:46 AM, Tim Deegan <Tim.Deegan@xxxxxxxxxx> wrote:
At 10:19 +0000 on 23 Feb (1266920353), Devdutt Patnaik wrote:
> We just used the xen-unstable version from 2 weeks ago, and haven't really modified it.
> We tried this with 64-bit versions of 2.6.31.6 and 2.6.32.8 DomU kernels.

OK.  This really needs to be fixed for the 4.0 release.  Keir, have we
had any other testing on 64-bit PV live migrations?

By "haven't really modified it" do you mean you have modified it or not?

> Any suggestions on what might be a better bet in terms of Xen, Dom0 and DomU kernel versions?
> We wish to use 64-bit PV VMs for our experiments.

Xen 3.4.x should be more stable if you need to carry on immediately.

Cheers,

Tim.

> We have only been able to do a successful migration 3 times, out of maybe 30-odd attempts.
>
> Thanks,
> Devdutt.
>
> On Tue, Feb 23, 2010 at 1:25 AM, Tim Deegan <Tim.Deegan@xxxxxxxxxx> wrote:
> Hi,
>
> At 08:57 +0000 on 23 Feb (1266915448), Ashish Bijlani wrote:
> > I'm working on a project that requires live migration of a 64-bit PV
> > VM (on a 64-bit platform). "xm save"  and "xm restore" work fine.
> > However, live migration fails with the following err msg:
>
> Oh dear.  I take it this is on the sending machine. What version of Xen
> are you using?
>
> Does it happen every time or only intermittently?
>
> Does it happen only with one particular guest or all 64bit guests?
>
> Have you made any modifications to Xen?
>
> It looks like the shadow pagetable code has got very confused - a page
> is marked as shadowed but isn't in the hash-table of shadowed pages.
>
> Cheers,
>
> Tim.
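
To illustrate the diagnosis above, here is a minimal sketch in C of the invariant that appears to be violated: every page marked as shadowed should be findable in the hash table of shadowed pages, so deleting it should always find an entry. This is not Xen's code. The names `shadow_table`, `shadow_page`, `table_insert` and `table_delete` are hypothetical, and the sketch returns `false` where the hypervisor's debug build asserts (per the log, inside `shadow_hash_delete`).

```c
#include <stdbool.h>
#include <stddef.h>

#define BUCKETS 8  /* hypothetical size, chosen only for the sketch */

/* Hypothetical stand-in for a shadow page's bookkeeping. */
struct shadow_page {
    unsigned long gfn;          /* guest frame this shadow was built from */
    struct shadow_page *next;   /* chain within one hash bucket */
};

/* Hypothetical stand-in for the hash table of shadowed pages. */
struct shadow_table {
    struct shadow_page *bucket[BUCKETS];
};

/* Record a shadow in the table: push it onto the head of its bucket. */
static void table_insert(struct shadow_table *t, struct shadow_page *sp)
{
    size_t k = sp->gfn % BUCKETS;
    sp->next = t->bucket[k];
    t->bucket[k] = sp;
}

/*
 * Remove a shadow from the table.
 * Returns true if the entry was found and unlinked.
 * Returns false if the walk reaches the end of the chain without
 * finding it, i.e. the page is believed to be shadowed but is not
 * in the table. That is the inconsistent state described above.
 */
static bool table_delete(struct shadow_table *t, struct shadow_page *sp)
{
    struct shadow_page **pp = &t->bucket[sp->gfn % BUCKETS];

    for (; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == sp) {
            *pp = sp->next;
            sp->next = NULL;
            return true;
        }
    }
    return false;
}
```

In this model a second `table_delete` on the same page returns `false`. Whether the real failure comes from a double teardown, a missed insert, or something else entirely is not established in this thread.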
>
> > mapping kernel into physical memory
> > about to get started...
> > (XEN) traps.c:2306:d3 Domain attempted WRMSR 000000000000008b from
> > 00000a07:00000000 to 00000000:000000.
> > (XEN) Assertion 'x' failed at common.c:2139
> > (XEN) ----[ Xen-4.0.0-rc3-pre  x86_64  debug=y  Not tainted ]----
> > (XEN) CPU:    0
> > (XEN) RIP:    e008:[<ffff82c4801c8a08>] shadow_hash_delete+0x12e/0x18c
> > (XEN) RFLAGS: 0000000000010246   CONTEXT: hypervisor
> > (XEN) rax: ffff8300040e2770   rbx: ffff830223ce0000   rcx: 0000000000000000
> > (XEN) rdx: 0000000000000000   rsi: 0000000000000000   rdi: ffff82f60443c8a0
> > (XEN) rbp: ffff82c4802efb48   rsp: ffff82c4802efb18   r8:  ffff82f600000000
> > (XEN) r9:  0000000000000000   r10: ffff830223ce0000   r11: 00000000000041c5
> > (XEN) r12: 0000000000221e45   r13: 00000000000000ec   r14: ffff82f600000000
> > (XEN) r15: ffff8300cfaea000   cr0: 0000000080050033   cr4: 00000000000006f0
> > (XEN) cr3: 0000000210154000   cr2: ffff8801dd5508c8
> > (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
> > (XEN) Xen stack trace from rsp=ffff82c4802efb18:
> > (XEN)    3333333333333333 00000000000041c5 ffff82c4802efb32 000000000000000d
> > (XEN)    00000000000041c5 ffff8300cfaea000 ffff82c4802efba8 ffff82c4801e5d2e
> > (XEN)    0000005600ed79a0 0000000800000000 0000000100000000 0000000000221e45
> > (XEN)    0000000000000008 0000000000000000 ffff82f60443c8a0 ffff82f60404ab00
> > (XEN)    ffff82f600000000 ffff8300cfaea000 ffff82c4802efbd8 ffff82c4801c766d
> > (XEN)    ffff82c4802efc18 0000000000000282 0000000000000281 0000000000221e45
> > (XEN)    ffff82c4802efc28 ffff82c4801cb18f 000000000f69d1d0 ffff830223ce0000
> > (XEN)    ffff82c4802efc18 ffff830223ce0000 ffff82c4802eff28 ffff830223ce0e28
> > (XEN)    0000000000000002 ffff8300040de000 ffff82c4802efc58 ffff82c4801cba8d
> > (XEN)    0000000000000282 ffff82c4802efe58 0000000000010000 0000000000008000
> > (XEN)    ffff82c4802efce8 ffff82c4801bb394 0000000100000000 ffff8302236e8000
> > (XEN)    ffff830223ce0f08 ffff8300040e1000 00000001802efd48 ffff82c48031f640
> > (XEN)    ffff830223ce0000 0000000100000001 ffff82c4802eff28 ffff8300040e0000
> > (XEN)    ffff82c4802efce8 ffff830223ce0000 ffff82c4802efe58 00007fff0f69d1d0
> > (XEN)    ffff82c4802efe48 0000000000000000 ffff82c4802efd08 ffff82c4801bb56a
> > (XEN)    fffffffffffffff3 0000000000f71000 ffff82c4802efdc8 ffff82c48014796c
> > (XEN)    ffff82c4802efd28 ffff82c48016b0d4 ffff82c4802efd48 ffff82c48011dce7
> > (XEN)    0000000000000008 ffff82c480163d8c ffff82c4802efd68 ffff82c480118755
> > (XEN)    0000000000000008 ffff8300cfafa000 ffff82c4802efdc8 0000000000000286
> > (XEN)    ffff82c4802efd98 0000000000000286 ffff82c4802eff28 ffff82c4802eff28
> > (XEN) Xen call trace:
> > (XEN)    [<ffff82c4801c8a08>] shadow_hash_delete+0x12e/0x18c
> > (XEN)    [<ffff82c4801e5d2e>] sh_destroy_l4_shadow__guest_4+0xb5/0x371
> > (XEN)    [<ffff82c4801c766d>] sh_destroy_shadow+0x17d/0x1ad
> > (XEN)    [<ffff82c4801cb18f>] shadow_blow_tables+0x20b/0x302
> > (XEN)    [<ffff82c4801cba8d>] shadow_clean_dirty_bitmap+0xba/0x10a
> > (XEN)    [<ffff82c4801bb394>] paging_log_dirty_op+0x506/0x58c
> > (XEN)    [<ffff82c4801bb56a>] paging_domctl+0x150/0x181
> > (XEN)    [<ffff82c48014796c>] arch_do_domctl+0x5c/0x1f64
> > (XEN)    [<ffff82c4801053b3>] do_domctl+0x1169/0x11e6
> > (XEN)    [<ffff82c4801f11bf>] syscall_enter+0xef/0x149
> > (XEN)
> > (XEN)
> > (XEN) ****************************************
> > (XEN) Panic on CPU 0:
> > (XEN) Assertion 'x' failed at common.c:2139
> > (XEN) ****************************************
> > (XEN)
> > (XEN) Reboot in five seconds...
> >
> > Any ideas what could be wrong here?
> >
> > Thanks,
> > Ashish

--
Tim Deegan <Tim.Deegan@xxxxxxxxxx>
Principal Software Engineer, XenServer Engineering
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel

 

