
Re: [Xen-devel] pv 2.6.31 (kernel.org) and save/migrate, domU BUG()



On Sat, Nov 07, 2009 at 07:32:49AM -0800, Dan Magenheimer wrote:
> > > Well, first, I got 2.6.31.5 to boot in a PV guest in another
> > > machine and it fails to save also.  Are you able to save
> > > 2.6.31{,.5} successfully?  On latest xen-unstable?
> > > (NOTE: Yes, I do have CONFIG_XEN_SAVE_RESTORE=y... don't
> > > know if that is important.)
> > 
> > I'll have to try it later today..
> 
> Let me know.
> 

Ok. I just tried with a Fedora 12 (rawhide) PV guest. I was able to 
"xm save" and "xm restore" it without problems. 

But I noticed there was a BUG printed on the guest console:
http://pasik.reaktio.net/xen/debug/dmesg-2.6.31.5-122.fc12.x86_64-saverestore.txt

BUG: sleeping function called from invalid context at kernel/mutex.c:94
in_atomic(): 0, irqs_disabled(): 1, pid: 1052, name: kstop/0
Pid: 1052, comm: kstop/0 Not tainted 2.6.31.5-122.fc12.x86_64 #1
Call Trace:
 [<ffffffff8104021f>] __might_sleep+0xe6/0xe8
 [<ffffffff81419c84>] mutex_lock+0x22/0x4e
 [<ffffffff812afdce>] dpm_resume_noirq+0x21/0x11f
 [<ffffffff81272b05>] xen_suspend+0xca/0xd1
 [<ffffffff8108c172>] stop_cpu+0x8c/0xd2
 [<ffffffff8106350c>] worker_thread+0x18a/0x224
 [<ffffffff81067ae7>] ? autoremove_wake_function+0x0/0x39
 [<ffffffff8141ab29>] ? _spin_unlock_irqrestore+0x19/0x1b
 [<ffffffff81063382>] ? worker_thread+0x0/0x224
 [<ffffffff81067765>] kthread+0x91/0x99
 [<ffffffff81012daa>] child_rip+0xa/0x20
 [<ffffffff81011f97>] ? int_ret_from_sys_call+0x7/0x1b
 [<ffffffff8101271d>] ? retint_restore_args+0x5/0x6
 [<ffffffff81012da0>] ? child_rip+0x0/0x20


More information about my setup:

Host/dom0: Fedora 12 (latest rawhide) with included Xen 3.4.1-5 and
custom 2.6.31.5 x86_64 pv_ops dom0 kernel (a couple of days old).

Guest/domU: Fedora 12 (latest rawhide) with the included/default
2.6.31.5-122.fc12.x86_64 kernel.

> > > (On the machine I couldn't boot 2.6.31.5 as a PV guest, there
> > > was absolutely no console output.  However, I think tools
> > > are out-of-date on that machine so ignore that.)
> > 
> > Did you have "console=hvc0 earlyprintk=xen" in the domU kernel
> > parameters?
> 
> No, but that didn't work either.
> 

OK, then it crashes really early in boot, before the console is set up.

> > You might also change the xen guest cfgfile so that you have
> > on_crash=preserve and then, once the PV guest has crashed, run this:
> > 
> > /usr/lib/xen/bin/xenctx -s System.map-domUkernelversion <domid>
> > 
> > (if you have a 64-bit host, the xenctx binary might be under /usr/lib64/)
> > 
> > to get a stack trace..
> 
> Very interesting and useful!  I was completely unaware of
> xenctx and could have used it many times in tmem development!
> 
> The results explain why I can get it to run on
> one machine (an older laptop) and not run on another
> machine (a Nehalem system)... looks like this is maybe
> related to the cpuid-extended-topology-leaf bug that Jeremy
> sent a fix for upstream recently.
> 

Did you try with that patch applied? 

-- Pasi

> cs:eip: e019:c040342d xen_cpuid+0x46 
> flags: 00001206 i nz p
> ss:esp: e021:c0779ee4
> eax: 00000001 ebx: 00000002   ecx: 00000100   edx: 00000001
> esi: c0779f1c edi: c0779f18   ebp: c0779f24
>  ds:     e021  es:     e021    fs:     00d8    gs:     0000
> Code (instr addr c040342d)
> 24 04 8b 15 a4 02 7c c0 89 54 24 08 8b 0e 0f 0b 78 65 6e 0f a2 <89> 45 00 8b 
> 04 24 89 18 89 0e 89 
> 
> 
> Stack:
>  c0779f20 ffffffff ffffffff c07c0360 c0779f18 c0779f1c c0779f20 c066fd0f
>  c0779f18 c0779f24 00000002 16aee301 00000001 00000001 16aee301 00000002
>  0000000b c07c03cc c07c0360 c07c0360 c07c03d8 c0670ed8 c0779f58 00000001
>  c07c0360 c0779f60 c066fe6a c0779f60 c0779f60 00000003 00000001 00000000
> 
> Call Trace:
>   [<c040342d>] xen_cpuid+0x46  <--
>   [<c066fd0f>] detect_extended_topology+0xae 
>   [<c0670ed8>] init_intel+0x140 
>   [<c066fe6a>] init_scattered_cpuid_features+0x82 
>   [<c06705e2>] identify_cpu+0x22d 
>   [<c040584c>] xen_force_evtchn_callback+0xc 
>   [<c0405e78>] check_events+0x8 
>   [<c07c9dec>] identify_boot_cpu+0xa 
>   [<c07c9e9a>] check_bugs+0x8 
>   [<c07c27bd>] start_kernel+0x2a0 
>   [<c07c5206>] xen_start_kernel+0x340 




_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

