
Re: [Xen-devel] [PATCH 00/12] libxl: fork: SIGCHLD flexibility



Ian Jackson wrote:
> Jim Fehlig writes ("Re: [Xen-devel] [PATCH 00/12] libxl: fork: SIGCHLD 
> flexibility"):
>   
>> I let this run over the weekend and today noticed libvirtd was deadlocked
>>     
>
> I have just retested xl with:
>   * my 3-patch 4.4 fixes series
>   * v2 of my fork series
>   * the extra mutex patch "libxl: fork: Fixup SIGCHLD sharing"
>   * "13/12" and "14/12" just posted
> and it WFM.
>
> Of course I don't have the same setup as Jim.
>
> Jim: if it's not too much trouble, I'd appreciate it if you could try
> that combination.
>
> For your convenience you can find a git branch of it at
>   
> http://xenbits.xen.org/gitweb/?p=people/iwj/xen.git;a=shortlog;h=refs/tags/wip.enumerate-pids-v2.1
> aka
>   git://xenbits.xen.org/people/iwj/xen.git#wip.enumerate-pids-v2.1
>   

I've been testing this branch and have noticed an occasional libvirtd
segfault.  By occasional, I mean my save/restore script might trigger the
segfault after 2 iterations, or 20 iterations, or ...  But the segfault
always occurs in libxl_domain_create_restore().

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffeef59700 (LWP 12083)]
0x00007ffff74577ef in virObjectIsClass (anyobj=0x2f302f6e69616d6f, klass=0x5555558a1310)
    at util/virobject.c:362
362         return virClassIsDerivedFrom(obj->klass, klass);
(gdb) bt
#0  0x00007ffff74577ef in virObjectIsClass (anyobj=0x2f302f6e69616d6f, klass=0x5555558a1310)
    at util/virobject.c:362
#1  0x00007ffff745765b in virObjectLock (anyobj=0x2f302f6e69616d6f) at util/virobject.c:314
#2  0x00007fffe993cc96 in libxlDomainObjTimeoutModifyEventHook (priv=0x5555558fc310,
    hndp=0x5555559e5d88, abs_t=...) at libxl/libxl_domain.c:302
#3  0x00007fffe96f8fed in time_deregister (gc=0x7fffeef58220, ev=0x5555559eee48)
    at libxl_event.c:294
#4  0x00007fffe96facfd in afterpoll_internal (egc=0x7fffeef58220, poller=0x5555559a4c70, nfds=3,
    fds=0x5555559c09d0, now=...) at libxl_event.c:1008
#5  0x00007fffe96fc312 in eventloop_iteration (egc=0x7fffeef58220, poller=0x5555559a4c70)
    at libxl_event.c:1455
#6  0x00007fffe96fce58 in libxl__ao_inprogress (ao=0x5555559e9690,
    file=0x7fffe970fadb "libxl_create.c", line=1356,
    func=0x7fffe97105f0 <__func__.16344> "do_domain_create") at libxl_event.c:1700
#7  0x00007fffe96d711f in do_domain_create (ctx=0x5555559d9fa0, d_config=0x7fffeef58490,
    domid=0x7fffeef5840c, restore_fd=89, checkpointed_stream=0, ao_how=0x0, aop_console_how=0x0)
    at libxl_create.c:1356
#8  0x00007fffe96d7238 in libxl_domain_create_restore (ctx=0x5555559d9fa0, d_config=0x7fffeef58490,
    domid=0x7fffeef5840c, restore_fd=89, params=0x7fffeef58400, ao_how=0x0, aop_console_how=0x0)
    at libxl_create.c:1387
#...
(gdb) f 2
#2  0x00007fffe993cc96 in libxlDomainObjTimeoutModifyEventHook (priv=0x5555558fc310,
    hndp=0x5555559e5d88, abs_t=...) at libxl/libxl_domain.c:302
302         virObjectLock(info->priv);
(gdb) p info->priv
$3 = (libxlDomainObjPrivatePtr) 0x2f302f6e69616d6f
(gdb) f 9
#9  0x00007fffe993f2c7 in libxlVmStart (driver=0x5555558c2e50, vm=0x5555558e6a50,
    start_paused=false, restore_fd=89) at libxl/libxl_driver.c:635
635             res = libxl_domain_create_restore(priv->ctx, &d_config, &domid,
(gdb) p priv
$2 = (libxlDomainObjPrivatePtr) 0x5555558fc310

It looks like the libxlDomainObjPrivatePtr, stashed as part of
for_app_registration_out when registering the timeout, has been
trampled.  Not sure if the problem is in libvirt or libxl, but it is
late here and I'm calling it a night :).
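
One quick check on a trampled pointer like this is to read its bytes as text. The sketch below is not part of the original report. It assumes a little-endian 64-bit host, which the addresses in the backtrace suggest but do not prove, and the helper name pointer_as_ascii is made up for illustration. Applied to the bad info->priv value, it yields "omain/0/", which looks like a fragment of a path string. That would be consistent with the memory holding the registration info having been freed and reused for string data, though it does not say whether libvirt or libxl is at fault.

```python
def pointer_as_ascii(value: int, width: int = 8) -> str:
    """Render the in-memory bytes of a pointer value as printable text.

    Assumes a little-endian host, so the least significant byte of the
    pointer is the first byte in memory. Non-printable bytes become '.'.
    """
    raw = value.to_bytes(width, byteorder="little")
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


if __name__ == "__main__":
    # info->priv as printed by gdb in frame 2
    print(pointer_as_ascii(0x2F302F6E69616D6F))  # prints "omain/0/"
    # priv as printed by gdb in frame 9, a plausible heap address
    print(pointer_as_ascii(0x5555558FC310))      # prints "...UUU.."
```

A real heap pointer decodes to mostly unprintable bytes, as the second call shows, so a fully printable result is a strong hint that the slot was overwritten with character data.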

Regards,
Jim


