Re: [Xen-devel] XL: pv guests dont reboot after migration (xen-4.1.2-rc3) libc-2.11.2 segfault



On Sat, 2011-10-15 at 11:47 +0100, Andreas Olsowski wrote:
> On 10/15/2011 07:45 AM, Ian Campbell wrote:
> > On Sat, 2011-10-15 at 02:01 +0100, Andreas Olsowski wrote:
> >
> >> pv guests don't reboot after migration,
> >> just when xl should reboot the machine syslog shows:
> >>
> >>
> >> Oct 15 02:46:32 netcatarina kernel: xl[14986]: segfault at 7f0ec70a3008
> >> ip 00007f0ec7d517f9 sp 00007fff366cf100 error 4 in
> >> libc-2.11.2.so[7f0ec7cdb000+158000]
> >
> > Can you run under gdb and get a backtrace? Or perhaps core file is
> > dropped somewhere?
> How? xl migrate-receive is not started by hand. Can you point me to the 
> location within the code that calls it so I can put a "gdb" in front of it?

tools/libxl/xl_cmdimpl.c, main_migrate().

Or you can attach gdb to a running xl migrate-receive ("gdb -p
<pid> /path/xl"?). I think you can also control the remote command which
is run using the -e option to "xl migrate", maybe. Not so sure about
that last one.
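
If the receive side gets going before you can find its pid, one generic
trick is to make the process wait until a debugger is attached. A
minimal sketch, assuming you are willing to patch a temporary call into
the receive path yourself; wait_for_debugger() is a hypothetical helper,
not something which exists in xl:

```c
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical debugging helper (not part of xl): print our pid and spin
 * until a debugger clears *keep_waiting, or until max_seconds have passed.
 * Returns the number of one-second sleeps performed.
 */
int wait_for_debugger(volatile int *keep_waiting, int max_seconds)
{
    int waited = 0;

    fprintf(stderr, "pid %ld waiting for debugger\n", (long)getpid());
    while (*keep_waiting && waited < max_seconds) {
        sleep(1);
        waited++;
    }
    return waited;
}
```

After attaching, "set var *keep_waiting = 0" in the wait_for_debugger
frame lets the process carry on; the timeout is only there so a
forgotten call cannot hang the migration forever.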

>  > Or perhaps core file is dropped somewhere?
> Wouldn't I have to run a debugging enabled build of xen for that?
> 
> I found this in the log dir:
> 
> 
> root@netcatarina:/var/log/xen# cat xl-testmig--incoming.log
> Waiting for domain testmig--incoming (domid 67) to die [pid 3429]
> Domain 67 is dead
> Action for shutdown reason code 1 is restart
> Domain 67 needs to be cleaned up: destroying the domain
> Done. Rebooting now
> xc: error: 0-length read: Internal error

Interesting. That suggests we've gone back round to the migrate/restore
path, but all the uses after the start: label (where we go back to on
reboot) in create_domain seem to be gated on restore_file != NULL. I
must be missing something...
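
For reference, a 0-length read is what read(2) returns at end-of-file,
i.e. the sending end of the stream has gone away, rather than a real I/O
error. A minimal sketch of a read-exactly loop showing where that comes
from; read_fully() is my own illustrative name and this is not the libxc
code:

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Illustrative only: read exactly size bytes from fd into buf.
 * Returns 0 on success, -1 on failure. On end-of-file errno is set to 0,
 * so the caller can tell "stream ended early" from a genuine error.
 */
int read_fully(int fd, void *buf, size_t size)
{
    size_t offset = 0;

    while (offset < size) {
        ssize_t len = read(fd, (char *)buf + offset, size - offset);

        if (len == -1 && errno == EINTR)
            continue;           /* interrupted by a signal, retry */
        if (len == 0) {
            errno = 0;          /* EOF: a "0-length read" */
            return -1;
        }
        if (len < 0)
            return -1;          /* real error, errno set by read() */
        offset += (size_t)len;
    }
    return 0;
}
```

So on the reboot path something appears to be reading from a stream
which has already been consumed or closed.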

Adding some logging in create_domain wherever a *fd variable is used
might be interesting, perhaps on the exit paths too.
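
Something along these lines would do for the logging. This is only a
sketch: log_fd_state() and the LOG_FD() macro are hypothetical names I
have made up here, not existing xl helpers:

```c
#include <fcntl.h>
#include <stdio.h>

/*
 * Hypothetical helper: log the value of an fd variable and whether it
 * currently refers to an open descriptor. Returns 1 if open, 0 if not.
 */
int log_fd_state(FILE *out, const char *where, const char *name, int fd)
{
    /* fcntl(F_GETFD) fails with EBADF if fd is not an open descriptor */
    int is_open = (fd >= 0) && (fcntl(fd, F_GETFD) != -1);

    fprintf(out, "%s: %s=%d (%s)\n", where, name, fd,
            is_open ? "open" : "closed or unset");
    return is_open;
}

/* Convenience wrapper: logs the calling function and the variable name */
#define LOG_FD(fd) log_fd_state(stderr, __func__, #fd, (fd))
```

Dropping a LOG_FD(restore_fd) after the start: label and on each exit
path should show whether the fd is still in play on the second pass.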

I notice that we don't appear to close restore_fd in the child process.
That probably isn't related to this problem but would be worth doing I
suspect.
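
The general pattern I have in mind is below. It is a generic
fork-and-close sketch, not the actual create_domain code, and
fork_child_closing_fd() is a made-up name for illustration:

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Illustrative only: fork a child which closes its inherited copy of fd.
 * Returns the child's exit status (0 if the fd really is closed in the
 * child), or -1 if fork/wait failed. The parent's copy of fd stays open.
 */
int fork_child_closing_fd(int fd)
{
    pid_t pid = fork();
    int status;

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: drop the descriptor inherited from the parent */
        close(fd);
        /* F_GETFD now fails, confirming the child no longer holds it */
        _exit(fcntl(fd, F_GETFD) == -1 ? 0 : 1);
    }

    /* Parent: wait for the child and report how it exited */
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Closing in the child only affects the child's descriptor table, so the
parent can keep using its own copy.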

> xc: error: read_exact_timed failed (read rc: 0, errno: 0): Internal error
> xc: error: read: p2m_size (0 = Success): Internal error


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

