[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Xen-devel] libxl crash in event loop accessing fds array

Roger reported to me (pers.comm.) this crash on NetBSD:

  Program terminated with signal 11, Segmentation fault.
  #0  0x00007f7ff743131b in afterpoll_check_fd (poller=<optimized out>, 
fds=0x7f7ff7b241c0, nfds=7, fd=-1, events=1)
      at libxl_event.c:856
  856             if (fds[slot].fd != fd)

More information:
  http://dpaste.org/7zzRT/raw/   (with #define DEBUG in libxl_event.c)
  http://dpaste.org/jeCRG/raw/   (strace, seems a bit broken)

There are, I think, at least three things going wrong here:

 - Firstly, the bootloader is failing for some unknown reason.  I
   think this is (first) manifesting on NetBSD as POLLHUP on the pty
   master fd.  I don't know why this might be but a good way to find
   out would be to run the bootloader by hand.  The debug log lists
   the arguments it's been given and you can just run it by hand and
   see what it does.

 - Secondly, this causes a crash in the libxl event handling
   machinery.  This is due to a mishandled reentrancy hazard.  Patch

 - Thirdly, for some unknown reason libxl is not able to kill and
   reap the bootloader child process:
       libxl: warning: libxl_bootloader.c:292:bootloader_abort:
         after failure, failed to kill bootloader [16432]: No such process

   This is a mystery to me.  I asked for a system call trace to try to
   see what libxl was actually doing but unfortunately strace on
   Roger's NetBSD install doesn't seem to work.  ISTR NetBSD having
   some other system instead which might produce better information.

   It seems to me that libxl ought to be able to kill() an unreaped
   child so I looked to see whether the child might have been reaped.
   There are only three calls to wait* functions in xl/libxl.  The one
   in xl.c can be ruled out because we are constantly within libxl
   rather than libxl, and all the callbacks into xl are accompanied by
   debugging output which we do not see in the trace.  The one in
   libxl_exec.c is in the middle process of libxl__spawn_spawn and can
   be disregarded.

   That leaves the one in libxl__fork_selfpipe_woken.  This is not
   invariably accompanied by debugging output so it is possible that
   something has gone wrong with the handling of the bootloader's
   death.  However looking at the flow after waitpid, in
   childproc_reaped, there seem to be these possibilities:

     - the child was not found, in which case childproc_reaped
       returns ERROR_UNKNOWN_CHILD, and we end up calling
       xl_reaped_callback which will (we hope!) also give
       ERROR_UNKNOWN_CHILD, so that we will call
       libxl_report_child_exitstatus complaining about the unknown
       child.  We don't see this in the trace.

     - the child was found and is still associated with the bootloader
       callback, in which case we call bootloader_finished which
       also always logs.

     - somehow we call some other function which throws the
       information awaw.  I think this is unlikely.

    I therefore conclude that the child has died but been auto-reaped
    somehow.  Perhaps SIGCHLD is set to SIG_IGN ?  That would be Evil,
    Bad and Wrong.

    I think a proper system call trace will be necessary to resolve this.

Naturally it is important not to fix the first of these problems
before debugging the error handling :-).


Here is a patch to reproduce the problem on Linux.  touch the file
"boom" while the bootloader is running.  You need to run the whole
thing under valgrind to spot the illegal access.

diff --git a/tools/libxl/libxl_aoutils.c b/tools/libxl/libxl_aoutils.c
index 99972a2..f7df0b1 100644
--- a/tools/libxl/libxl_aoutils.c
+++ b/tools/libxl/libxl_aoutils.c
@@ -102,6 +102,9 @@ static void datacopier_readable(libxl__egc *egc, 
libxl__ev_fd *ev,
     libxl__datacopier_state *dc = CONTAINER_OF(ev, *dc, toread);
+if (!access("boom",R_OK))
+ revents = 0x10;
     if (revents & ~POLLIN) {
         LOG(ERROR, "unexpected poll event 0x%x (should be POLLIN)"
             " on %s during copy of %s", revents, dc->readwhat, dc->copywhat);

Doing this does not (for me, at least), seem to lose track of the
bootloader process.  Instead, I see (when combined with the patch

libxl: error: libxl_aoutils.c:110:datacopier_readable: unexpected poll event 
0x10 (should be POLLIN) on bootloader pty during copy o  bootloader output for 
domain 22
libxl: debug: libxl_event.c:120:libxl__ev_fd_deregister: ev_fd=0x42b13a8 
deregister fd=13
libxl: debug: libxl_event.c:116:libxl__ev_fd_deregister: ev_fd=0x42b13c0 
deregister unregistered
libxl: debug: libxl_event.c:120:libxl__ev_fd_deregister: ev_fd=0x42b1348 
deregister fd=17
libxl: debug: libxl_event.c:116:libxl__ev_fd_deregister: ev_fd=0x42b1360 
deregister unregistered
libxl: debug: libxl_event.c:116:libxl__ev_fd_deregister: ev_fd=0x42b13a8 
deregister unregistered
libxl: debug: libxl_event.c:116:libxl__ev_fd_deregister: ev_fd=0x42b13c0 
deregister unregistered
libxl: debug: libxl_event.c:1610:libxl__ao_inprogress: ao 0x42b1038: not ready, 
libxl: debug: libxl_event.c:116:libxl__ev_fd_deregister: ev_fd=0x42b1348 
deregister unregistered
libxl: debug: libxl_event.c:116:libxl__ev_fd_deregister: ev_fd=0x42b1360 
deregister unregistered
libxl: debug: libxl_event.c:116:libxl__ev_fd_deregister: ev_fd=0x42b13a8 
deregister unregistered
libxl: debug: libxl_event.c:116:libxl__ev_fd_deregister: ev_fd=0x42b13c0 
deregister unregistered
libxl: error: libxl_bootloader.c:557:bootloader_finished: bootloader failed - 
consult logfile /var/log/xen/bootloader.22.log
libxl: error: libxl_exec.c:129:libxl_report_child_exitstatus: bootloader [3615] 
died due to fatal signal Terminated

The fix I have come up with is below.


Subject: [PATCH] libxl: Deal properly with reentrant changes to fd events

In afterpoll_internal, the callback functions may register and
deregister events arbitrarily.  This means that we need to consider
the reentrancy-safety of the event machinery state variables.

Most of the code is safe but the fd handling is not.  Fix this by
arranging to restart the fd scan loop every time we call one of these
callback functions.

Another possible solution would be simply to return from
afterpoll_internal after calling efd->func.  That would be a small and
more obviously correct change but would prevent the process from
handling more than one fd event with a single call to poll.

Reported-by: Roger Pau Monne <roger.pau@xxxxxxxxxx>
Signed-off-by: Ian Jackson <ian.jackson@xxxxxxxxxxxxx>

diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
index 1957505..991261e 100644
--- a/tools/libxl/libxl_event.c
+++ b/tools/libxl/libxl_event.c
@@ -838,15 +838,19 @@ int libxl_osevent_beforepoll(libxl_ctx *ctx, int *nfds_io,
 static int afterpoll_check_fd(libxl__poller *poller,
                               const struct pollfd *fds, int nfds,
-                              int fd, int events)
+                              int fd, int events,
+                              int *(rindices[3]))
     /* returns mask of events which were requested and occurred */
     if (fd >= poller->fd_rindices_allocd)
         /* added after we went into poll, have to try again */
         return 0;
+    events |= POLLERR | POLLHUP;
     int i, revents = 0;
     for (i=0; i<3; i++) {
+        if (rindices) rindices[i] = 0;
         int slot = poller->fd_rindices[fd][i];
         if (slot >= nfds)
@@ -858,11 +862,16 @@ static int afterpoll_check_fd(libxl__poller *poller,
         assert(!(fds[slot].revents & POLLNVAL));
-        revents |= fds[slot].revents;
-    }
-    /* we mask in case requested events have changed */
-    revents &= (events | POLLERR | POLLHUP);
+        /* we mask in case requested events have changed */
+        int slot_revents = fds[slot].revents & events;
+        if (!slot_revents)
+            /* this slot is for a different set of events */
+            continue;
+        revents |= slot_revents;
+        if (rindices) rindices[i] = &poller->fd_rindices[fd][i];
+    }
     return revents;
@@ -876,26 +885,78 @@ static void afterpoll_internal(libxl__egc *egc, 
libxl__poller *poller,
     libxl__ev_fd *efd;
-    LIBXL_LIST_FOREACH(efd, &CTX->efds, entry) {
-        if (!efd->events)
-            continue;
+    /*
+     * Warning! Reentrancy hazards!
+     *
+     * Many parts of this function eventually call arbitrary callback
+     * functions which may modify the event handling data structures.
+     *
+     * Of the data structures used here:
+     *
+     *   egc, poller, now
+     *                are allocated by our caller and relate to the
+     *                current thread and its call stack down into the
+     *                event machinery; it is not freed until we return.
+     *                So it is safe.
+     *
+     *   fds          is either what application passed into
+     *                libxl_osevent_afterpoll (which, although this
+     *                isn't explicitly stated, clearly must remain
+     *                valid until libxl_osevent_afterpoll returns) or
+     *                it's poller->fd_polls which is modified only by
+     *                our (non-recursive) caller eventloop_iteration.
+     *   
+     *   CTX          comes from our caller, and applications are
+     *                forbidden from destroying it while we are running.
+     *                So the ctx pointer itself is safe to use; now
+     *                for its contents:
+     *
+     *   CTX->etimes  is used in a simple reentrancy-safe manner.
+     *
+     *   CTX->efds    is more complicated; see below.
+     */
-        int revents = afterpoll_check_fd(poller,fds,nfds, efd->fd,efd->events);
-        if (revents) {
-            DBG("ev_fd=%p occurs fd=%d events=%x revents=%x",
-                efd, efd->fd, efd->events, revents);
-            efd->func(egc, efd, efd->fd, efd->events, revents);
+    for (;;) {
+        /* We restart our scan of fd events whenever we call a
+         * callback function.  This is necessary because such
+         * a callback might make arbitrary changes to CTX->efds.
+         * We invalidate the fd_rindices[] entries which were used
+         * so that we don't call the same function again. */
+        int revents, i, *(rindices[3]);
+        LIBXL_LIST_FOREACH(efd, &CTX->efds, entry) {
+            if (!efd->events)
+                continue;
+            revents = afterpoll_check_fd(poller,fds,nfds,
+                                         efd->fd,efd->events,
+                                         rindices);
+            if (revents)
+                goto found_fd_event;
+        /* no ordinary fd events, then */
+        break;
+    found_fd_event:
+        DBG("ev_fd=%p occurs fd=%d events=%x revents=%x",
+            efd, efd->fd, efd->events, revents);
+        efd->func(egc, efd, efd->fd, efd->events, revents);
+        for (i=0; i<3; i++)
+            if (rindices[i])
+                *(rindices[i]) = INT_MAX;
-    if (afterpoll_check_fd(poller,fds,nfds, poller->wakeup_pipe[0],POLLIN)) {
+    if (afterpoll_check_fd(poller,fds,nfds, poller->wakeup_pipe[0],POLLIN, 0)) 
         int e = libxl__self_pipe_eatall(poller->wakeup_pipe[0]);
         if (e) LIBXL__EVENT_DISASTER(egc, "read wakeup", e, 0);
     int selfpipe = libxl__fork_selfpipe_active(CTX);
     if (selfpipe >= 0 &&
-        afterpoll_check_fd(poller,fds,nfds, selfpipe, POLLIN)) {
+        afterpoll_check_fd(poller,fds,nfds, selfpipe, POLLIN, 0)) {
         int e = libxl__self_pipe_eatall(selfpipe);
         if (e) LIBXL__EVENT_DISASTER(egc, "read sigchld pipe", e, 0);
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 7a655b3..647cc7c 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -278,7 +278,7 @@ struct libxl__poller {
     int fd_polls_allocd;
     int fd_rindices_allocd;
-    int (*fd_rindices)[3]; /* see libxl_osevent_beforepoll */
+    int (*fd_rindices)[3]; /* see libxl_event.c:beforepoll_internal */
     int wakeup_pipe[2]; /* 0 means no fd allocated */

Xen-devel mailing list



Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.