[Xen-devel] [PATCH] xend: Sleep before sending SIGKILL to device model



Ian Jackson wrote:
> The code already has a timeout to forcibly kill the device model after
> (I think) 10 seconds.  Surely we should reuse that code path (and the
> same timeout value) ?
>
> Restarting xend is not a usual thing to do and I think it's OK if
> shutting down a domain started by a previous xend involves waiting for
> such a longer timeout.  It's better to err on the side of safety.

OK. Attached is a revised patch that reuses the existing code path.
10 seconds seems a bit long to me, but I agree that we had better
stay on the safe side.

> Also, your patch was:
>   Content-Type: all/allfiles;
> This is not a recognised content type and prevented both of my
> mailreaders from displaying it to me.  Can you please fix your MUA ?

Sorry for the inconvenience.
I think your mail client should be able to recognize the attachment this time.

-- Yosuke

Signed-off-by: Yosuke Iwamatsu <y-iwamatsu@xxxxxxxxxxxxx>



diff -r 39517e863cc8 tools/python/xen/xend/image.py
--- a/tools/python/xen/xend/image.py    Mon Jan 26 16:19:42 2009 +0000
+++ b/tools/python/xen/xend/image.py    Thu Jan 29 17:30:20 2009 +0900
@@ -558,24 +558,30 @@
                     os.kill(self.pid, signal.SIGHUP)
                 except OSError, exn:
                     log.exception(exn)
-                try:
-                    # Try to reap the child every 100ms for 10s. Then SIGKILL it.
-                    for i in xrange(100):
+                # Try to reap the child every 100ms for 10s. Then SIGKILL it.
+                for i in xrange(100):
+                    try:
                         (p, rv) = os.waitpid(self.pid, os.WNOHANG)
                         if p == self.pid:
                             break
-                        time.sleep(0.1)
-                    else:
-                        log.warning("DeviceModel %d took more than 10s "
-                                    "to terminate: sending SIGKILL" % self.pid)
+                    except OSError:
+                        # This is expected if Xend has been restarted within
+                        # the life of this domain.  In this case, we can kill
+                        # the process, but we can't wait for it because it's
+                        # not our child. We continue this loop, and after it is
+                        # terminated make really sure the process is going away
+                        # (SIGKILL).
+                        pass
+                    time.sleep(0.1)
+                else:
+                    log.warning("DeviceModel %d took more than 10s "
+                                "to terminate: sending SIGKILL" % self.pid)
+                    try:
                         os.kill(self.pid, signal.SIGKILL)
                         os.waitpid(self.pid, 0)
-                except OSError, exn:
-                    # This is expected if Xend has been restarted within the
-                    # life of this domain.  In this case, we can kill the process,
-                    # but we can't wait for it because it's not our child.
-                    # We just make really sure it's going away (SIGKILL) first.
-                    os.kill(self.pid, signal.SIGKILL)
+                    except OSError:
+                        # This happens if the process doesn't exist.
+                        pass
                 state = xstransact.Remove("/local/domain/0/device-model/%i"
                                           % self.vm.getDomid())
             finally:
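
For readers who want to try the control flow outside xend, the loop in the
patch above can be sketched as a standalone function. This is only an
illustration, not the xend code: it is written for Python 3 (the patch
targets Python 2), and the names stop_device_model, attempts and interval
are hypothetical. It assumes a POSIX system.

```python
import os
import signal
import time


def stop_device_model(pid, attempts=100, interval=0.1):
    """Ask a process to exit with SIGHUP, then fall back to SIGKILL.

    Returns True if the process was reaped during the polling window and
    False if the window expired and SIGKILL was sent.
    """
    try:
        os.kill(pid, signal.SIGHUP)
    except OSError:
        pass  # the process may already be gone

    # Poll every `interval` seconds, `attempts` times (100 * 0.1s = 10s).
    for _ in range(attempts):
        try:
            reaped, _status = os.waitpid(pid, os.WNOHANG)
            if reaped == pid:
                break
        except OSError:
            # Raised when pid is not our child, e.g. after the parent
            # daemon has been restarted. We cannot wait for it, so we
            # keep sleeping until the polling window expires.
            pass
        time.sleep(interval)
    else:
        # The loop finished without a break: force termination.
        try:
            os.kill(pid, signal.SIGKILL)
            os.waitpid(pid, 0)
        except OSError:
            pass  # the process does not exist or is not our child
        return False
    return True
```

As in the patch, a process that is not a child of the caller always goes
through the full polling window before SIGKILL is sent, because waitpid
cannot report its exit.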

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel

 

