WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 

xen-devel

Re: [Xen-devel] [patch]Make xend to take care of dead qemu-dm process

To: Masaki Kanno <kanno.masaki@xxxxxxxxxxxxxx>
Subject: Re: [Xen-devel] [patch]Make xend to take care of dead qemu-dm process
From: shawn <xiaowei.hu@xxxxxxxxxx>
Date: Wed, 21 May 2008 15:34:46 +0800
Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Wed, 21 May 2008 00:40:55 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
In-reply-to: <87C8BB1351E547kanno.masaki@xxxxxxxxxxxxxx>
References: <1211351585.2219.21.camel@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx> <87C8BB1351E547kanno.masaki@xxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Hi Kan,

Thanks for your comment.

Corrected that, and some other mistakes.

Please review this patch again.

thanks
xiaowei



--- tools/python/xen/xend/server/SrvServer.py.org       2008-05-21 13:53:08.000000000 +0800
+++ tools/python/xen/xend/server/SrvServer.py   2008-05-21 15:28:09.000000000 +0800
@@ -44,6 +44,7 @@
 import re
 import time
 import signal
+import os
 from threading import Thread
 
 from xen.web.httpserver import HttpServer, UnixHttpServer
@@ -148,14 +149,26 @@
 
             # Reaching this point means we can auto start domains
             try:
-                xenddomain().autostart_domains()
+                dom = xenddomain()
+                dom.autostart_domains()
             except Exception, e:
                 log.exception("Failed while autostarting domains")
 
             # loop to keep main thread alive until it receives a SIGTERM
             self.running = True
             while self.running:
-                time.sleep(100000000)
+                # Destroy any HVM domain whose device model (DM) has died unexpectedly.
+                for item in dom.domains.values():
+                    if item.info.is_hvm():
+                        device_model_pid = item.gatherDom(('image/device-model-pid', str))
+                        dm_stat_cmd = "ps -o stat --no-headers -p"+device_model_pid
+                        dm_stat = os.popen(dm_stat_cmd).readline().rstrip()
+                        if dm_stat == 'Z':
+                            log.warn("Device model for domain " + str(item.domid) + " was killed unexpectedly")
+                            item.destroy()
+                        else:
+                            continue
+                time.sleep(30)
                 
             if self.reloadingConfig:
                 log.info("Restarting all XML-RPC and Xen-API servers...")
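
For illustration only, and not part of the patch: a minimal Python 3 sketch of
the two ideas discussed in this thread, namely detecting that a child has
become a zombie and reaping it. It assumes a Linux /proc filesystem and reads
/proc/<pid>/stat instead of spawning ps, which is a swapped-in technique; the
helper names (process_state, is_zombie, reap_if_dead) are hypothetical and do
not exist in xend. One reason for reading the single-letter state directly is
that ps may append modifier characters to the STAT column (for example "Z+"),
in which case an exact comparison with "Z" would not match.

```python
import os


def process_state(pid):
    """Return the one-letter state of pid from /proc, or None if it is gone."""
    try:
        with open("/proc/%d/stat" % pid) as stat_file:
            data = stat_file.read()
    except (FileNotFoundError, ProcessLookupError):
        return None
    # The line looks like "pid (comm) state ..."; comm may itself contain
    # spaces or ")", so split on the last ")" before taking the state field.
    return data.rsplit(")", 1)[1].split()[0]


def is_zombie(pid):
    """True if pid has exited but has not yet been waited for by its parent."""
    return process_state(pid) == "Z"


def reap_if_dead(pid):
    """Non-blocking wait on a direct child; True if this call reaped it."""
    try:
        reaped_pid, _status = os.waitpid(pid, os.WNOHANG)
    except ChildProcessError:
        return False  # not our child, or already reaped elsewhere
    return reaped_pid == pid
```

A periodic loop like the one in the patch could call is_zombie(pid) for each
device model pid. reap_if_dead(pid) only works in the process that actually
forked the child, so whether it applies to xend depends on how qemu-dm is
started there.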





On Wed, 2008-05-21 at 16:21 +0900, Masaki Kanno wrote:
> Hi Xiaowei,
> 
> Nonessential comment.
> 
> Your patch includes both tab-indent and space-indent.
> Could you change tab-indent to space-indent?
> 
> Best regards,
>  Kan
> 
> Wed, 21 May 2008 14:33:05 +0800, shawn wrote:
> 
> >Hi all,
> >
> >There is a problem in Xen at present: when a fatal error occurs in a VM,
> >such as the qemu-dm process dying, xend should take care of it rather
> >than leave qemu-dm behind as a defunct (zombie) process.
> >The qemu-dm process can die because of misconfiguration or for some
> >other reason.
> >
> >At the moment xend does not take care of the dead process, so it remains
> >defunct, and xm list shows the VM assigned to that process as
> >vserv1134             5  6144     1     ------      0.0 
> >
> >This patch changes xend so that when such a fatal error happens (e.g. the
> >qemu-dm process is killed), it logs an error message, then destroys the
> >domain and cleans up the process (no zombies).
> >
> >The cause lies in the xend daemon: xend forks a process and runs the
> >qemu-dm program. When qemu-dm is killed directly, xend has no chance to
> >call wait() to collect the zombie child (qemu-dm is executed by a
> >thread), because xend has no way of knowing whether the qemu-dm child is
> >alive or has been killed.
> >
> >For this reason, I added some code to xend that checks the status of
> >each HVM domain's DM every 30 seconds.
> >
> >I have made a patch against the Xen 3.2.1 source code.
> >
> >Please review this patch.
> >
> >Thanks.
> >Xiaowei
> >
> >--- xen-3.2.1/tools/python/xen/xend/server/SrvServer.py.org  2008-05-21 13:53:08.000000000 +0800
> >+++ xen-3.2.1/tools/python/xen/xend/server/SrvServer.py      2008-05-21 13:58:56.000000000 +0800
> >@@ -44,6 +44,7 @@
> > import re
> > import time
> > import signal
> >+import os
> > from threading import Thread
> > 
> > from xen.web.httpserver import HttpServer, UnixHttpServer
> >@@ -148,14 +149,28 @@
> > 
> >             # Reaching this point means we can auto start domains
> >             try:
> >-                xenddomain().autostart_domains()
> >+                dom = xenddomain()
> >+            dom.autostart_domains()
> >             except Exception, e:
> >                 log.exception("Failed while autostarting domains")
> > 
> >             # loop to keep main thread alive until it receives a SIGTERM
> >             self.running = True
> >             while self.running:
> >-                time.sleep(100000000)
> >+            # loop to destroy those hvm domain that whoes DM has dead unexpectedly.
> >+            for item in dom.domains.values():
> >+                    if item.info.is_hvm():
> >+                        device_model_pid = item.gatherDom(('image/device-model-pid', str))
> >+                    dm_stat_cmd = "ps -o stat --no-headers -p"+ device_model_pid
> >+                    dm_stat = os.popen(dm_stat_cmd).readline().rstrip()
> >+                    log.info("status of the command is:" + dm_stat + "end of output")
> >+                    if dm_stat == 'Z':
> >+                        log.info("status of the command is:" + dm_stat + "end of output")
> >+                        log.warn("Devices Model for " + str(item) + "was killed unexpectedly")
> >+                        item.destroy()
> >+                    else:
> >+                        continue
> >+                time.sleep(30)
> >                 
> >             if self.reloadingConfig:
> >                 log.info("Restarting all XML-RPC and Xen-API servers...")
> >
> >
> >_______________________________________________
> >Xen-devel mailing list
> >Xen-devel@xxxxxxxxxxxxxxxxxxx
> >http://lists.xensource.com/xen-devel
> 
> 

