WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
Xen 
 
Home Products Support Community News
 
   
 

xen-devel

Re: [Xen-devel] [PATCH] Replace pyxml/xmlproc-based XML validator with l

To: xen-devel@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Xen-devel] [PATCH] Replace pyxml/xmlproc-based XML validator with lxml based one.
From: Stephan Peijnik <spe@xxxxxxxxx>
Date: Mon, 11 Oct 2010 18:48:38 +0200
Cc: Ian Campbell <Ian.Campbell@xxxxxxxxxx>, Stefano Stabellini <stefano.stabellini@xxxxxxxxxxxxx>
Delivery-date: Mon, 11 Oct 2010 09:49:34 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <1286535153.2029.9.camel@sp-laptop>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <1286535153.2029.9.camel@sp-laptop>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Below is a newly generated patch, now also updating README with 
information on the new dependencies. Additionally the patch should
hopefully not be mangled anymore. However, I also attached the patch
this time in case there are problems again.

The wording in the README is obviously up to discussion. I felt
the need to change the "many distros" wording to "some" as more
recently distributions like Debian and Ubuntu are including
minidom in their core python packages. To be honest I have
not checked how other distros handle this nowadays, hence the
"some".

I also wanted to point out that I do have a xen-4.0-testing
hg repository where these changes can directly be pulled from
at [0]. From my own experience this can also be merged into
xen-unstable painlessly.

Regards,

Stephan

[0] http://bitbucket.org/sp/xen-4.0-testing-sp

--
Pyxml/xmlproc is being used in tools/xen/xm/xenapi_create.py but is
unmaintained for several years now. xmlproc is used only for validating
XML documents against a DTD file. 

This patch replaces the pyxml/xmlproc based XML validation with code
based on lxml, which is actively maintained.

Signed-off-by: Stephan Peijnik <spe@xxxxxxxxx>

diff -r 6e0ffcd2d9e0 -r 7082ce86e492 tools/python/xen/xm/xenapi_create.py
--- a/tools/python/xen/xm/xenapi_create.py      Fri Sep 17 17:06:57 2010 +0100
+++ b/tools/python/xen/xm/xenapi_create.py      Fri Oct 08 12:31:18 2010 +0200
@@ -14,13 +14,15 @@
 # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
 #============================================================================
 # Copyright (C) 2007 Tom Wilkie <tom.wilkie@xxxxxxxxx>
+# Copyright (C) 2010 ANEXIA Internetdienstleistungs GmbH
+#   Author: Stephan Peijnik <spe@xxxxxxxxx>
 #============================================================================
 """Domain creation using new XenAPI
 """
 
 from xen.xm.main import server, get_default_SR
 from xml.dom.minidom import parse, getDOMImplementation
-from xml.parsers.xmlproc import xmlproc, xmlval, xmldtd
+from lxml import etree
 from xen.xend import sxp
 from xen.xend.XendAPIConstants import XEN_API_ON_NORMAL_EXIT, \
      XEN_API_ON_CRASH_BEHAVIOUR
@@ -35,6 +37,7 @@
 from os.path import join
 import traceback
 import re
+import warnings # Used by lxml-based validator
 
 def log(_, msg):
     #print "> " + msg
@@ -118,62 +121,58 @@
         Use this if possible as it gives nice
         error messages
         """
-        dtd = xmldtd.load_dtd(self.dtd)
-        parser = xmlproc.XMLProcessor()
-        parser.set_application(xmlval.ValidatingApp(dtd, parser))
-        parser.dtd = dtd
-        parser.ent = dtd
-        parser.parse_resource(file)
-
+        try:
+            dtd = etree.DTD(open(self.dtd, 'r'))
+        except IOError:
+            # The old code did neither raise an exception here, nor
+            # did it report an error. For now we issue a warning.
+            # TODO: How to handle a missing dtd file?
+            # --sp
+            warnings.warn('DTD file %s not found.' % (self.dtd),
+                          UserWarning)
+            return
+        
+        tree = etree.parse(file)
+        root = tree.getroot()
+        if not dtd.validate(root):
+            self.handle_dtd_errors(dtd)
+            
     def check_dom_against_dtd(self, dom):
         """
         Check DOM again DTD.
         Doesn't give as nice error messages.
         (no location info)
         """
-        dtd = xmldtd.load_dtd(self.dtd)
-        app = xmlval.ValidatingApp(dtd, self)
-        app.set_locator(self)
-        self.dom2sax(dom, app)
+        try:
+            dtd = etree.DTD(open(self.dtd, 'r'))
+        except IOError:
+            # The old code did neither raise an exception here, nor
+            # did it report an error. For now we issue a warning.
+            # TODO: How to handle a missing dtd file?
+            # --sp
+            warnings.warn('DTD file %s not found.' % (self.dtd),
+                          UserWarning)
+            return
 
-    # Get errors back from ValidatingApp       
-    def report_error(self, number, args=None):
-        self.errors = xmlproc.errors.english
-        try:
-            msg = self.errors[number]
-            if args != None:
-                msg = msg % args
-        except KeyError:
-            msg = self.errors[4002] % number # Unknown err msg :-)
-        print msg 
+        # XXX: This may be a bit slow. Maybe we should use another way
+        # of getting an etree root element from the minidom DOM tree...
+        # -- sp
+        root = etree.XML(dom.toxml())
+        if not dtd.validate(root):
+            self.handle_dtd_errors(dtd)
+
+    # Do the same that was done in report_error before. This is directly
+    # called by check_dtd and check_dom_against_dtd.
+    # We are using sys.stderr instead of print though (python3k clean).
+    def handle_dtd_errors(self, dtd):
+        # XXX: Do we really want to bail out here?
+        # -- sp
+        for err in dtd.error_log:
+            err_str = 'ERROR: %s\n' % (str(err),)
+            sys.stderr.write(err_str)
+        sys.stderr.flush()
         sys.exit(-1)
 
-    # Here for compatibility with ValidatingApp
-    def get_line(self):
-        return -1
-
-    def get_column(self):
-        return -1
-
-    def dom2sax(self, dom, app):
-        """
-        Take a dom tree and tarverse it,
-        issuing SAX calls to app.
-        """
-        for child in dom.childNodes:
-            if child.nodeType == child.TEXT_NODE:
-                data = child.nodeValue
-                app.handle_data(data, 0, len(data))
-            else:
-                app.handle_start_tag(
-                    child.nodeName,
-                    self.attrs_to_dict(child.attributes))
-                self.dom2sax(child, app)
-                app.handle_end_tag(child.nodeName)
-
-    def attrs_to_dict(self, attrs):
-        return dict(attrs.items())     
-
     #
     # Checks which cannot be done with dtd
     #

diff -r 3ce0d5dc606f -r 76fd774f7cd1 README
--- a/README    Sat Oct 09 22:19:41 2010 +0200
+++ b/README    Mon Oct 11 18:31:36 2010 +0200
@@ -137,12 +137,15 @@
 Xend (the Xen daemon) has the following runtime dependencies:
 
     * Python 2.3 or later.
-      In many distros, the XML-aspects to the standard library
+      In some distros, the XML-aspects to the standard library
       (xml.dom.minidom etc) are broken out into a separate python-xml package.
       This is also required.
+      In more recent versions of Debian and Ubuntu the XML-aspects are included
+      in the base python package however (python-xml has been removed
+      from Debian in squeeze and from Ubuntu in intrepid).
 
           URL:    http://www.python.org/
-          Debian: python, python-xml
+          Debian: python
 
     * For optional SSL support, pyOpenSSL:
           URL:    http://pyopenssl.sourceforge.net/
@@ -153,8 +156,9 @@
           Debian: python-pam
 
     * For optional XenAPI support in XM, PyXML:
-          URL:    http://pyxml.sourceforge.net
-          YUM:    PyXML
+          URL:    http://codespeak.net/lxml/
+         Debian: python-lxml
+          YUM:    python-lxml
 
 
 Intel(R) Trusted Execution Technology Support

Attachment: lxml_validator.patch
Description: Text Data

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel