[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [xen-unstable test] 19308: regressions - FAIL



On Mon, Sep 16, 2013 at 09:49:42AM +0200, Roger Pau Monné wrote:
> On 15/09/13 14:50, Ian Campbell wrote:
> > On Sun, 2013-09-15 at 07:09 +0100, xen.org wrote:
> >> flight 19308 xen-unstable real [real]
> >> http://www.chiark.greenend.org.uk/~xensrcts/logs/19308/
> >>
> >> Regressions :-(
> >>
> >> Tests which did not succeed and are blocking,
> >> including tests which could not be run:
> >>  test-amd64-i386-qemuu-rhel6hvm-intel 11 leak-check/check  fail REGR. vs. 
> >> 19208
> >>  test-amd64-i386-rhel6hvm-intel 11 leak-check/check        fail REGR. vs. 
> >> 19208
> >>  test-amd64-i386-qemuu-rhel6hvm-amd 11 leak-check/check    fail REGR. vs. 
> >> 19208
> >>  test-amd64-i386-qemut-rhel6hvm-amd 11 leak-check/check    fail REGR. vs. 
> >> 19208
> >>  test-amd64-i386-rhel6hvm-amd 11 leak-check/check          fail REGR. vs. 
> >> 19208
> > 
> > These are due to /var/run/xen-hotplug/block getting leaked
> > 

The error message in XenStore shows blkback tries to get hold of the
block device 0:0 but there's no such device entry in system.

> >>  test-amd64-i386-xl-win7-amd64 12 guest-localmigrate/x10   fail REGR. vs. 
> >> 19208
> >>  test-amd64-amd64-xl-win7-amd64 12 guest-localmigrate/x10  fail REGR. vs. 
> >> 19208
> >>  test-amd64-amd64-xl-qemut-winxpsp3 12 guest-localmigrate/x10 fail REGR. 
> >> vs. 19208
> > 
> > These are:
> >         libxl: error: libxl_device.c:894:device_backend_callback: unable
> >         to add device with path /local/domain/0/backend/vbd/9/5632
> >         libxl: error: libxl_create.c:935:domcreate_launch_dm: unable to add 
> > disk devices
> > 
> > /var/log/xen/xenhotplug.log contains:
> >         xenstore-read: couldn't read path backend/vbd/9/5632/node
> > 
> > For both of these I'm suspicious of:
> > 11a63a1 libxl, hotplug/Linux: default to phy backend for raw format file
> 
> Hello,
> 
> I've tracked this down to libxl writing a wrong physical-device 
> xenstore node when using regular files. When using block devices libxl 
> can write the physical-device because it can be fetched without 
> requiring the execution of the block script, but with regular files it 
> is not true, we must first execute the block script in order to mount 
> the regular file into a loop device and then fetch the physical-device 
> from the loop device to which the image has been mounted. Following 
> patch solves the issue for me.
>

Yes, that's the in question I think. That code snippet was introduced in:

commit 15116f1c254a8aa7774e2f73a3e1340a6decd867
Author: Ian Campbell <Ian.Campbell@xxxxxxxxxx>
Date:   Tue Aug 7 14:26:29 2012 +0100

    libxl: write physical-device node if user did not supply a block script
    
    This reverts one of the intentional changes from 25733:353bc0801b11.
    That change exposed an issue with the xl migration protocol, which
    although safe triggers the hotplug scripts device sharing logic.
    
    For 4.2 we disable this logic by writing the physical-device xenstore
    node ourselves if a user did not supply a script. If the user did
    supply a script then we continue to rely on it to write the
    physical-device node (not least because the script may create the
    device and therefore it is not available before we run the script).
    
    This means that to support localhost migration a block hotplug script
    needs to be robust against adding a device twice and should not
    deactivate the device until it has been removed twice.
    
    This should be revisited for 4.3.
    
    Signed-off-by: Ian Campbell <ian.campbell@xxxxxxxxxx>
    Acked-by: Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx>
    Committed-by: Ian Campbell <ian.campbell@xxxxxxxxxx>

And in the commit message it says this behavior should be revisited.

Tracing back to 25733 
(http://xenbits.xen.org/hg/xen-unstable.hg/rev/353bc0801b11)
things look more complicated. One interesting snippet in the commit
message is:

- libxl should not write the "physical-device" node. This is the
  responsibility of the block script. Writing the "physical-device"
  node in libxl basically completely short-cuts the standard block
  hotplug script which uses "physical-device" to know if it has run
  already or not.

That makes me believe the following fix is the correct thing to do in
long term.

I have to admit that I cannot fully consume the commit message of 25733
in one day so unless you (Ian) can confirm Roger's fix will not cause further
regression otherwise I would suggest reverting my change at the moment.

Wei.

> 8<-------------------------------------------------------------------
> >From e150f00565bfe291809441e73630b243e21a52b0 Mon Sep 17 00:00:00 2001
> From: Roger Pau Monne <roger.pau@xxxxxxxxxx>
> Date: Mon, 16 Sep 2013 09:39:05 +0200
> Subject: [PATCH] libxl: don't write physical-device vbd xenstore node in
>  libxl
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
> 
> libxl used to write the physical-device xenstore node needed by the
> phy backend type, because the phy backend type could only be used with
> block devices. If libxl allows the backend type phy to be used with
> regular files, it can no longer write physical-device because the
> hotplug script has to be executed first in order to mount the regular
> file into a loop device and then write the physical-device of the loop
> device used.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> ---
>  tools/libxl/libxl.c |   15 ---------------
>  1 files changed, 0 insertions(+), 15 deletions(-)
> 
> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> index 0879f23..326a378 100644
> --- a/tools/libxl/libxl.c
> +++ b/tools/libxl/libxl.c
> @@ -2101,21 +2101,6 @@ static void device_disk_add(libxl__egc *egc, uint32_t 
> domid,
>                                           libxl__xen_script_dir_path());
>                  flexarray_append_pair(back, "script", script);
>  
> -                /* If the user did not supply a block script then we
> -                 * write the physical-device node ourselves.
> -                 *
> -                 * If the user did supply a script then that script is
> -                 * responsible for this since the block device may not
> -                 * exist yet.
> -                 */
> -                if (!disk->script &&
> -                    disk->backend_domid == LIBXL_TOOLSTACK_DOMID) {
> -                    int major, minor;
> -                    libxl__device_physdisk_major_minor(dev, &major, &minor);
> -                    flexarray_append_pair(back, "physical-device",
> -                            libxl__sprintf(gc, "%x:%x", major, minor));
> -                }
> -
>                  assert(device->backend_kind == LIBXL__DEVICE_KIND_VBD);
>                  break;
>  
> -- 
> 1.7.7.5 (Apple Git-26)
> 
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.