This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


Re: [Xen-devel] hanging tapdisk2 processes and improper udev rules

To: Andreas Olsowski <andreas.olsowski@xxxxxxxxxxx>, Daniel Stodden <Daniel.Stodden@xxxxxxxxxx>
Subject: Re: [Xen-devel] hanging tapdisk2 processes and improper udev rules
From: Ian Campbell <Ian.Campbell@xxxxxxxxxxxxx>
Date: Fri, 22 Jul 2011 15:07:43 +0100
Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
In-reply-to: <4E2960B6.5000902@xxxxxxxxxxx>
Organization: Citrix Systems, Inc.
References: <4E294068.2030700@xxxxxxxxxxx> <1311326922.12772.14.camel@xxxxxxxxxxxxxxxxxxxxxx> <4E2960B6.5000902@xxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
On Fri, 2011-07-22 at 12:36 +0100, Andreas Olsowski wrote:
> On 07/22/2011 11:28 AM, Ian Campbell wrote:
> > This is because udev and forward/backward compatibility are strangers
> > passing in the night. I presume if you make the recommended change to
> > SYMLINK+= instead of NAME= in your udev script this goes away?
> You assume correctly.
> > I posted a patch to fix this "libxl: attempt to cleanup tapdisk
> > processes on disk backend destroy" a couple of times, most recently at
> > http://marc.info/?l=xen-devel&m=131066210526755 but it hasn't been
> > applied yet. Can you try it?
> I tried it:
> make -j7 tools:
> ...
> libxl_device.c: In function ‘libxl__device_destroy’:
> libxl_device.c:253: error: incompatible type for argument 1 of 
> ‘libxl__device_destroy_tapdisk’
> libxl_internal.h:321: note: expected ‘struct libxl__gc *’ but argument 
> is of type ‘libxl__gc’
> libxl_device.c:274: error: incompatible type for argument 1 of 
> ‘libxl__device_destroy_tapdisk’
> libxl_internal.h:321: note: expected ‘struct libxl__gc *’ but argument 
> is of type ‘libxl__gc’
> My expertise with C is barely existent, but I took a look at
> tools/libxl/libxl_device.c
> and changed your
> libxl__device_destroy_tapdisk(gc, be_path);
> into
> libxl__device_destroy_tapdisk(&gc, be_path);
> as I have seen some &gc on other lines of code.

That looks right. I think this is just a difference between current
xen-unstable and xen-4.1 (due to 23045:c426a7140c99 FWIW).
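
For readers unfamiliar with the compiler error quoted above, here is a minimal sketch of the mismatch. The struct fields and the function names destroy_tapdisk_stub, caller_with_value and caller_with_pointer are hypothetical stand-ins, not libxl code. The only things taken from the thread are the two types in the compiler note: the function expects a `struct libxl__gc *`, while the call site presumably holds a `libxl__gc` value and therefore has to pass its address.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for libxl's gc context; the real libxl__gc
 * has different fields. */
typedef struct libxl__gc {
    int cleanup_calls;
} libxl__gc;

/* Stand-in with the same shape as the prototype in the compiler note:
 * the first parameter is a pointer to the context, not the context. */
static int destroy_tapdisk_stub(libxl__gc *gc, const char *be_path)
{
    if (gc == NULL || be_path == NULL)
        return -1;
    gc->cleanup_calls++;
    return 0;
}

/* Caller where gc is a local struct value: it must pass &gc. */
static int caller_with_value(const char *be_path)
{
    libxl__gc gc = { 0 };
    /* destroy_tapdisk_stub(gc, be_path);  <- rejected: incompatible type */
    int rc = destroy_tapdisk_stub(&gc, be_path);
    assert(rc != 0 || gc.cleanup_calls == 1);
    return rc;
}

/* Caller where gc is already a pointer: it passes gc unchanged. */
static int caller_with_pointer(libxl__gc *gc, const char *be_path)
{
    return destroy_tapdisk_stub(gc, be_path);
}
```

Under these assumptions, a patch written against a tree where callers hold a pointer would need the `&gc` adjustment when applied to a tree where callers hold the struct by value.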

> And it compiled.
> I then created a guest, shut it down.
> First it kept being in a -ps--- state. I wanted to take a look at the
> running processes with "ps auxww", but the ps process itself hung.
> I could no longer run "ps" successfully after this point.

Uh. That really shouldn't happen :-/ In fact, barring a bug in the host OS
itself, I'm not sure how ps can ever get into that state...

> syslog showed:
> Jul 22 13:00:07 xenturio1 xl: tap-err:tap_ctl_read_message: failure
> reading message
> Jul 22 13:00:07 xenturio1 xl: tap-err:tap_ctl_send_and_receive: failed 
> to receive 'unknown' message
> Either my hack to get your code to compile was no good or your patch has
> some unforeseen side effects.

It's possible that it relies on something in xen-unstable that I'm not
aware of. Would it be possible for you to try and repro this issue with
xen-unstable.hg and this patch?

Daniel, have you got any idea what might be going on here?

> I have now rebooted the server.
> As I went on to check whether multipath had any effect on it, I added
> devnode "^td" to the blacklist.
> Now when I xl create a VM it only boots up to a certain point and then
> does nothing.
> If that certain point were the login prompt everything would be
> fine, but it isn't:
> http://pastebin.com/Lmie6KwY
> This is how it should look:
> http://pastebin.com/CsgYypbk
> I will try to retrace my steps and see what I did to break my system.
> In the meantime I have other systems I can test stuff on.
> -with best regards
