


To: Jeremy Fitzhardinge <jeremy@xxxxxxxx>
Subject: Re: [Xen-devel] Making snapshot of logical volumes handling HVM domU causes OOPS and instability
From: Scott Garron <xen-devel@xxxxxxxxxxxxxxxxxx>
Date: Sat, 11 Sep 2010 15:16:16 -0400
Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, Daniel Stodden <daniel.stodden@xxxxxxxxxx>
Delivery-date: Sat, 11 Sep 2010 12:17:21 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <4C8116EE.9030204@xxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <4C7864BB.1010808@xxxxxxxxxxxxxxxxxx> <4C7BE1C6.5030602@xxxxxxxx> <1283195639.26797.451.camel@xxxxxxxxxxxxxxxxxxxxxxx> <4C7C14F7.9090308@xxxxxxxxxxxxxxxxxx> <1283246428.3092.3476.camel@xxxxxxxxxxxxxxxxxxx> <4C7D44B0.9060105@xxxxxxxxxxxxxxxxxx> <4C80ABA6.6000203@xxxxxxxxxxxxxx> <4C8116EE.9030204@xxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv: Gecko/20100825 Thunderbird/3.1.3
Scott Garron wrote:
dom0 console and HVM domUs would periodically hang for several
seconds and then return as if nothing was wrong.  [.snip.] I ended
up fixing it by unsetting CONFIG_NO_HZ in the kernel.

Jeremy Fitzhardinge wrote:
What kernel is this?  This sounds like a symptom of the sched_clock
problem I fixed a few weeks ago.
ref: refs/heads/xen/stable-2.6.32.x

git log shows this as the most recent commit (from Aug 30):
commit 2968b258b1ca6bd16d758dd68900669419caff2b

It could just be slightly different architecture or the fact that
the machine has overall less RAM (4G instead of 8G).

What happens if you boot that system with "mem=4G"?

I was finally able to try this last night, and it didn't seem
to make any difference.  It did seem to last a bit longer (I had it
creating and removing snapshots every 6 seconds while the backup process
was also creating and removing them as needed, and it went along for
about 20 minutes before becoming unstable).  The OOPS message was
different than last time, but similar to the first one I sent when
reporting this.

After it crashed, I also went ahead and flashed the BIOS to the latest
version, to see if it made any difference.  After flashing, I booted
normally (without mem=4G), and got it to crash again - this time with a
similar OOPS message to the one I sent to you in my previous e-mail.
The new BIOS didn't help, obviously.  I've appended the output of
ps -eH -owchan,nwchan,cmd and the kernel OOPS messages from last night
to the end of the text file at:


udevd: worker did not accept message -1 (Connection refused) kill

Are they atypical?

I don't recall seeing them before, but after flashing the BIOS, they are
no longer occurring.

This post seems to be eerily similar to the problem I'm
experiencing.  http:[...]xen-devel/2010-09/msg00169.html

Aside from udev being involved, the symptom looks quite different.

I suppose that's true, but he mentions in this post:


that lvcreate and udev are hanging while creating a snapshot volume.
That's the reason I thought it was similar.  (That, and he seems to do
backups much the same way I do:  he creates a snapshot, makes a copy of
the snapshot [although he block-attaches the volume to a domU to do it,
whereas I just use dom0], then removes the snapshot.)

Scott Garron
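
For readers following the thread, the snapshot-copy-remove backup cycle
described above can be sketched roughly as follows.  The message does not
give the exact commands used, so this is only an illustration: the volume
group, volume name, snapshot size, destination path, and the use of dd for
the copy step are all assumptions.  The function only builds the command
lines; it does not run them.

```python
import shlex


def snapshot_backup_commands(vg, lv, snap_size="1G", dest="/backup"):
    """Build the three commands of a snapshot-based backup cycle.

    All names and sizes are hypothetical placeholders, not values
    taken from the original message.
    """
    snap = f"{lv}-snap"
    origin = f"/dev/{vg}/{lv}"
    snap_dev = f"/dev/{vg}/{snap}"
    return [
        # 1. Create a snapshot of the logical volume backing the domU.
        ["lvcreate", "--snapshot", "--size", snap_size, "--name", snap, origin],
        # 2. Copy the frozen snapshot contents (here from dom0, using dd).
        ["dd", f"if={snap_dev}", f"of={dest}/{lv}.img", "bs=1M"],
        # 3. Remove the snapshot once the copy is complete.
        ["lvremove", "--force", snap_dev],
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for cmd in snapshot_backup_commands("vg0", "domu-disk"):
        print(shlex.join(cmd))
```

Printing the commands rather than executing them keeps the sketch safe to
run on a machine without LVM; the instability reported in this thread was
seen while steps 1 and 3 were repeated in a loop against volumes in use by
HVM domUs.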
