This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


Re: [Xen-users] Xen locks down on specific server after 1-3 days

To: xen-users@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Xen-users] Xen locks down on specific server after 1-3 days
From: Silviu Paragina <silviu@xxxxxxxxxxx>
Date: Mon, 14 Sep 2009 15:39:06 +0300
For other users having a similar problem: the Ubuntu 8.04 Xen kernel seems to be the culprit here. If a similar problem arises, either install the Debian kernel or compile a kernel from source.

The Debian kernel can be found here:
image: http://packages.debian.org/lenny/i386/linux-image-2.6.26-2-xen-686/download
modules: http://packages.debian.org/lenny/i386/linux-modules-2.6.26-2-xen-686/download

Or, if you want to compile from source (and you want a kernel newer than 2.6.18), the relevant links (as mentioned in other threads on this list) are:
patches for kernels newer than 2.6.18: http://code.google.com/p/gentoo-xen-kernel/downloads/list
patches for pvops (I think, not sure): http://x17.eu/xen/

Also, a step-by-step guide to compiling is here: http://www.infohit.net/blog/post/compiling-a-xen-dom0-kernel-for-ubuntu-jaunty.html

If you wish to build packages for the kernel (i.e. debs), use make-kpkg. Also look at this thread, because you will encounter some errors when combining the Xen patches with make-kpkg: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=508487

Good luck
 Silviu Paragina

PS: Thank you for pointing me in the right direction with compiling newer kernel versions. Also, if you want to ask me a question on this topic, CC me without "[xen-users]" in the subject line (I don't get to read the mailing list very often).

Silviu Paragina wrote:
I've tried quite a few things. Googling gives some results, but they don't seem to be related.

I have two servers, one for testing and one for production. I needed Xen on the production one so I could get Windows running on top of Linux. Unfortunately the test server is different from the production one: the test server has a Xeon X3210 (4 cores) and 2GB RAM, while the production server has a Xeon 3040 (2 cores) and 2GB RAM. So there is quite a difference. Since the server locked up without any domUs running, I will not post the Windows domain config, which seems irrelevant in this case.

Here is the story:
The production server has Ubuntu 8.04 LTS on it, so I thought I should use it. I did an apt-get for the Xen packages from the backports repository (Xen 3.3.0). Everything worked perfectly on the test machine. On the network side I made a config similar to network-bridge (actually a stripped-down network-bridge), with the sole exception that it doesn't replace the physical interface (i.e. all virtual machines are attached to a bridge, and nothing else). I then did an apt-get for Xen on the production server, rebooted into Xen, and started to upload the Windows image from the test machine (no configs yet). Having a slow connection, I left it uploading over the weekend (this was on a Friday).

Two days later (Sunday) I couldn't ssh into the machine. The response wasn't a timeout but a "Connection closed" message after a long wait (longer than the usual timeout when the machine is down). Pings came back (the only thing that seemed to work), and all the other services (VPN/HTTP) behaved the same way: the connection seemed to be established, but the actual services didn't respond.

After a reset I noticed the logs were full of "BUG: soft lockup - CPU#0 stuck for 11s! [sshd:..]" messages (see attached log file). At that time I had hoped it was a fluke (despite the fact that my logic was yelling otherwise). The config file from that time is attached as xend-config.sxp (or the nocomment one).
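For anyone wanting to check their own logs for the same message, here is a minimal sketch in Python. The regular expression is based only on the message quoted above; the function name and the idea of grouping by CPU and process name are my own assumptions, not something taken from the attached log.

```python
import re
from collections import Counter

# Pattern modelled on the message quoted above, e.g.
#   BUG: soft lockup - CPU#0 stuck for 11s! [sshd:1234]
# The process-id part after the colon is ignored.
LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(\d+) stuck for (\d+)s! \[([^:\]]+):"
)

def count_soft_lockups(lines):
    """Count soft-lockup messages, keyed by (cpu number, process name).

    `lines` is any iterable of log lines (a list or an open file).
    This helper is hypothetical and only illustrates the idea.
    """
    hits = Counter()
    for line in lines:
        match = LOCKUP_RE.search(line)
        if match:
            cpu = int(match.group(1))
            process = match.group(3)
            hits[(cpu, process)] += 1
    return hits
```

To use it, pass an open log file, for example `count_soft_lockups(open(path, errors="replace"))`, where `path` is wherever your distribution writes kernel messages.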

After this incident I went ahead and installed Windows directly on the machine. All seemed fine until another 2 days (or so) had passed and it locked up again, this time without any log entries.

After another few lockups (without any log entries) and desperate config changes (memory-related, see xend-config.sxp.diff), I tried building the Xen 3.3.2 packages. I did an apt-get source xen-hypervisor-3.3, replaced the source in the package with the one from xen.org (3.3.2), removed the Ubuntu patches and built it. Unfortunately the kernel source package seemed a bit too complex for my understanding, so I stayed with the stock Ubuntu (Xen) kernel. It booted and everything seemed fine on the test machine (it ran without a lockup over a weekend, Friday until Monday). I deployed it on the production server, and this time it locked up after 3 days (actually about 2 days and 16 hours).

Probably irrelevant, but still: yesterday something weird came up on the test machine, and only on the test machine. Whenever I shut down the Windows guest, the VM locks up in state s (whether I do xm shutdown on the machine or shut it down from inside Windows).

That is where I am now. I shall try forcing dom0 to use one CPU and the Windows guest the other. But this should only be a temporary solution, because dom0 is running some services and sometimes needs some processing power, and the same goes for Windows.
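As a rough sketch of the guest side of that split, assuming the Windows guest is started from an xm domain config file (these files use Python syntax): the values below are illustrative and not taken from my actual config. dom0 would still have to be kept on the other CPU separately (for example with xm vcpu-pin), which is not shown here.

```python
# Hypothetical fragment of an xm domain config for the Windows guest.
# Values are illustrative assumptions for a two-core machine.
vcpus = 1     # give the guest a single virtual CPU
cpus = "1"    # restrict the guest to physical CPU 1, leaving CPU 0 for dom0
```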

Right now I'm not sure:
- if I should try compiling from sources directly from xen.org. 3.3? 3.4?
- if I should try compiling the kernel from xen.org
- if I should downgrade to 3.2, which is in the standard Ubuntu 8.04 repository rather than backports
- if 3.2 can run Windows 2008 without problems (judging from the version changelog, 3.3 seemed to be the first version that could run Windows 2008 Server)
- if I should upgrade to Ubuntu Jaunty

Any help or suggestions are appreciated. I've been trying stuff for 2-3 weeks now :(



Xen-users mailing list
