Re: [Xen-users] Xen locks down on specific server after 1-3 days
For other users having a similar problem: it seems that the Ubuntu 8.04
Xen kernel is the culprit here. So if a similar problem arises, either
install the Debian kernel or compile from source.
The Debian kernel can be found here:
Or if you want to compile from source (and you want a kernel newer than
2.6.18) the relevant links (as mentioned in other threads on this list) are:
Patches for kernels newer than 2.6.18 are here
Patches for pvops (I think, not sure): http://x17.eu/xen/
Also a step by step guide to compile is here
If you wish to build packages for the kernel (i.e. debs), use make-kpkg.
Also look at this bug report, because you will encounter some errors
when combining the Xen patches with make-kpkg:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=508487
PS: Thank you for pointing me in the right direction with compiling newer
kernel versions. Also, if you want to ask me a question on this topic, CC
me without "[xen-users]" in the subject line (I don't get to read the
mailing list very often).
Silviu Paragina wrote:
I've tried quite a few things; googling gives some results, but they
don't seem to be related.
I have two servers, one for testing and one for production. I needed Xen
on the production one so I could get Windows running on top of Linux.
Unfortunately the test server is different from the production one.
The testing one has a Xeon X3210 (4 cores) and 2GB RAM; the production
one has a Xeon 3040 (2 cores) and 2GB RAM. So there is quite a difference.
Considering the fact that the server locked up without any domUs, I
will not post the Windows domain config, which in this case seems
irrelevant.
Here is the story:
The production server has Ubuntu 8.04 LTS on it, so I thought I should
use it. I did an apt-get for the Xen packages from the backports
repository. All worked out perfectly on the test machine. On the network
side I made a config similar to network-bridge (actually a stripped-down
network-bridge), with the sole exception that it doesn't replace the
physical interface (i.e. all virtual machines are attached to a bridge,
and nothing else).
I did an apt-get on the production server for Xen, rebooted with Xen,
then started to upload the Windows image from the testing machine (no
configs yet). Having a slow connection, I left it uploading over the
weekend (this was on a Friday).
2 days later (Sunday) I couldn't ssh into the machine. The response
wasn't a timeout, but a "Connection closed" message after waiting a
long time (longer than the usual timeout when the machine is down).
I got back pings (the only thing that seemed to work), and all the other
services (vpn/http) were behaving the same way: the connection seemed to
be established, but the actual services didn't respond.
After a reset I noticed the logs were full of "BUG: soft lockup - CPU#0
stuck for 11s! [sshd:..]" messages (see attached log file). At that time
I had hoped it was a fluke (despite the fact that my logic was yelling
otherwise). The config file of that time is attached as
xend-config.sxp (or the nocomment one).
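
To get an overview of which CPUs and tasks show up in such messages, a
small script can tally them. This is only a rough sketch: the pattern is
modelled on the single message quoted above, the function name
summarize_lockups is made up for illustration, and real log lines may be
formatted differently on other kernels.

```python
import re
from collections import Counter

# Pattern modelled on the message quoted in this thread, e.g.
#   BUG: soft lockup - CPU#0 stuck for 11s! [sshd:1234]
# search() is used so that a syslog timestamp/host prefix does not matter.
SOFT_LOCKUP = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<task>.+?):(?P<pid>\d+)\]"
)


def summarize_lockups(lines):
    """Count soft-lockup reports per (cpu, task) pair.

    `lines` is any iterable of log lines (an open file works too).
    Lines that do not contain a soft-lockup message are ignored.
    """
    counts = Counter()
    for line in lines:
        match = SOFT_LOCKUP.search(line)
        if match:
            counts[(int(match.group("cpu")), match.group("task"))] += 1
    return counts
```

It could be fed an open log file, for example
summarize_lockups(open("/var/log/kern.log")), where the path is an
assumption about where the kernel messages end up on a given system.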
After this incident I went with installing Windows directly on the
machine. All seemed fine until another 2 days (or so) passed and it
locked up again, this time without any log entries.
After another few lockups (without any log entries) and desperate
config changes (memory-related config changes, see
xend-config.sxp.diff), I tried building the 3.3.2 Xen packages. I did an
apt-get source xen-hypervisor-3.3, replaced the source in the package
with the one from xen.org (3.3.2), removed the Ubuntu patches and built
it. Unfortunately the kernel source package seemed a bit too complex
for my understanding, so I went with the stock Ubuntu (Xen) one. It
booted and everything seemed fine on the test machine (it ran without a
lockup over a weekend, Friday till Monday). I deployed it on the
production machine, and this time after 3 days (actually about 2 days
and 16 hours) it locked up again.
Probably irrelevant, but still: yesterday something weird came up on
the test machine, and only the test machine. Whenever I shut down the
Windows guest, the VM locks in state s (even if I do xm shutdown machine
or shut it down from inside Windows).
Now I'm here. I shall try forcing dom0 to use one CPU and the Windows
guest the other. But this should be only a temporary solution, because
dom0 is running some services and sometimes requires some processing
power, and the same goes for Windows.
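
As a rough sketch of that split (not a tested setup): xm domU config
files use Python syntax, so the guest side could look like the fragment
below. The values are assumptions for a 2-core machine where dom0 keeps
physical CPU 0. On the dom0 side, the usual knobs are, as far as I know,
(dom0-cpus 1) in xend-config.sxp and "xm vcpu-pin Domain-0 0 0".

```python
# Hypothetical fragment of the Windows guest's xm config file.
# Assumes a 2-core host where dom0 is kept on physical CPU 0.
vcpus = 1    # give the guest a single virtual CPU
cpus = "1"   # only allow the guest's VCPU to run on physical CPU 1
```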
Right now I'm not sure:
- if I should try compiling from sources directly from xen.org. 3.3? 3.4?
- if I should try compiling the kernel from xen.org
- if I should downgrade to 3.2, which is in the standard Ubuntu 8.04,
not from backports
- if 3.2 can run Windows 2008 without problems (3.3 seemed to be the
first one, deducing from the version changelog, that could run Windows
2008)
- if I should upgrade to Ubuntu Jaunty
Any help or suggestions are appreciated. I've been trying stuff for 2-3
weeks now :(
Xen-users mailing list