WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
 
   
 

xen-users

Re: [Xen-users] xen dom0 server freezes every one or two hours

To: Sebastian Reitenbach <sebastia@xxxxxxxxxxxxxxxxxxxx>
Subject: Re: [Xen-users] xen dom0 server freezes every one or two hours
From: Igor Chubin <igor@xxxxxxx>
Date: Fri, 4 Jan 2008 12:37:50 +0200
Cc: s.seitz@xxxxxxxxxxxx, xen-users@xxxxxxxxxxxxxxxxxxx
Delivery-date: Fri, 04 Jan 2008 02:37:21 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
In-reply-to: <20080104073948.AB7E338482@xxxxxxxxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-users-request@lists.xensource.com?subject=help>
List-id: Xen user discussion <xen-users.lists.xensource.com>
List-post: <mailto:xen-users@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-users>, <mailto:xen-users-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-users>, <mailto:xen-users-request@lists.xensource.com?subject=unsubscribe>
References: <20080104073948.AB7E338482@xxxxxxxxxxxxxxxxxxxxxxxxx>
Reply-to: Igor Chubin <igor@xxxxxxx>
Sender: xen-users-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mutt/1.5.17 (2007-11-01)
On Fri, Jan 04, 2008 at 08:39:47 +0100, Sebastian Reitenbach wrote:
> Igor Chubin <igor@xxxxxxx> wrote: 
> > 
> > Hello Sebastian,
> > 
> > any news about your problem?
> > 
> > 
> > I recommended that my friend try pci=routeirq,
> > as Stephan advised, but without success :(
> > Nevertheless, thank you for the idea.
> 
> Yes, I have. We ran a lot of tests, which took a while.
> While testing, we got the server to freeze without the Xen kernel at all,
> so Xen is not the problem. It turned out that the server is very stable
> when I disable network interface bonding.
> I found a bug report in the Novell Bugzilla:
> https://bugzilla.novell.com/show_bug.cgi?id=278475


Congratulations!

> 
> do you also have bonding enabled on your dom0's interfaces?


No, there is no bonding in that installation.
So something else is causing the hang.


It appears only when 8021q tagging is in use, and not otherwise.
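
To compare setups, it may help to list which hosts actually have bonding or 8021q VLAN interfaces configured. The following is only a minimal sketch, assuming a Linux host where the bonding driver exposes its masters under /proc/net/bonding and the 8021q module lists VLAN devices in /proc/net/vlan/config. The function name network_features and its proc_root parameter are made up for illustration.

```python
from pathlib import Path


def network_features(proc_root="/proc"):
    """Report bonding masters and 802.1Q VLAN devices found under proc_root.

    proc_root is a hypothetical parameter so the function can be pointed
    at a copy of the proc tree; on a live host the default is used.
    """
    net = Path(proc_root) / "net"

    # The bonding driver creates one entry per bond master in this directory.
    bonding_dir = net / "bonding"
    bonds = sorted(p.name for p in bonding_dir.iterdir()) if bonding_dir.is_dir() else []

    # The 8021q module lists devices as "name | vlan id | parent device".
    # Header lines do not have three fields with a numeric ID, so they are skipped.
    vlans = []
    vlan_config = net / "vlan" / "config"
    if vlan_config.is_file():
        for line in vlan_config.read_text().splitlines():
            fields = [f.strip() for f in line.split("|")]
            if len(fields) == 3 and fields[1].isdigit():
                vlans.append((fields[0], int(fields[1]), fields[2]))

    return {"bonds": bonds, "vlans": vlans}
```

Calling network_features() on the dom0 returns a dictionary with a "bonds" list and a "vlans" list. Both are empty when neither feature is configured, or when the corresponding module is not loaded.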

> 
> kind regards
> Sebastian
> 
> > 
> > 
> > On Thu, Jan 03, 2008 at 08:48:45 +0100, Sebastian Reitenbach wrote:
> > > Hi,
> > > 
> > > Stephan Seitz <s.seitz@xxxxxxxxxxxx> wrote: 
> > > > I had similar problems on one machine which has been solved by adding
> > > > pci=routeirq to the kernel parameters at boot time.
> > > > 
> > > > I'm somewhat sure that your problem is caused by other issues, but ...
> > > > maybe it helps ;)
> > > Thanks for this tip. I tried it, but unfortunately it made things worse.
> > > I started the copy job over NFS again; it took only 10 minutes, and the
> > > server was frozen again.
> > > Now I'll try to NFS-export something from dom0, not from a domU as before,
> > > and start the copy job again, just to see what happens.
> > > 
> > > kind regards,
> > > Sebastian
> > > > 
> > > > Regards,
> > > > 
> > > > Stephan
> > > > 
> > > > 
> > > > 
> > > > 
> > > > Sebastian Reitenbach wrote:
> > > > > Hi,
> > > > > Igor Chubin <igor@xxxxxxx> wrote: 
> > > > > >>> Same problem here, and it can be reproduced. I use Gentoo 2007.0 with
> > > > > >>> Xen 3.1.2 and kernel 2.6.22 (xen-sources) in 64-bit mode.
> > > > > >>> The server is a dual Opteron 275 running in PV mode.
> > > > > >>> The dom0 freezes every time you generate high system load, for example
> > > > > >>> by starting a boinc-client or doing big filesystem transfers.
> > > > > >>> -> Network hangs, SATA devices time out
> > > > >> The problem I have mentioned earlier 
> > > > >> as far as I remember is on a Gentoo system too.
> > > > >> But there are no problems with the disk. 
> > > > >> Only network.
> > > > > 
> > > > > > I think this is the problem here too. Over Christmas I downloaded the
> > > > > > openSUSE KOTD, hoping that it might fix the problem. The dom0 was
> > > > > > disconnected from the network, I had two domUs running, and I copied
> > > > > > a 650 MB file between these two via scp a thousand times.
> > > > > > 
> > > > > > Two days ago, I connected the dom0 to the network again, and started
> > > > > > using the domUs as file/print/... servers again.
> > > > > > It took about an hour, and the server was frozen again, without any
> > > > > > notice in /var/log/messages.
> > > > > > 
> > > > > > I created a bug report; maybe you can add your observations there too.
> > > > > 
> > > > > http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1131
> > > > > 
> > > > > >> Maybe if they try to generate a heavy load on the system,
> > > > > >> the disk drives will hang too.
> > > > > >>
> > > > > >>> Normally the system freezes every 2 hours.
> > > > > >> In that case it happens much less often.
> > > > > >> The guys told me that it hangs every several days
> > > > > >> (but if it wants to, it can hang several times a day).
> > > > >>
> > > > > >>> I tried to play with the Xen version compatibility in the kernel,
> > > > > >>> but that doesn't make a difference.
> > > > >>>
> > > > >>> Due to the HDD timeout I can't find anything in the logs...
> > > > >>>
> > > > >> Just a guess:
> > > > >>
> > > > > >> Could it be related to the Xen balloon driver?
> > > > >>
> > > > >> Do you use dom0_mem as a parameter for the hypervisor?
> > > > > > I use dom0_mem, yes, but with and without this parameter, in both
> > > > > > cases the dom0 froze.
> > > > > 
> > > > > kind regards
> > > > > Sebastian
> > > > > 
> > > 
> > > 
> > > _______________________________________________
> > > Xen-users mailing list
> > > Xen-users@xxxxxxxxxxxxxxxxxxx
> > > http://lists.xensource.com/xen-users
> > 
> > -- 
> > WBR, i.m.chubin
> > 
> > 
> 

-- 
WBR, i.m.chubin

