

[Xen-users] Network stall

To: <xen-users@xxxxxxxxxxxxxxxxxxx>
Subject: [Xen-users] Network stall
From: "Boudreau Luc" <luc.boudreau@xxxxxxxxxxxx>
Date: Wed, 26 Sep 2007 13:21:27 -0400


I’ve got a bit of a problem with my network. I’m running CentOS 5 as dom0 with many different domUs. I can reproduce this bug at will, so I guess it’s something serious. I’ve also read a reply on this list saying that it is a known bug, but I can’t find any trace of it.


I share a file via any protocol on any domU. The file has to be big enough that the transfer takes a while; 1 GB usually does the trick. Then I log on to another machine (a physical machine, that is…) and start retrieving the file from the domU instance. After a random number of seconds, the network stalls, the link goes down, and packets start being dropped.
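For anyone trying to reproduce this, the moment of the stall can be timed with a small client run on the physical machine. The following is only a minimal sketch in Python, assuming the file is served over a plain TCP stream; the function name `receive_until_stall` and the 10-second threshold are hypothetical choices, not part of the original setup.

```python
import socket
import time


def receive_until_stall(sock, stall_seconds=10.0, chunk_size=65536):
    """Read from a connected socket until EOF or until no data arrives
    for stall_seconds.

    Returns (total_bytes, stalled, elapsed_seconds), where stalled is True
    if the transfer went silent before the sender closed the connection.
    """
    sock.settimeout(stall_seconds)
    total = 0
    start = time.monotonic()
    while True:
        try:
            chunk = sock.recv(chunk_size)
        except socket.timeout:
            # No data for stall_seconds: treat this as a network stall.
            return total, True, time.monotonic() - start
        if not chunk:
            # The sender closed the connection: the transfer completed.
            return total, False, time.monotonic() - start
        total += len(chunk)
```

To use it, one would open a connection with `socket.create_connection((host, port))` to whatever service on the domU is serving the file, send the request that protocol requires, and pass the socket to `receive_until_stall`. The elapsed time includes the final timeout wait, so the stall itself began roughly `stall_seconds` earlier.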


Luckily, I can log on locally to the dom0 and confirm that the link is down and that packets are being dropped. Bringing the peth0 interface down and up again doesn’t fix anything, nor does a physical disconnect/reconnect. I have to physically reboot the whole machine to make it work again. It’s not a DoS problem, because I can confirm that I don’t end up with thousands of connections in TIME_WAIT.
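The two checks above (dropped packets on an interface, and the number of TIME_WAIT connections) can be scripted so they are easy to repeat before and after a stall. This is a rough sketch, assuming the usual Linux layouts of /proc/net/dev and /proc/net/tcp; the helper names `parse_net_dev` and `count_time_wait` are made up for illustration.

```python
def parse_net_dev(text):
    """Parse the contents of /proc/net/dev into per-interface counters.

    After the "name:" prefix each line carries 16 numbers: 8 receive
    counters (bytes, packets, errs, drop, ...) followed by 8 transmit
    counters in the same order.
    """
    stats = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # header lines contain no colon
        fields = rest.split()
        if len(fields) < 16:
            continue
        values = [int(f) for f in fields[:16]]
        stats[name.strip()] = {
            "rx_packets": values[1],
            "rx_errs": values[2],
            "rx_drop": values[3],
            "tx_packets": values[9],
            "tx_errs": values[10],
            "tx_drop": values[11],
        }
    return stats


def count_time_wait(text):
    """Count sockets in TIME_WAIT in the contents of /proc/net/tcp.

    The fourth column is the socket state in hex; 06 is TIME_WAIT.
    """
    count = 0
    for line in text.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) > 3 and fields[3] == "06":
            count += 1
    return count
```

On the dom0 this could be called as `parse_net_dev(open("/proc/net/dev").read())["peth0"]` and `count_time_wait(open("/proc/net/tcp").read())`. Taking one reading before the transfer and one after the stall shows whether the drop counters are actually increasing.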


Now, how do I diagnose and fix the problem? I’m out of ideas and have been looking around for 3 days now… sadly.


Xen: 3.0.3-25
Kernel: 2.6.18-8.1.8
Network: bridged

Example domU config:

name = "XXXXX"
memory = "1500"
disk = [ 'phy:/dev/XenGuests0/XXXXX,xvda,w', ]
vif = [ 'mac=00:04:75:4f:XX:XX, bridge=xenbr0', ]
uuid = "c3463377-ba64-caf7-87b1-bc34294274b7"

on_reboot   = 'restart'
on_crash    = 'restart'


