WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 

xen-users

[Xen-users] domU network fails under load - vif breaks

To: xen-users@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-users] domU network fails under load - vif breaks
From: Paul <xensource@xxxxxxxxxxx>
Date: Mon, 01 Sep 2008 23:57:28 +1000
Delivery-date: Mon, 01 Sep 2008 06:58:44 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
Sender: xen-users-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-AU; rv:1.8.1.16) Gecko/20080829 Lightning/0.8 Thunderbird/2.0.0.16 Mnenhy/0.7.5.0
Hello,

I am running Xen 3.2.1 under a Gentoo 2.6.21-xen kernel. I have four domains running under a single dom0: two HVM (Windows XP and Windows 2k3) and two PV (a Gentoo 2.6.25 kernel and an Ubuntu 2.6.24 kernel). I have also experienced the same behaviour under a 2.6.21-xen Gentoo kernel.

All the domU networks are bridged to a single 1 Gb NIC, and I have tried an alternative physical NIC. There is very little load on this NIC - this is a test environment.

At a certain point that I have not established exactly, the network load takes out the PV network. For example, if I start a BitTorrent session in a PV domU, I get a slow build-up of network load, and then connectivity is lost to both of the PV domUs. If I console into them, they cannot ping outside the network, but they can ping their own interfaces. A tcpdump on the dom0 physical interface shows no traffic. During all this, however, the HVM domains are able to use their network connections without issues. Shutting down the broken domUs doesn't work, as they have NFS shares mounted and the shutdown hangs on the NFS unmount, but I suspect that without this they would shut down cleanly. In any case, I have to destroy them. If I attempt to recreate them, I get:

Error: Device 0 (vif) could not be connected. Hotplug scripts not working.

The xend log only shows:
   ...
[2008-08-31 19:14:11 5531] DEBUG (DevController:595) hotplugStatusCallback /local/domain/0/backend/vif/11/0/hotplug-status.
[2008-08-31 19:15:51 5531] DEBUG (XendDomainInfo:1897) XendDomainInfo.destroy: domid=11
   ...
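The two timestamps in that excerpt are worth comparing. As a minimal sketch (Python 3, standard library only; the helper name `parse_entries` is made up for illustration), the following pulls the entries apart even when they have been run together on one line and measures the gap between the hotplug callback and the destroy. A long, round gap would be consistent with xend waiting on the hotplug script and then giving up, though that reading is an assumption; the log itself does not say so.

```python
import re
from datetime import datetime

# Prefix of a xend.log entry as quoted above:
# [date time pid] LEVEL (Source:line) message
ENTRY = re.compile(
    r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (\d+)\] (\w+) \(([^)]*)\) "
)

def parse_entries(text):
    """Return (timestamp, level, source, message) tuples.

    Each message runs up to the start of the next entry prefix, so
    entries that were joined onto one line are still separated.
    """
    matches = list(ENTRY.finditer(text))
    entries = []
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        stamp = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
        message = text[m.end():end].strip()
        entries.append((stamp, m.group(3), m.group(4), message))
    return entries

# The two entries from the xend log excerpt in this post.
LOG = (
    "[2008-08-31 19:14:11 5531] DEBUG (DevController:595) hotplugStatusCallback "
    "/local/domain/0/backend/vif/11/0/hotplug-status. "
    "[2008-08-31 19:15:51 5531] DEBUG (XendDomainInfo:1897) "
    "XendDomainInfo.destroy: domid=11"
)

entries = parse_entries(LOG)
gap = (entries[1][0] - entries[0][0]).total_seconds()
print(gap)  # seconds between the hotplug callback and the destroy
```

For the excerpt above the gap is 100 seconds, with nothing logged in between.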

Although the HVM domUs are still working, if I shut them down, they hang on start-up, again with the vif problem.

I used the BitTorrent example above to demonstrate that the failure occurs at a certain traffic load. However, if I do a large cp from the domU to an NFS share, it fails almost instantly.

Restarting xend has no effect. The only thing that fixes the problem is rebooting the dom0.

Here is the dom0 kernel line:

   module /xen-2.6.21-noreal root=/dev/sda2 max_loop=255

(The "noreal" just refers to the Realtek drivers being removed from the kernel, as I tried Realtek's own drivers from their website to resolve the problem, but the behaviour is the same.)

Here is the domU cfg:

==========================================
kernel = '/xen/kernels/xen-2.6.25-pae'
ramdisk = '/etc/xen/kernels/initramfs-genkernel-x86-2.6.25-gentoo-r7-ich10'
extra = 'console=hvc0'

memory = '768'

disk = [
               'phy:sda7,hda3,w',
               'file:/xen/domains/zenayonswap.img,hdb,w'
]

name = 'zenayon'

vif = ['bridge=eth1, mac=00:16:3E:11:11:12']
root='/dev/xvda3'
cpu_cap = 100
#sdl=0
#acpi=0
#apic=0
localtime=1
================================
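Since xm config files use Python syntax, the network settings can be pulled out programmatically when comparing several domUs. This is only a sketch in Python 3: the helper names `load_config` and `parse_vif` are invented here, the config text is a trimmed copy of the one above, and `exec` should only ever be pointed at config files you trust.

```python
# Trimmed copy of the domU config from this post.
CONFIG = """
name = 'zenayon'
memory = '768'
vif = ['bridge=eth1, mac=00:16:3E:11:11:12']
root = '/dev/xvda3'
"""

def load_config(text):
    """Evaluate an xm-style config and return its top-level settings."""
    settings = {}
    exec(text, {}, settings)  # trusted input only
    return settings

def parse_vif(spec):
    """Split one vif string such as 'bridge=eth1, mac=...' into a dict."""
    fields = {}
    for part in spec.split(","):
        key, _, value = part.strip().partition("=")
        fields[key] = value
    return fields

settings = load_config(CONFIG)
vifs = [parse_vif(spec) for spec in settings["vif"]]
print(settings["name"], vifs)
```

Running the same parse over each domU config makes it easy to confirm that every guest names the same bridge and that no two guests share a MAC address.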

Neither the dmesg of the dom0 and domU, nor log/messages, nor the Xen logs contain any clues that I can see.
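One further place to look when the logs are silent is the bridge itself. The sketch below (Python 3, standard library only) lists the ports attached to a Linux bridge through sysfs and reads their packet counters, so two samples taken before and after a failure can be compared. The bridge name eth1 comes from the domU config above; the function names and the choice of counters are illustrative assumptions, and this is a starting point rather than a diagnosis of this particular fault.

```python
import os

COUNTERS = ("rx_packets", "tx_packets", "rx_dropped", "tx_dropped")

def bridge_ports(bridge, sysfs="/sys/class/net"):
    """List interfaces attached to a bridge, or [] if it has none
    or the bridge does not exist."""
    brif = os.path.join(sysfs, bridge, "brif")
    return sorted(os.listdir(brif)) if os.path.isdir(brif) else []

def read_counters(iface, names=COUNTERS, sysfs="/sys/class/net"):
    """Read selected per-interface counters; unreadable ones are None."""
    stats = os.path.join(sysfs, iface, "statistics")
    counters = {}
    for name in names:
        try:
            with open(os.path.join(stats, name)) as handle:
                counters[name] = int(handle.read().strip())
        except (OSError, ValueError):
            counters[name] = None
    return counters

def report(bridge, sysfs="/sys/class/net"):
    """Map each bridge port to its current counters."""
    return {
        port: read_counters(port, sysfs=sysfs)
        for port in bridge_ports(bridge, sysfs)
    }

if __name__ == "__main__":
    # "eth1" is the bridge named in the domU config in this post.
    for port, counters in report("eth1").items():
        print(port, counters)
```

If a guest's vif is missing from the port list, or its counters stop moving while the guest is still sending, that narrows the problem to the backend side in dom0.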

I am at a bit of a loss as to how to diagnose this. All the other networking related issues seem to have been resolved in earlier releases and/or are related to routed mode. It seems to be related to the dom0 kernel or xend, as these are the things that haven't changed in my testing. Perhaps I have a setting in my dom0 kernel that is not compatible?

Thanks for any help,

Paul

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users