RE: [Xen-users] XEN - Broadcom issue: survey

We have seen similar problems on Xen 2 (based on NetBSD 3.01) and Xen 3.0.3 (based on Fedora 7). The problem does not appear with Xen 3.0.3 on Debian Etch.

A workaround appears to be to transfer files in smaller fragments or with smaller block sizes. For example, we see this repeatedly when NFS-mounting from a central NAS server to domUs. With UDP rather than TCP, the problem occurs much less frequently. It appears to be a protocol buffer problem between the bridge and TCP layers on the emulated network. It does not appear on native NetBSD, Fedora 7 or Debian systems.
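As a rough illustration of the workaround above, the following sketch builds a mount command that uses UDP and a smaller read/write block size. It assumes a Linux domU, where proto, rsize and wsize are the NFS mount options documented in nfs(5); other systems use a different option syntax. The server name, paths and the 8192-byte block size are hypothetical placeholders, not values taken from this thread.

```python
import shlex


def nfs_mount_command(server, export, mountpoint, proto="udp", block_size=8192):
    """Build a mount(8) argument list for an NFS export.

    proto, rsize and wsize are Linux NFS mount options (see nfs(5)).
    The default block size is an illustrative guess, not a tuned value.
    """
    if proto not in ("udp", "tcp"):
        raise ValueError("proto must be 'udp' or 'tcp'")
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    options = f"proto={proto},rsize={block_size},wsize={block_size}"
    return ["mount", "-t", "nfs", "-o", options, f"{server}:{export}", mountpoint]


if __name__ == "__main__":
    # Hypothetical server and paths, for illustration only.
    # This only prints the command; it does not run mount.
    cmd = nfs_mount_command("nas.example.org", "/export/data", "/mnt/data")
    print(shlex.join(cmd))
```

Printing the command rather than executing it makes it easy to compare variants (UDP against TCP, different block sizes) before trying them as root.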


-Steve Senator
 sts+xen@xxxxxxxxxxxxxxxxxxxxxx


Quoting Boudreau Luc <luc.boudreau@xxxxxxxxxxxx>:

A bit more information on this issue. We decided to buy another NIC (other than Broadcom). The part number is NC110T from HP. It's an Intel gigabit server NIC. The problem still happens, which rules out the NIC as the cause. The card has the latest firmware and the latest drivers (e1000 ver. 7.6.9.1-1).

The problem still happens when we transfer large files from a domU to an external host (domU->External). It doesn't happen when transferring dom0->External. It is not a simple tcp_timewait issue, since the problem doesn't resolve itself after the TCP timeout.


Is there anything I can test on my new setup that would help investigate?

______________________________________________________

Luc Boudreau
Registrariat, Université de Montréal


-----Original Message-----

From: xen-users-bounces@xxxxxxxxxxxxxxxxxxx [mailto:xen-users-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Pezza

Sent: November 15, 2007 21:44
To: xen-users@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Xen-users] XEN - Broadcom issue: survey


Steven,

thank you for your suggestions.


Steven Smith-9 wrote:


vifX.Y interfaces are only used to send packets to PV network devices
in the guest.  Pure HVM domains (those without any PV drivers) send
all packets over the relevant tapX interface instead.  Errors observed
on the vif interface are therefore completely irrelevant in this case.
If the tap device has nothing strange then you'll have to look
somewhere else.

Ok that's a very good hint.
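To act on the hint above, one way to check whether the tapX (or vifX.Y) devices show anything strange is to read the per-interface error and drop counters in dom0. This is a minimal sketch assuming a Linux dom0, where /proc/net/dev lists eight receive fields followed by eight transmit fields per interface. The function names and the "tap"/"vif" prefixes are illustrative choices.

```python
def parse_net_dev(text):
    """Parse the contents of /proc/net/dev into error/drop counters.

    Each data line is "<name>: <8 receive fields> <8 transmit fields>".
    Receive: bytes packets errs drop ...; transmit: bytes packets errs drop ...
    """
    counters = {}
    for line in text.splitlines()[2:]:  # the first two lines are headers
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 16:
            continue
        values = [int(f) for f in fields[:16]]
        counters[name.strip()] = {
            "rx_errs": values[2],
            "rx_drop": values[3],
            "tx_errs": values[10],
            "tx_drop": values[11],
        }
    return counters


def suspicious(counters, prefixes=("tap", "vif")):
    """Return interfaces with a matching prefix and any non-zero counter."""
    return {
        name: c
        for name, c in counters.items()
        if name.startswith(tuple(prefixes)) and any(c.values())
    }


def read_counters(path="/proc/net/dev"):
    """Read and parse the live counters (Linux only)."""
    with open(path) as f:
        return parse_net_dev(f.read())
```

Calling suspicious(read_counters()) before and after a stalled transfer, and comparing the two results, shows whether the counters on the relevant device moved during the stall.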

So far, this is the status:

Steven Smith-9 wrote:


-- Do you see the same problems with dom0<->domU networking?  If so,
it would be a good idea to fix that before worrying about problems
with the NIC.  Packets which don't need to leave the host don't touch
the physical hardware.

Dom0<->DomU is showing the same problem and, yes, you're right: probably
it's not a network card related issue at this point...


Steven Smith-9 wrote:


-- I understand you're seeing connections stall for significant
periods of time, and that this happens across a wide variety of
services, yes?  It would be interesting to know if other connections
to the same VM continue working when this happens.

Yes they do.
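The check discussed above (whether other connections to the same VM keep working during a stall) can be approximated by probing several TCP ports on the guest while a transfer is hung. This is only a sketch: it shows whether new connections can be established, not whether already established flows still move data. The function name and the timeout are illustrative assumptions.

```python
import socket


def probe_ports(host, ports, timeout=3.0):
    """Try a TCP connect to each port and report which ones succeed.

    Returns a dict mapping port -> True (connected) or False
    (refused, unreachable, or timed out).
    """
    results = {}
    for port in ports:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            results[port] = False
    return results


# Example (hypothetical guest address and ports):
#   probe_ports("192.0.2.10", [22, 80, 2049])
```

Running the probe from both dom0 and an external host helps separate a problem on the dom0<->domU path from one beyond the bridge.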


Steven Smith-9 wrote:


-- Is there a firewall enabled in the guest?  Turning it off might
help.  The dom0 firewall might also be relevant, although that's less
likely.

I disabled firewalling in Dom0 and in DomU to take it out of the loop.

I tried again with another machine (which is running Xen 3.0.4), and, on the
same network (which is a gigabit network), it works fine. It's slow of
course (no PV), but there's no corruption and it's stable.

I'm willing to try to uninstall Xen 3.1 and try with 3.0.3 (the current Xen
release for CentOS 5), maybe there is something else hidden somewhere in the
background.


M.




_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users
