[Xen-devel] tx offload issue w/stubdoms + igb

Running a very recent xen-unstable and xen/stable-2.6.32.x along with any Linux domU using HVM and a stubdom, I notice that TCP performance when downloading from certain sites is extremely low with dom0's tx offload enabled on the stubdom's vif. For instance, from kernel.org, I see a paltry 30-50 K/s from inside the domU:

testvds5 ~ # wget http://mirrors.kernel.org/gentoo/releases/amd64/10.1/livedvd-amd64-multilib-10.1.iso
--2010-12-14 04:29:53--  http://mirrors.kernel.org/gentoo/releases/amd64/10.1/livedvd-amd64-multilib-10.1.iso

Resolving mirrors.kernel.org... 149.20.20.135, 204.152.191.39
Connecting to mirrors.kernel.org|149.20.20.135|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2798649344 (2.6G) [application/x-iso9660-image]
Saving to: `livedvd-amd64-multilib-10.1.iso.1'

0%[] 680,237 31.5K/s eta 18h 39m ^C


testvds5 ~ #

But if I turn off tx offload for the stubdom's vif with a line like this in the dom0..


ethtool -K vif59.0 tx off

.. I then get normal speeds in the domU:

testvds5 ~ # wget http://mirrors.kernel.org/gentoo/releases/amd64/10.1/livedvd-amd64-multilib-10.1.iso
--2010-12-14 04:31:44--  http://mirrors.kernel.org/gentoo/releases/amd64/10.1/livedvd-amd64-multilib-10.1.iso

Resolving mirrors.kernel.org... 149.20.20.135, 204.152.191.39
Connecting to mirrors.kernel.org|149.20.20.135|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2798649344 (2.6G) [application/x-iso9660-image]
Saving to: `livedvd-amd64-multilib-10.1.iso.5'

2%[=>] 73,762,776 1.59M/s eta 37m 57s ^C
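
To confirm which state a vif is in before and after the "ethtool -K" change, the offload settings can be read back with "ethtool -k". The helper below is only a sketch: the function name tx_offload_state is made up for illustration, and it assumes the "tx-checksumming: on|off" line that ethtool prints.

```shell
# Print the tx checksum offload state ("on" or "off") parsed from
# `ethtool -k <dev>` output supplied on stdin.
# Assumption: the output contains a line of the form "tx-checksumming: on".
tx_offload_state() {
    awk -F': *' '$1 == "tx-checksumming" { split($2, a, " "); print a[1]; exit }'
}

# Usage in dom0 (vif59.0 is the stubdom's vif from the example above):
#   ethtool -k vif59.0 | tx_offload_state
```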


I tested further and found that:

* dom0 doesn't have the issue, normal PV domains do not have the issue, and Windows GPLPV-based domains do not have the issue. It seems to be specific to stubdom-based domains.

* Other machines running the exact same Xen release and kernel version, but that use the e1000 driver instead of the igb driver, don't seem to have the problem. I don't know if it's related (I have not yet been able to test with MSI disabled), but those machines without the problem also aren't using MSI-X, whereas the igb-based machine that shows the problem is. From dmesg:


[   21.209923] Intel(R) Gigabit Ethernet Network Driver - version 1.3.16-k2
[   21.210026] Copyright (c) 2007-2009 Intel Corporation.
[   21.210140] xen: registering gsi 28 triggering 0 polarity 1
[   21.210145] xen: --> irq=28
[   21.210151] igb 0000:01:00.0: PCI INT A -> GSI 28 (level, low) -> IRQ 28
[   21.210279] igb 0000:01:00.0: setting latency timer to 64

[   21.382336] igb 0000:01:00.0: Intel(R) Gigabit Ethernet Network Connection
[   21.382435] igb 0000:01:00.0: eth0: (PCIe:2.5Gb/s:Width x4) 00:25:90:09:e4:00
[   21.382605] igb 0000:01:00.0: eth0: PBA No: ffffff-0ff
[   21.382698] igb 0000:01:00.0: Using MSI-X interrupts. 4 rx queue(s), 4 tx queue(s)

(Both the e1000 and igb machines have the hvm_directio flag in the "xl info" output.)

* Different GSO/TSO settings do not appear to make a difference. Only the tx offload setting does.

* Inside the problematic domU, the bad segment counter increments when the issue is occurring:


testvds5 ~ # netstat -s eth0
Ip:
    22162 total packets received
    44 with invalid addresses
    0 forwarded
    0 incoming packets discarded
    22113 incoming packets delivered
    19582 requests sent out
Icmp:
    2694 ICMP messages received
    0 input ICMP message failed.
    ICMP input histogram:
        timeout in transit: 2447
        echo replies: 247
    2698 ICMP messages sent
    0 ICMP messages failed
    ICMP output histogram:
        destination unreachable: 2
IcmpMsg:
        InType0: 247
        InType11: 2447
        OutType3: 2
        OutType69: 2696
Tcp:
    4 active connections openings
    3 passive connection openings
    0 failed connection attempts
    0 connection resets received
    3 connections established
    18819 segments received
    16795 segments send out
    0 segments retransmited
    2366 bad segments received.
    8 resets sent
Udp:
    65 packets received
    2 packets to unknown port received.
    0 packet receive errors
    89 packets sent
UdpLite:
TcpExt:
    1 TCP sockets finished time wait in fast timer
    172 delayed acks sent
    Quick ack mode was activated 89 times
    3 packets directly queued to recvmsg prequeue.
    33304 bytes directly in process context from backlog
    3 bytes directly received in process context from prequeue
    7236 packet headers predicted
    23 packets header predicted and directly queued to user
    3117 acknowledgments not containing data payload received
    89 DSACKs sent for old packets
    2 DSACKs sent for out of order packets
    2 connections reset due to unexpected data
IpExt:
    InBcastPkts: 533
    InOctets: 23420805
    OutOctets: 1601733
    InBcastOctets: 162268
testvds5 ~ #

* Some sites transfer to the domU quickly regardless of the tx offload setting, exhibiting the symptoms less. For instance, uiuc.edu with tx on:

root@testvds5:~# wget http://gentoo.cites.uiuc.edu/pub/gentoo/releases/amd64/10.1/livedvd-amd64-multilib-10.1.iso
--2010-12-14 03:53:50--  http://gentoo.cites.uiuc.edu/pub/gentoo/releases/amd64/10.1/livedvd-amd64-multilib-10.1.iso

Resolving gentoo.cites.uiuc.edu... 128.174.5.78
Connecting to gentoo.cites.uiuc.edu|128.174.5.78|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2798649344 (2.6G) [text/plain]
Saving to: `livedvd-amd64-multilib-10.1.iso'

0% [ ] 25,754,272 3.06M/s eta 17m 7s ^C

root@testvds5:~#

(netstat shows 23 bad segments received over the length of that test)

and with tx off:

root@testvds5:~# wget http://gentoo.cites.uiuc.edu/pub/gentoo/releases/amd64/10.1/livedvd-amd64-multilib-10.1.iso
--2010-12-14 03:54:45--  http://gentoo.cites.uiuc.edu/pub/gentoo/releases/amd64/10.1/livedvd-amd64-multilib-10.1.iso

Resolving gentoo.cites.uiuc.edu... 128.174.5.78
Connecting to gentoo.cites.uiuc.edu|128.174.5.78|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2798649344 (2.6G) [text/plain]
Saving to: `livedvd-amd64-multilib-10.1.iso.1'

1% [ ] 47,677,960 3.95M/s eta 12m 0s ^C


* The issue also occurs in xen-4.0-testing, as of c/s 21392.
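
For anyone trying to reproduce this, the bad segment counter mentioned above can be sampled before and after a transfer instead of reading the whole "netstat -s" dump. This is a rough sketch, not a polished tool: bad_segments is a hypothetical helper name, and it assumes the "bad segments received" wording shown in the netstat output above (other net-tools versions may word it differently).

```shell
# Extract the "bad segments received" count from `netstat -s` output on stdin.
# Assumption: the Tcp section contains a line like "2366 bad segments received."
bad_segments() {
    awk '/bad segments received/ { print $1; exit }'
}

# Usage inside the domU: sample the counter around a transfer and compare.
#   before=$(netstat -s | bad_segments)
#   wget -O /dev/null "$url"        # $url is whatever test download you use
#   after=$(netstat -s | bad_segments)
#   echo "bad segments during transfer: $((after - before))"
```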

For reference, Xen and kernel version output:

nyc-dodec266 src # xl info
host                   : nyc-dodec266
release                : 2.6.32.26-g862ef97
version                : #4 SMP Wed Dec 8 16:38:18 EST 2010
machine                : x86_64
nr_cpus                : 24
nr_nodes               : 2
cores_per_socket       : 12
threads_per_core       : 1
cpu_mhz                : 2674

hw_caps                : bfebfbff:2c100800:00000000:00003f40:029ee3ff:00000000:00000001:00000000

virt_caps              : hvm hvm_directio
total_memory           : 49143
free_memory            : 9178
free_cpus              : 0
xen_major              : 4
xen_minor              : 1
xen_extra              : -unstable

xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64

xen_scheduler          : credit
xen_pagesize           : 4096
platform_params        : virt_start=0xffff800000000000
xen_changeset          : Wed Dec 08 10:46:31 2010 +0000 22467:89116f28083f
xen_commandline        : dom0_mem=2550M dom0_max_vcpus=4
cc_compiler            : gcc version 4.4.4 (Gentoo 4.4.4-r2 p1.2, pie-0.4.5)
cc_compile_by          : root
cc_compile_domain      : nuclearfallout.net
cc_compile_date        : Fri Dec 10 00:51:50 EST 2010
xend_config_format     : 4
nyc-dodec266 src # uname -a

Linux nyc-dodec266 2.6.32.26-g862ef97 #4 SMP Wed Dec 8 16:38:18 EST 2010 x86_64 Intel(R) Xeon(R) CPU X5650 @ 2.67GHz GenuineIntel GNU/Linux

For now, I can use the "tx off" workaround by having a script set it for all newly created domains. Is anyone up for nailing this down and finding a real fix? Failing that, applying the workaround in the Xen tools might be a good idea.
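
A minimal sketch of what such a workaround script could look like is below. It is an illustration under assumptions, not the exact script in use: the function names and the ETHTOOL override variable are made up here, and it simply walks every vif visible in dom0's /sys/class/net rather than hooking into domain creation or picking out only the stubdom vifs.

```shell
# Disable tx offload on one vif, i.e. run `ethtool -K <vif> tx off`.
# ETHTOOL is a hypothetical override for dry runs, e.g. ETHTOOL=echo.
disable_tx_offload() {
    "${ETHTOOL:-ethtool}" -K "$1" tx off
}

# Apply the workaround to every vif currently present in dom0.
disable_tx_offload_all() {
    for path in /sys/class/net/vif*; do
        [ -e "$path" ] || continue    # glob did not match: no vifs present
        disable_tx_offload "${path##*/}"
    done
}

# Usage in dom0, after a domain has been created:
#   disable_tx_offload vif59.0     # a single vif
#   disable_tx_offload_all         # everything currently up
```

Turning it off on every vif is broader than needed, since only the stubdom vifs seem to be affected, but it keeps the sketch short.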


-John

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
