
Re: [Xen-devel] [RFC] netif: staging grants for requests



On 01/04/2017 01:54 PM, Wei Liu wrote:
> Hey!
Hey!

> Thanks for writing this detailed document!
Thanks a lot for the review and comments!

> 
> On Wed, Dec 14, 2016 at 06:11:12PM +0000, Joao Martins wrote:
>> Hey,
>>
>> Back in the Xen hackaton '16 networking session there were a couple of ideas
>> brought up. One of them was about exploring permanently mapped grants between
>> xen-netback/xen-netfront.
>>
>> I started experimenting and came up with a design document of sorts (in
>> pandoc) describing what I would like to propose. This is meant as a seed for
>> discussion and also a request for input on whether this is a good direction.
>> Of course, I am willing to try alternatives that we come up with beyond the
>> contents of the spec, or any other suggested changes ;)
>>
>> Any comments or feedback is welcome!
>>
>> Cheers,
>> Joao
>>
>> ---
>> % Staging grants for network I/O requests
>> % Joao Martins <<joao.m.martins@xxxxxxxxxx>>
>> % Revision 1
>>
>> \clearpage
>>
>> --------------------------------------------------------------------
>> Status: **Experimental**
>>
>> Architecture(s): x86 and ARM
>>
> 
> Any.
OK.

> 
>> Component(s): Guest
>>
>> Hardware: Intel and AMD
> 
> No need to specify this.
OK.

> 
>> --------------------------------------------------------------------
>>
>> # Background and Motivation
>>
> 
> I skimmed through the middle -- I think your description of transmissions
> in both directions is accurate.
> 
> The proposal to replace some steps with explicit memcpy is also
> sensible.
Glad to hear that!
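
To make the memcpy idea a bit more concrete, here is a minimal sketch of what
the frontend TX path would do (staging_pool and stage_tx_packet are
illustrative names, not the actual netfront structures): the payload is copied
into a slot of a region that is granted once at setup and kept mapped by the
backend, and the request then carries an offset into that region instead of a
fresh gref.

```
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct staging_pool {
	uint8_t *base;        /* granted once at setup, kept mapped by the backend */
	size_t slot_size;     /* the negotiated "data-len", e.g. 256 bytes */
	unsigned int nr_slots;
};

/* Copy (up to slot_size bytes of) a packet into the staging slot that
 * matches the ring slot being filled, returning the offset the request
 * carries instead of a freshly granted reference. */
static size_t stage_tx_packet(struct staging_pool *pool, unsigned int slot,
			      const void *pkt, size_t len)
{
	size_t off = (size_t)slot * pool->slot_size;

	/* Anything beyond slot_size still goes through grant copy/map. */
	if (len > pool->slot_size)
		len = pool->slot_size;

	memcpy(pool->base + off, pkt, len);
	return off;
}
```

The backend side is symmetric: its mapping of the same region lives for the
lifetime of the connection, so requests whose payload fits in data-len need no
per-packet grant operation at all.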

> 
>> \clearpage
>>
>> ## Performance
>>
>> Numbers that give a rough idea of the performance benefits of this extension.
>> These are Guest <-> Dom0 measurements, which test the communication between
>> backend and frontend, excluding other bottlenecks in the datapath (the
>> software switch).
>>
>> ```
>> # grant copy
>> Guest TX (1vcpu,  64b, UDP in pps):  1 506 170 pps
>> Guest TX (4vcpu,  64b, UDP in pps):  4 988 563 pps
>> Guest TX (1vcpu, 256b, UDP in pps):  1 295 001 pps
>> Guest TX (4vcpu, 256b, UDP in pps):  4 249 211 pps
>>
>> # grant copy + grant map (see next subsection)
>> Guest TX (1vcpu, 260b, UDP in pps):    577 782 pps
>> Guest TX (4vcpu, 260b, UDP in pps):  1 218 273 pps
>>
>> # drop at the guest network stack
>> Guest RX (1vcpu,  64b, UDP in pps):  1 549 630 pps
>> Guest RX (4vcpu,  64b, UDP in pps):  2 870 947 pps
>> ```
>>
>> With this extension:
>> ```
>> # memcpy
>> data-len=256 TX (1vcpu,  64b, UDP in pps):  3 759 012 pps
>> data-len=256 TX (4vcpu,  64b, UDP in pps): 12 416 436 pps
> 
> This basically means we can almost get line rate for a 10Gb link.
> 
> It is already a good result. I'm interested in knowing if there is a
> possibility to approach 40 or 100 Gb/s.
Certainly; with bulk transfer we can already saturate a 40 Gbit/s NIC, sending
out from a guest to an external host. I also got ~80 Gbit/s, but between
guests on the same host (some time ago, back in Xen 4.7). 100 Gbit/s is also on
my radar.

The problem comes with smaller packets <= MTU (and request/response workloads
with small payloads), and that is where we lack performance. Especially for
workloads with very small packets, Linux has a hard time saturating those NICs
(with XDP now rising to the challenge); I think only DPDK is able to do so at
this point [*].

[*] Section 7.1,
https://download.01.org/packet-processing/ONPS2.1/Intel_ONP_Release_2.1_Performance_Test_Report_Rev1.0.pdf
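
As an aside, the "drop with guest XDP_DROP prog" numbers quoted further down
correspond to attaching a trivial XDP program inside the guest; a minimal
sketch of such a program (not the exact one used for the measurements) looks
like this:

```
/* Drop every packet at the driver's XDP hook.
 * Build with: clang -O2 -target bpf -c xdp_drop.c -o xdp_drop.o */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

SEC("xdp")
int xdp_drop(struct xdp_md *ctx)
{
	return XDP_DROP;
}

char _license[] SEC("license") = "GPL";
```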

> It would be good if we design this extension with higher goals in mind.
Totally agree!

>> data-len=256 TX (1vcpu, 256b, UDP in pps):  3 248 392 pps
>> data-len=256 TX (4vcpu, 256b, UDP in pps): 11 165 355 pps
>>
>> # memcpy + grant map (see next subsection)
>> data-len=256 TX (1vcpu, 260b, UDP in pps):    588 428 pps
>> data-len=256 TX (4vcpu, 260b, UDP in pps):  1 668 044 pps
>>
>> # (drop at the guest network stack)
>> data-len=256 RX (1vcpu,  64b, UDP in pps):  3 285 362 pps
>> data-len=256 RX (4vcpu,  64b, UDP in pps): 11 761 847 pps
>>
>> # (drop with guest XDP_DROP prog)
>> data-len=256 RX (1vcpu,  64b, UDP in pps):  9 466 591 pps
>> data-len=256 RX (4vcpu,  64b, UDP in pps): 33 006 157 pps
>> ```
>>
>> Latency measurements (netperf TCP_RR request size 1 and response size 1):
>> ```
>> 24 KTps vs 28 KTps
>> 39 KTps vs 50 KTps (with kernel busy poll)
>> ```
>>
>> TCP bulk transfer measurements aren't showing a representative increase in
>> maximum throughput (sometimes ~10%), but rather fewer retransmissions and
>> more stable behaviour. This is probably because of a slight decrease in RTT
>> (i.e. the receiver acknowledging data quicker). I am currently exploring
>> other data list sizes and will probably have a better idea of the effects of
>> this.
>>
>> ## Linux grant copy vs map remark
>>
>> Based on the numbers above, there is a sudden 2x performance drop when we
>> switch from grant copy to also grant mapping the `gref`: 1 295 001 vs 577 782
>> pps for 256 and 260 byte packets respectively. This is all the more visible
>> when replacing the grant copy with memcpy in this extension (3 248 392 vs
>> 588 428). While there have been discussions about avoiding the TLB flush on
>> unmap, one could wonder what the threshold of that improvement would be.
>> Chances are that this is the least of our concerns on a fully populated host
>> (or an oversubscribed one). Would it be worth experimenting with increasing
>> the copy threshold beyond the header?
>>
> 
> Yes, it would be interesting to see more data points and provide a
> sensible default. But I think this is a secondary goal because a "sensible
> default" can change over time and across different environments.
Indeed; I am experimenting with more data points and other workloads to add
here.
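
To make that experiment concrete, what I would tune is essentially a copy
threshold along these lines (a minimal sketch with illustrative names, not the
actual xen-netback code): everything up to the threshold is copied, and only
the remainder of larger packets is grant mapped and later unmapped, which is
where the TLB flush cost comes in.

```
#include <stdbool.h>
#include <stddef.h>

/* Illustrative only, not an actual xen-netback symbol; today this is
 * roughly "the header", and the experiment is to raise it. */
static size_t copy_threshold = 128;

/* How many bytes of a request of 'size' bytes get copied. */
static size_t tx_copy_len(size_t size)
{
	return size < copy_threshold ? size : copy_threshold;
}

/* Whether the request still needs a grant map (and unmap) for its tail. */
static bool tx_needs_map(size_t size)
{
	return size > copy_threshold;
}
```

With a 256 byte threshold this mirrors the 256 vs 260 byte case above: the
260 byte packet ends up paying for the map/unmap, which is where the drop from
3 248 392 to 588 428 pps shows up.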

>> \clearpage
>>
>>
>> # History
>>
>> A table of changes to the document, in chronological order.
>>
>> ------------------------------------------------------------------------
>> Date       Revision Version  Notes
>> ---------- -------- -------- -------------------------------------------
>> 2016-12-14 1        Xen 4.9  Initial version.
>> ---------- -------- -------- -------------------------------------------
