
[Xen-devel] Network blocked after sending several packets larger than 128 bytes when using Driver Domain



Hi, all:

I am trying to use an HVM guest with a PCI pass-through NIC as a network driver domain. However, when I send packets larger than 128 bytes from DomU using the pkt-gen tool, the network between the driver domain and the destination host becomes blocked after several seconds.

The network topology used in the tests is shown below:
Pkt-gen (in DomU) <--> Virtual Eth (in DomU) <--> VIF (in Driver Domain) <--> OVS (in Driver Domain) <--> pNIC (passthrough NIC in Driver Domain) <--> Another Host
The results are summarized as follows:
1. When we only ping from DomU to the other host, the network appears fine.
2. When sending 64- or 128-byte UDP packets from DomU, the network is not blocked.
3. When sending 256-, 1024- or 1400-byte UDP packets from DomU with the scatter-gather feature of the passthrough NIC in the driver domain enabled, the network becomes blocked.
4. When sending 256-, 1024- or 1400-byte UDP packets from DomU with the scatter-gather feature of the passthrough NIC disabled, the network is not blocked (see the sketch after this list for how the feature can be toggled).
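
For reference, disabling scatter-gather amounts to running "ethtool -K <iface> sg off" in the driver domain. A minimal userspace sketch of the same operation via the legacy ETHTOOL_SSG ioctl is below; the interface name "eth10" is only an example, not necessarily the one used in our setup.

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <net/if.h>
#include <linux/ethtool.h>
#include <linux/sockios.h>

int main(void)
{
        /* ETHTOOL_SSG sets the scatter-gather feature; data = 0 turns it off. */
        struct ethtool_value eval = { .cmd = ETHTOOL_SSG, .data = 0 };
        struct ifreq ifr;
        int fd = socket(AF_INET, SOCK_DGRAM, 0);

        if (fd < 0) {
                perror("socket");
                return 1;
        }

        memset(&ifr, 0, sizeof(ifr));
        strncpy(ifr.ifr_name, "eth10", IFNAMSIZ - 1);   /* example interface name */
        ifr.ifr_data = (void *)&eval;

        if (ioctl(fd, SIOCETHTOOL, &ifr) < 0) {
                perror("SIOCETHTOOL");
                close(fd);
                return 1;
        }

        close(fd);
        return 0;
}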

As shown in the detailed syslog excerpts below, when the network is blocked the passthrough NIC's driver appears to enter an exception state and the TX queue hangs.
As far as I know, when sending 64- or 128-byte packets, the skb generated by netback contains only linearized data, stored in a page allocated from the driver domain's memory. For packets larger than 128 bytes, however, the skb also carries a frag page that is grant-mapped from DomU's memory. If we disable the NIC's scatter-gather feature, the skb sent from netback is linearized first, so its data ends up in a page allocated from the driver domain rather than in DomU's memory.
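
If that understanding is correct, the relevant difference is in how the core transmit path reacts to the device features before handing the skb to the NIC. Below is a minimal sketch of that check, modelled on the skb_needs_linearize()/skb_linearize() logic in the mainline kernel; the wrapper function name is mine, not a real kernel symbol.

#include <linux/netdevice.h>
#include <linux/skbuff.h>

/* Hypothetical helper illustrating the behaviour described above. */
static int prepare_skb_for_nic(struct sk_buff *skb, netdev_features_t features)
{
        /*
         * With scatter-gather disabled, the device is not allowed to DMA from
         * the frag pages, which in the netback case are grant-mapped DomU
         * pages. skb_linearize() copies all frag data into a fresh linear
         * buffer allocated from the driver domain's own memory, so the NIC
         * never touches a DomU page.
         */
        if (skb_shinfo(skb)->nr_frags && !(features & NETIF_F_SG))
                return skb_linearize(skb);

        /*
         * With scatter-gather enabled, the grant-mapped frag pages are handed
         * to the NIC for DMA as-is.
         */
        return 0;
}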

I am wondering whether this is a problem caused by PCI passthrough and DMA operations, or whether something is misconfigured in our environment. How can I continue to debug this problem? I am looking forward to your reply and advice. Thanks.

The environment we used is as follows:
a. Dom0: SUSE 12 (kernel 3.12.28)
b. Xen: 4.4.1_0602.2 (provided by SUSE 12)
c. DomU: kernel 3.17.4
d. Driver Domain: kernel 3.17.8
e. OVS: 2.1.2
f. Host: Huawei RH2288, CPU Intel Xeon E5645@xxxxxxx, HyperThreading disabled, VT-d enabled
g. pNIC: we tried an Intel 82599 10GbE NIC (ixgbe v3.23.2), an Intel 82576 1GbE NIC (igb) and a Broadcom NetXtreme II BCM5709 1GbE NIC (bnx2 v2.2.5)
h. Para-virtualization driver: netfront/netback
i. MTU: 1500

The detailed logs in the driver domain after the network is blocked are as follows:

1. When using the 82599 10GbE NIC, syslog and dmesg include the messages below. They show that a Tx Unit Hang is detected and the driver tries to reset the adapter repeatedly, but the network stays blocked.
<snip>
ixgbe: 0000:00:04.0 eth10: Detected Tx Unit Hang
  Tx Queue             <0>
  TDH, TDT             <1fd>, <5a>
  next_to_use          <5a>
  next_to_clean        <1fc>
ixgbe: 0000:00:04.0 eth0: tx hang 11 detected on queue 0, resetting adapter
ixgbe: 0000:00:04.0 eth10: Reset adapter
ixgbe: 0000:00:04.0 eth10: PCIe transaction pending bit also did not clear
ixgbe: 0000:00:04.0 master disable timed out
ixgbe: 0000:00:04.0 eth10: detected SFP+: 3
ixgbe: 0000:00:04.0 eth10: NIC Link is Up 10 Gbps, Flow Control: RX/TX
...
</snip>

I have tried removing the "reset adapter" call in the ixgbe driver's ndo_tx_timeout function; the resulting logs are shown below. They show that once the network is blocked, the NIC's TDH no longer advances.
<snip>
ixgbe 0000:00:04.0 eth3: Detected Tx Unit Hang
  Tx Queue             <0>
  TDH, TDT             <1fd>, <5a>
  next_to_use          <5a>
  next_to_clean        <1fc>
ixgbe 0000:00:04.0 eth3: tx_buffer_info[next_to_clean]
  time_stamp           <1075b74ca>
  jiffies              <1075b791c>
ixgbe 0000:00:04.0 eth3: Fake Tx hang detected with timeout of 5 seconds
ixgbe 0000:00:04.0 eth3: Detected Tx Unit Hang
  Tx Queue             <0>
  TDH, TDT             <1fd>, <5a>
  next_to_use          <5a>
  next_to_clean        <1fc>
ixgbe 0000:00:04.0 eth3: tx_buffer_info[next_to_clean]
  time_stamp           <1075b74ca>
  jiffies              <1075b7b11>
...
</snip>

I have also compared the NIC's PCI status before and after the network hangs, and found that the "DevSta" field changes from "TransPend-" to "TransPend+" after the network is blocked:
<snip>
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend+
</snip>
The network can only be recovered after we reload the ixgbe module in the driver domain.

2. When using the BCM5709 NIC, the result is similar. After the network is blocked, syslog contains the messages below:
<snip>
bnx2 0000:00:04.0 eth14: <--- start FTQ dump --->
bnx2 0000:00:04.0 eth14: RV2P_PFTQ_CTL 00010000
bnx2 0000:00:04.0 eth14: RV2P_TFTQ_CTL 00020000
...
bnx2 0000:00:04.0 eth14: CP_CPQ_FTQ_CTL 00004000
bnx2 0000:00:04.0 eth14: CPU states:
bnx2 0000:00:04.0 eth14: 045000 mode b84c state 80001000 evt_mask 500 pc 8001280 pc 8001288 instr 8e030000
...
bnx2 0000:00:04.0 eth14: 185000 mode b8cc state 80000000 evt_mask 500 pc 8000ca8 pc 8000920 instr 8ca50020
bnx2 0000:00:04.0 eth14: <--- end FTQ dump --->
bnx2 0000:00:04.0 eth14: <--- start TBDC dump --->
...
</snip>
Comparing the lspci output before and after the network hangs shows that the Status field changes from "MAbort-" to "MAbort+":
<snip>
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
</snip>
The network cannot be recovered even after we reload the bnx2 module in the driver domain.
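
In case it helps with further debugging, the "transactions pending" bit that lspci reports as TransPend can also be polled from a small debug patch inside the driver domain's kernel. A minimal sketch, assuming access to the NIC's struct pci_dev (e.g. adapter->pdev inside the driver), is:

#include <linux/pci.h>

/*
 * Returns true if the device still reports an outstanding PCIe transaction,
 * i.e. the bit that lspci shows as "TransPend+".
 */
static bool nic_has_pending_transaction(struct pci_dev *pdev)
{
        u16 devsta = 0;

        if (pcie_capability_read_word(pdev, PCI_EXP_DEVSTA, &devsta))
                return false;   /* config space read failed */

        return devsta & PCI_EXP_DEVSTA_TRPND;
}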

----------
openlui
Best Regards


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 

