
Re: [Xen-devel] bnx2x DMA mapping errors cause iscsi problems



Hi Patrick,

Sorry, this email won't resolve your issue right away, but I want to point
out that a design I am working on should address this problem in the long run.

On 23/04/14 09:07, Patrick Vranckx wrote:
> Hi,
> 
> We are running open source Xen 4.1.4 on Debian 7.4 amd64.
> The hardware is an HP BL460c Gen8. The NIC is a Broadcom Corporation
> NetXtreme II BCM57810 10 Gigabit Ethernet (rev 10).
> 
> We are experiencing sporadic network blackouts after a few days on eth4,
> which is used for iSCSI block storage for the VMs. The VMs' file systems
> switch to read-only, so we lose all the VMs. We then have to
> reboot the hypervisor to regain network connectivity.
> 
> MTU = 9000 on eth4.
> 
> We were using the Broadcom kernel driver from the official Debian 7.4
> kernel (3.2.54-2). Now that we have updated to the latest driver published
> on the Broadcom website, we get some more logging:
> 
> [1200406.207855] [bnx2x_alloc_rx_data:1009(eth4)]Can't map rx data
> [1200406.207978] [bnx2x_alloc_rx_data:1009(eth4)]Can't map rx data
> .....
>

This is exactly the issue that the linked design is trying to address:

http://lists.xen.org/archives/html/xen-devel/2014-04/msg01632.html

> Here are bnx2x module versions we tried :
> Debian 7.4 stock kernel : 1.70.30
> Broadcom website (latest) : 1.78.58
> Broadcom Firmware : Latest from HP BROADCOM 2.9.26 CP021537 package
> 
> Looking at bnx2x source code (bnx2x_cmn.c), it appears this error is
> caused by a DMA mapping error for rx buffers (memory leak ?)
> 
> static int bnx2x_alloc_rx_data(struct bnx2x *bp, struct bnx2x_fastpath *fp,
>                                u16 index, gfp_t gfp_mask)
> {
>     ....
>     mapping = dma_map_single(&bp->pdev->dev, data + NET_SKB_PAD,
>                              fp->rx_buf_size,
>                              DMA_FROM_DEVICE);
> 
>     if (unlikely(dma_mapping_error(&bp->pdev->dev, mapping))) {
> 
> #ifdef BCM_HAS_BUILD_SKB /* BNX2X_UPSTREAM */
>         bnx2x_frag_free(fp, data);
> #else
>         dev_kfree_skb_any(data);
> #endif
>         BNX2X_ERR("Can't map rx data\n");
>         return -ENOMEM;
>     }
>     ...
> 
> We found several other references of people suffering from the same
> problem.
> Here are two threads concerning Citrix XenServer 6 showing the exact
> same problem on BL460C G6 and Gen8
> http://discussions.citrix.com/topic/324343-xenserver-61-bnx2x-sw-iommu/
> http://discussions.citrix.com/topic/333281-xenserver-62-crash-bug/page-3
> 
> It seems from other references that, most of the time, similar problems
> occurring with this driver are related to virtualized environments.
> 
> We found a rather old workaround from VMware. The solution is to reduce
> the number of queues used by the driver (num_queues parameter).
> Unfortunately, the problem still occurs, only after a longer period.

To work around this issue you can try reducing the number of queues,
increasing the swiotlb size (as Jan suggested), and adding "disable_tpa=1"
to the bnx2x module parameters. These settings may reduce network
performance.
> 
> There are threads in this mailing list related to DMA allocation in Xen
> ( http://markmail.org/message/uududlw5w6xlqcp2 ) but I'm not able to
> understand if those threads are related to our problem.
> 
> Thanks for your help,
> 
> Patrick
> 

Malcolm




 

