Re: [Xen-devel] Dom0 physical networking/swiotlb/something issue in 3.7-rc1



On Fri, Oct 12, 2012 at 11:28:08AM +0100, Ian Campbell wrote:
> Hi Konrad,
> 
> The following patch causes fairly large packet loss when transmitting
> from dom0 to the physical network, at least with my tg3 hardware, but I
> assume it can impact anything which uses this interface.

Ah, that would explain why one of my machines suddenly started
developing checksum errors (and had a tg3 card). I hadn't gotten
deep into it.
> 
> I suspect that the issue is that the compound pages allocated in this
> way are not backed by contiguous mfns and so things fall apart when the
> driver tries to do DMA.

So this should also be easily reproducible on baremetal with 'iommu=soft' then.
> 
> However I don't understand why the swiotlb is not fixing this up
> successfully? The tg3 driver seems to use pci_map_single on this data.
> Any thoughts? Perhaps the swiotlb (either generically or in the Xen
> backend) doesn't correctly handle compound pages?

The swiotlb assumes that the buffer is just a single page. I am surprised
that the other IOMMUs aren't hitting this as well - ah, that is because they
do handle a virtual address range of more than one PAGE_SIZE.
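
To illustrate the failure mode being discussed, here is a minimal user-space
sketch, not kernel code. It assumes a hypothetical pfn-to-mfn array (p2m)
standing in for the real lookup, and a hypothetical helper name,
range_is_machine_contiguous(). It only shows why a buffer larger than one
page can be contiguous in pseudo-physical space and still not be contiguous
in machine space, which is when it would need bouncing:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  ((uint64_t)1 << PAGE_SHIFT)

/*
 * Hypothetical helper: returns true if every page touched by the range
 * [paddr, paddr + len) maps to consecutive machine frames.
 * p2m[] is an illustrative pfn -> mfn table, not a real Xen interface.
 */
static bool range_is_machine_contiguous(const uint64_t *p2m, size_t nr_pfns,
                                        uint64_t paddr, size_t len)
{
    assert(p2m != NULL);

    if (len == 0)
        return true;

    uint64_t first = paddr >> PAGE_SHIFT;
    uint64_t last = (paddr + len - 1) >> PAGE_SHIFT;

    /* Treat anything outside the table as not contiguous. */
    if (last >= nr_pfns)
        return false;

    /* A range within a single page has no boundary to check. */
    for (uint64_t pfn = first; pfn < last; pfn++) {
        if (p2m[pfn + 1] != p2m[pfn] + 1)
            return false;
    }
    return true;
}
```

With a table such as {100, 101, 102, 200, ...}, a 3-page buffer starting at
pfn 0 passes the check, while a 2-page buffer starting at pfn 2 does not,
even though both look contiguous from the guest's point of view.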
> 
> Ideally we would also fix this at the point of allocation to avoid the
> bouncing -- I suppose that would involve using the DMA API in
> netdev_alloc_frag?

Using pci_alloc_coherent would do it.. but
> 
> We have a, sort of, similar situation in the block layer which is solved
> via BIOVEC_PHYS_MERGEABLE. Sadly I don't think anything similar can
> easily be retrofitted to the net drivers without changing every single
> one.

.. I think the right way would be to fix the SWIOTLB. And since I am now
officially the maintainer of said subsystem you have come to the right
person!

What is the easiest way of reproducing this? Just running a large amount
of netperf/netserver traffic both ways?
> 
> Ian.
> 
> commit 69b08f62e17439ee3d436faf0b9a7ca6fffb78db
> Author: Eric Dumazet <edumazet@xxxxxxxxxx>
> Date:   Wed Sep 26 06:46:57 2012 +0000
> 
>     net: use bigger pages in __netdev_alloc_frag
>     
>     We currently use percpu order-0 pages in __netdev_alloc_frag
>     to deliver fragments used by __netdev_alloc_skb()
>     
>     Depending on NIC driver and arch being 32 or 64 bit, it allows a page to
>     be split in several fragments (between 1 and 8), assuming PAGE_SIZE=4096
>     
>     Switching to bigger pages (32768 bytes for PAGE_SIZE=4096 case) allows :
>     
>     - Better filling of space (the ending hole overhead is less an issue)
>     
>     - Less calls to page allocator or accesses to page->_count
>     
>     - Could allow struct skb_shared_info futures changes without major
>       performance impact.
>     
>     This patch implements a transparent fallback to smaller
>     pages in case of memory pressure.
>     
>     It also uses a standard "struct page_frag" instead of a custom one.
>     
>     Signed-off-by: Eric Dumazet <edumazet@xxxxxxxxxx>
>     Cc: Alexander Duyck <alexander.h.duyck@xxxxxxxxx>
>     Cc: Benjamin LaHaise <bcrl@xxxxxxxxx>
>     Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx>
> 
> 
> 
