
Re: [Xen-devel] [PATCH] net: allow configuration of the size of page in __netdev_alloc_frag



On Wed, Oct 24, 2012 at 06:43:20PM +0200, Eric Dumazet wrote:
> On Wed, 2012-10-24 at 17:22 +0100, Ian Campbell wrote:
> > On Wed, 2012-10-24 at 16:21 +0100, Eric Dumazet wrote:
> 
> > > If you really have such problems, why locally generated TCP traffic
> > > doesnt also have it ?
> > 
> > I think it does. The reason I noticed the original problem was that ssh
> > to the machine was virtually (no pun intended) unusable.
> > 
> > > Your patch doesnt touch sk_page_frag_refill(), does it ?
> > 
> > That's right. It doesn't. When is (sk->sk_allocation & __GFP_WAIT) true?
> > Is it possible I'm just not hitting that case?
> > 
> 
> I hope not. GFP_KERNEL has __GFP_WAIT.
> 
> > Is it possible that this only affects certain traffic patterns (I only
> > really tried ssh/scp and ping)? Or perhaps its just that the swiotlb is
> > only broken in one corner case and not the other.
> 
> Could you try a netperf -t TCP_STREAM ?

For fun I did a couple of tests: I set up two machines (one with r8169, the
other with e1000e) and ran netperf/netserver between them. Both are running a
bare-metal kernel, and one of them has 'iommu=soft swiotlb=force' to simulate
the worst case. This is using v3.7-rc3.

The r8169 is booted without any arguments, the e1000e is using 'iommu=soft
swiotlb=force'.

So r8169 -> e1000e, I get ~940 (this is odd: I expected the e1000e on the
receive side to be using the bounce buffer, but then I realized it sets up
a 'dma' pool using pci_alloc_coherent).

The other way, e1000e -> r8169, got me ~128. So it is the sending side
that ends up using the bounce buffer, and it slows down considerably.

I also swapped the machine that had e1000e with a tg3 - and got around
the same numbers.

So all of this points to the swiotlb, and just to make sure that nothing
was amiss I wrote a little driver that allocates a compound page, sets up
a DMA mapping, does some writes, syncs, and unmaps the DMA page. It works
correctly, so swiotlb (and the Xen variant) work just right.
Attached for your fun.
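
For anyone who does not want to open the attachment, here is a rough
user-space model of that sequence (allocate, map, write, sync, unmap).
This is NOT dma_test.c and it does not touch the kernel DMA API: the
BounceMapping class, its methods, and run_test() are hypothetical names
made up for illustration, and the "device" is just a second buffer. It
only shows where the extra copies of a bounce-buffered mapping come from.

```python
class BounceMapping:
    """Toy model of a swiotlb-style mapping; not a kernel API."""

    def __init__(self, buf: bytearray):
        self.orig = buf                 # buffer the driver writes to
        self.bounce = bytearray(buf)    # copy the "device" actually reads
        self.bytes_copied = len(buf)    # mapping for to-device copies once

    def sync_for_device(self) -> None:
        """Push CPU writes made to orig out to the bounce copy."""
        self.bounce[:] = self.orig
        self.bytes_copied += len(self.orig)

    def unmap(self) -> bytes:
        """Tear down the mapping and return what the device saw."""
        seen = bytes(self.bounce)
        self.bounce = bytearray()
        return seen


def run_test(size: int = 4 * 4096):
    page = bytearray(size)           # stands in for an order-2 compound page
    m = BounceMapping(page)
    page[0:4] = b"\xde\xad\xbe\xef"  # CPU write after the mapping exists
    stale = bytes(m.bounce[0:4])     # device view before the sync
    m.sync_for_device()
    seen = m.unmap()
    return stale, seen[0:4], m.bytes_copied
```

In this model the device only sees the write after the sync, and each
map and each sync costs one full copy of the buffer. It says nothing
about why one kernel version would be slower than another.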

Then I decided to try v3.6.3, with the exact same parameters... and
the problem went away.

The e1000e -> r8169 direction, which got me ~128 on v3.7-rc3, now gets
~940! Still using the swiotlb bounce buffer.


> 
> Because ssh use small packets, and small TCP packets dont use frags but
> skb->head.
> 
> You mentioned a 70% drop of performance, but what test have you used
> exactly ?

Note, I did not provide any arguments to netperf, but it did pick the
test you wanted:

> netperf -H tst019
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to tst019.dumpdata.com 
(192.168.101.39) port 0 AF_INET

> 
> 

Attachment: dma_test.c
Description: Text document

