
Re: [Xen-devel] SWIOMMU redundant copies?



Hi,

On Fri, 2008-03-14 at 16:39 +0000, Keir Fraser wrote:

> > I'm still trying to understand the corner cases involved.  Some
> > hardware seems to have more trouble than others --- it could well be
> > special-case sg segments such as DMA drain buffers which are causing the
> > trouble for those, in which case doing the copy for every map_sg() will
> > potentially be a significant problem.  (MarkMC hit that same problem
> > bringing the dom0 patches onto 2.6.25 which has reworked drain buffer
> > code.)
> 
> It certainly seems sensible to improve the page-discontiguity check.
> Possibly we can stick it inside page_straddles_boundary() (perhaps with a
> name change) and fix every caller.

Patch below does exactly that, and seems to fix the problem for me.  I
can no longer provoke a BUG() simply by booting with swiotlb=off, and
other boxes which were showing the SWIOTLB-full regression are fixed.

Also, isn't the contig bitmap completely superfluous with this new
check?
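
If so, something like this simplified version ought to be equivalent
(just a sketch to illustrate the question, not the code in the patch
below):

int range_straddles_page_boundary(paddr_t p, size_t size)
{
        unsigned long pfn = p >> PAGE_SHIFT;
        unsigned int offset = p & ~PAGE_MASK;

        /* Entirely within one page: cannot straddle. */
        if (offset + size <= PAGE_SIZE)
                return 0;
        /* Spans pages, but the MFN walk says they are machine-
         * contiguous, so a direct mapping is still fine. */
        return !check_pages_physically_contiguous(pfn, offset, size);
}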

> Also, I agree it sounds like something fishier is going on and this is
> perhaps only a workaround that fortuitously works in this case... :-(

I _think_ the new test makes things pretty solid.  But I'm not sure
whether it leaves us open to corner cases that end up generating a
lot of swiotlb traffic once the test is in place.  The bright side is
that the only cases at risk of that are ones which are actually broken
without the change, and which could BUG() the kernel unnecessarily
when running with swiotlb=off.
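
To make that concrete: with swiotlb=off there are no bounce buffers
to fall back on, so dma_map_sg() has to map each segment directly and
can only BUG() on a straddling range.  Roughly (heavily simplified,
and the exact helper names in pci-dma-xen.c may differ):

        for (i = 0; i < nents; i++) {
                sg[i].dma_address = virt_to_bus(page_address(sg[i].page))
                        + sg[i].offset;
                sg[i].dma_length  = sg[i].length;
                /* With no swiotlb, a segment whose pages are not
                 * machine-contiguous cannot be mapped at all, so this
                 * check takes the kernel down.  Before the patch it
                 * also fired on segments that merely lacked a
                 * contiguous_bitmap entry. */
                IOMMU_BUG_ON(range_straddles_page_boundary(
                        page_to_pseudophys(sg[i].page) + sg[i].offset,
                        sg[i].length));
        }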

Tested against RHEL-5 kernels, but I've also checked that it applies and
builds on current linux-2.6.18-xen.hg too; it's the same code there.

Cheers,
 Stephen

commit 952c63d05820a0ac730a8e0a6d902df5cafd6aff
Author: Stephen Tweedie <sct@xxxxxxxxxx>
Date:   Thu Mar 13 17:49:28 2008 +0000

    xen dma: avoid unnecessary SWIOTLB bounce buffering.
    
    On Xen kernels, BIOVEC_PHYS_MERGEABLE permits merging of disk IOs that
    span multiple pages, provided that the pages are both pseudophysically-
    AND machine-contiguous ---
    
        (((bvec_to_phys((vec1)) + (vec1)->bv_len) == bvec_to_phys((vec2))) && \
         ((bvec_to_pseudophys((vec1)) + (vec1)->bv_len) == \
          bvec_to_pseudophys((vec2))))
    
    However, this best-effort merging of adjacent pages can occur in
    regions of dom0 memory which just happen, by virtue of having been
    initially set up that way, to be machine-contiguous.  Such pages
    which occur outside of a range created by
    xen_create_contiguous_region() won't be seen as contiguous by
    range_straddles_page_boundary(), so the pci-dma-xen.c code for
    dma_map_sg() will send these regions to the swiotlb for bounce
    buffering.
    
    This patch adds a new check, check_pages_physically_contiguous(),
    to the test for pages straddling page boundaries both in
    swiotlb_map_sg() and dma_map_sg(), to capture these ranges and map
    them directly via virt_to_bus() mapping rather than through the
    swiotlb.
    
    Signed-off-by: Stephen Tweedie <sct@xxxxxxxxxx>

diff --git a/arch/i386/kernel/pci-dma-xen.c b/arch/i386/kernel/pci-dma-xen.c
index cdeda5a..14f3539 100644
--- a/arch/i386/kernel/pci-dma-xen.c
+++ b/arch/i386/kernel/pci-dma-xen.c
@@ -110,6 +110,39 @@ do {                                                       \
        }                                               \
 } while (0)
 
+static int check_pages_physically_contiguous(unsigned long pfn, 
+                                            unsigned int offset,
+                                            size_t length)
+{
+       unsigned long next_mfn;
+       int i;
+       int nr_pages;
+       
+       next_mfn = pfn_to_mfn(pfn);
+       nr_pages = (offset + length + PAGE_SIZE-1) >> PAGE_SHIFT;
+       
+       for (i = 1; i < nr_pages; i++) {
+               if (pfn_to_mfn(++pfn) != ++next_mfn) 
+                       return 0;
+       }
+       return 1;
+}
+
+int range_straddles_page_boundary(paddr_t p, size_t size)
+{
+       extern unsigned long *contiguous_bitmap;
+       unsigned long pfn = p >> PAGE_SHIFT;
+       unsigned int offset = p & ~PAGE_MASK;
+
+       if (offset + size <= PAGE_SIZE)
+               return 0;
+       if (test_bit(pfn, contiguous_bitmap))
+               return 0;
+       if (check_pages_physically_contiguous(pfn, offset, size))
+               return 0;
+       return 1;
+}
+
 int
 dma_map_sg(struct device *hwdev, struct scatterlist *sg, int nents,
           enum dma_data_direction direction)
diff --git a/include/asm-i386/mach-xen/asm/dma-mapping.h b/include/asm-i386/mach-xen/asm/dma-mapping.h
index 18b1a0d..fc917f2 100644
--- a/include/asm-i386/mach-xen/asm/dma-mapping.h
+++ b/include/asm-i386/mach-xen/asm/dma-mapping.h
@@ -22,13 +22,7 @@ address_needs_mapping(struct device *hwdev, dma_addr_t addr)
        return (addr & ~mask) != 0;
 }
 
-static inline int
-range_straddles_page_boundary(paddr_t p, size_t size)
-{
-       extern unsigned long *contiguous_bitmap;
-       return ((((p & ~PAGE_MASK) + size) > PAGE_SIZE) &&
-               !test_bit(p >> PAGE_SHIFT, contiguous_bitmap));
-}
+extern int range_straddles_page_boundary(paddr_t p, size_t size);
 
 #define dma_alloc_noncoherent(d, s, h, f) dma_alloc_coherent(d, s, h, f)
 #define dma_free_noncoherent(d, s, v, h) dma_free_coherent(d, s, v, h)
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel