
Re: xen-swiotlb issue when NVMe driver is enabled in Dom0 on ARM



On Sun, 17 Apr 2022, Rahul Singh wrote:
> > On 15 Apr 2022, at 6:40 pm, Stefano Stabellini <sstabellini@xxxxxxxxxx> wrote:
> > On Fri, 15 Apr 2022, Christoph Hellwig wrote:
> >> On Thu, Apr 14, 2022 at 01:39:23PM -0700, Stefano Stabellini wrote:
> >>> OK, now we know that the code path with Xen is correct and it is the
> >>> same code path taken (dma_alloc_direct) as when !CONFIG_XEN and !SMMU.
> >>> That is how it should be.
> >>> 
> >>> I cannot explain why dma_alloc_direct() would fail when called from
> >>> xen_swiotlb_alloc_coherent(), but it would succeed when called from
> >>> dma_alloc_attrs() without Xen.
> >>> 
> >>> I am not aware of any restrictions that xen or swiotlb-xen would
> >>> introduce in that regard. Unless you are just running out of memory
> >>> because dom0_mem too low.
> >> 
> >> The crash is deep down in the page allocator.  Even if memory was low
> >> it should not crash.  So there is some odd interaction between Xen
> >> and the page allocator going on.  I think nvme and dma-direct really
> >> are only the messenger here.
> > 
> > 
> > I cannot think of anything but if that is the case I guess it is more
> > likely related to reserved-memory not properly advertised or ACPI tables
> > not properly populated.
> 
> I am not sure if it is true as we are able to boot with the same reserved
> memory or the same ACPI table populated if we boot without swiotlb-xen dma
> ops.
> 
> > 
> > 
> > Rahul,
> > 
> > What happens if you boot Linux on Xen with swiotlb-xen disabled?
> 
> Linux boots fine without any issue if we disable swiotlb-xen as mentioned 
> below.

The plot thickens.

Without swiotlb-xen, Linux boots fine. With swiotlb-xen it crashes.
However, in both cases, the very same memory allocation function is
used: dma_direct_alloc. In one case it works, in the other case it
crashes.  Everything else is the same.

There are a couple of questionable things with dma masks in
xen_swiotlb_alloc_coherent, but they are *after* the call to
xen_alloc_coherent_pages, which is the one that crashes. So they cannot
be the cause of the crash.

Before the call to xen_alloc_coherent_pages, there is only:

  1) flags &= ~(__GFP_DMA | __GFP_HIGHMEM);
  2) size = 1UL << (order + XEN_PAGE_SHIFT);


1) is already done by dma_alloc_attrs, so it is superfluous. I cannot
explain how 2) could possibly trigger the crash.  XEN_PAGE_SHIFT is
always 12, even on kernels with 64K pages. You can try removing 2) from
xen_swiotlb_alloc_coherent, but we are really wandering in the dark
here.
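
To make the two steps above concrete, here is a minimal user-space sketch of
what they compute. It is not the kernel code: the flag bit values are made up
for illustration, sketch_get_order() is a simplified stand-in for a
get_order()-style computation of "order" from the requested size, and it
assumes 4K kernel pages (page shift 12, same as XEN_PAGE_SHIFT).

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative bit values only; the real __GFP_* values live in the kernel. */
#define SKETCH_GFP_DMA      0x01u
#define SKETCH_GFP_HIGHMEM  0x02u

#define SKETCH_PAGE_SHIFT      12  /* assumption: 4K kernel pages */
#define SKETCH_XEN_PAGE_SHIFT  12  /* per the thread: always 12 */

/* Step 1: clear the zone specifiers. Applying it twice changes nothing. */
static unsigned int sketch_clear_zone_flags(unsigned int flags)
{
    return flags & ~(SKETCH_GFP_DMA | SKETCH_GFP_HIGHMEM);
}

/*
 * Simplified stand-in for get_order(): the smallest order such that
 * (1 << order) pages cover "size" bytes. Requires size > 0.
 */
static int sketch_get_order(size_t size)
{
    size_t pages;
    int order = 0;

    assert(size > 0);
    pages = (size - 1) >> SKETCH_PAGE_SHIFT;
    while (pages) {
        order++;
        pages >>= 1;
    }
    return order;
}

/* Step 2: round the request up to the size that is actually allocated. */
static size_t sketch_alloc_size(size_t size)
{
    return (size_t)1 << (sketch_get_order(size) + SKETCH_XEN_PAGE_SHIFT);
}
```

Under these assumptions a 4097-byte request becomes 8192 bytes, and clearing
the flags a second time is a no-op, which is consistent with neither step
being an obvious culprit.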

Then there is xen_swiotlb_init() which allocates some memory for
swiotlb-xen at boot. It could lower the total amount of memory
available, but if you disabled swiotlb-xen like I suggested,
xen_swiotlb_init() still should get called and executed anyway at boot
(it is called from arch/arm/xen/mm.c:xen_mm_init). So xen_swiotlb_init()
shouldn't be the one causing problems.

That's it -- there is nothing else in swiotlb-xen that I can think of.

I don't have any good ideas, so I can only suggest adding more printks
and reporting the results, for instance:


diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 2b385c1b4a99..c81f9dc7e5a0 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -284,6 +284,7 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size,
        phys_addr_t phys;
        dma_addr_t dev_addr;
 
+       printk("DEBUG %s %d size=%lu flags=%x attr=%lx\n",__func__,__LINE__,size,flags,attrs);
        /*
        * Ignore region specifiers - the kernel's ideas of
        * pseudo-phys memory layout has nothing to do with the
@@ -295,6 +296,8 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size,
        /* Convert the size to actually allocated. */
        size = 1UL << (order + XEN_PAGE_SHIFT);
 
+       printk("DEBUG %s %d size=%lu flags=%x attr=%lx\n",__func__,__LINE__,size,flags,attrs);
+
        /* On ARM this function returns an ioremap'ped virtual address for
         * which virt_to_phys doesn't return the corresponding physical
         * address. In fact on ARM virt_to_phys only works for kernel direct
diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index 51bb8fa8eb89..549b2c85999c 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -429,9 +429,11 @@ void *dma_alloc_attrs(struct device *dev, size_t size, dma_addr_t *dma_handle,
        if (dma_alloc_from_dev_coherent(dev, size, dma_handle, &cpu_addr))
                return cpu_addr;
 
+       printk("DEBUG %s %d size=%lu flags=%x attr=%lx\n",__func__,__LINE__,size,flag,attrs);
        /* let the implementation decide on the zone to allocate from: */
        flag &= ~(__GFP_DMA | __GFP_DMA32 | __GFP_HIGHMEM);
 
+       printk("DEBUG %s %d size=%lu flags=%x attr=%lx\n",__func__,__LINE__,size,flag,attrs);
        if (dma_alloc_direct(dev, ops))
                cpu_addr = dma_direct_alloc(dev, size, dma_handle, flag, attrs);
        else if (ops->alloc)



 

