
Re: Xen on RP4



+ xen-devel

On Fri, 23 Oct 2020, Elliott Mitchell wrote:
> On Thu, Oct 22, 2020 at 06:02:46PM -0700, Stefano Stabellini wrote:
> > On Thu, 22 Oct 2020, Elliott Mitchell wrote:
> > > On Thu, Oct 22, 2020 at 04:27:23PM -0700, Stefano Stabellini wrote:
> > > > On Wed, 21 Oct 2020, Elliott Mitchell wrote:
> > > > > Due to experimenting with "proper" console on serial port, I ended up
> > > > > getting output.  Apparently domain 0 was panicking when trying to set up
> > > > > xen-blkback due to the swiotlb code being unable to allocate a bounce
> > > > > buffer.
> > > > > 
> > > > > Stefano, what is the status of swiotlb in the 5.8 kernel series?
> > > > 
> > > > The swiotlb fixes for RPi4 are not in 5.8. Linux 5.9 has just been
> > > > released, and it should come with everything you need.
> > > 
> > > I had 13 patches applied to Debian's 5.8 kernel source.  Two of the
> > > batch I had against 5.6 had gotten into mainline.  No issues were visible
> > > during normal operation.
> > > 
> > > Problem showed up when trying to start a domain.  By using Xen's console
> > > device I managed to get the messages:
> > > 
> > > xen-blkback: backend/vbd/3/51712: using 2 queues, protocol 1 (arm-abi) persistent grants
> > > Kernel panic - not syncing: Can not allocate SWIOTLB buffer earlier and can't now provide you with the DMA bounce buffer
> > > 
> > > Worth noting that by the time when I was starting this domain, the device
> > > had an uptime of more than an hour.  There could be a problem of swiotlb
> > > needing the ability to claim DMA-viable pages after they've been in use
> > > for other purposes.
> >  
> > I'll have a look
> 
> Finally came up with one detail of a change I'd made in the right time
> frame to trigger this issue.  As such I can now control this behavior and
> get it to occur or not.
> 
> I have some suspicion my planned workload approach differs from others.
> 
> During the runs where I was able to successfully boot a child domain,
> domain 0 had been allocated 512MB of memory.  During the unsuccessful run
> where the above message occurred, domain 0 had been allocated 2GB of
> memory.  This appears to reliably control the occurrence of this bug.

This is what is going on. kernel/dma/swiotlb.c:swiotlb_init gets called
and tries to allocate a buffer for the swiotlb. It does so by calling

  memblock_alloc_low(PAGE_ALIGN(bytes), PAGE_SIZE);

In your case, the allocation must be failing, so no_iotlb_memory is set, and I
expect you to see this warning among the boot messages:

  Cannot allocate buffer

Later during initialization, swiotlb-xen comes in
(drivers/xen/swiotlb-xen.c:xen_swiotlb_init) and, given that io_tlb_start
is != 0, it thinks the memory is ready to use when actually it is not.

When the swiotlb is actually needed, swiotlb_tbl_map_single gets called
and since no_iotlb_memory is set the kernel panics.
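
The sequence above can be sketched as a toy state machine. This is only an
illustrative model, not the kernel code: the struct, the model_* function
names and the addresses are hypothetical, and how io_tlb_start ends up
nonzero is not modelled, only the state the description above starts from.

```c
#include <stdbool.h>

/* Toy model of the two globals discussed above; NOT the kernel's code. */
struct swiotlb_model {
    unsigned long io_tlb_start;  /* nonzero: a buffer address is recorded */
    bool no_iotlb_memory;        /* set when the early allocation failed */
};

enum map_result { MAP_OK, MAP_WOULD_PANIC };

/* State left behind by a failed swiotlb_init(), as described above:
 * io_tlb_start is still nonzero and no_iotlb_memory is set.
 * leftover_start is a made-up address. */
static void model_swiotlb_init_failed(struct swiotlb_model *s,
                                      unsigned long leftover_start)
{
    s->io_tlb_start = leftover_start;
    s->no_iotlb_memory = true;
}

/* Models the check in xen_swiotlb_init(): a nonzero io_tlb_start is taken
 * to mean "already allocated", so nothing is allocated here.
 * Returns true only if it set up its own buffer. */
static bool model_xen_swiotlb_init(struct swiotlb_model *s,
                                   unsigned long own_buffer)
{
    if (s->io_tlb_start != 0)
        return false;
    s->io_tlb_start = own_buffer;
    return true;
}

/* Models the check at the top of swiotlb_tbl_map_single(). */
static enum map_result model_map_single(const struct swiotlb_model *s)
{
    return s->no_iotlb_memory ? MAP_WOULD_PANIC : MAP_OK;
}

/* The failing sequence: early init fails, swiotlb-xen skips its own
 * allocation, and the first mapping request hits the panic path. */
static enum map_result model_failed_early_init_boot(void)
{
    struct swiotlb_model s = { 0, false };

    model_swiotlb_init_failed(&s, 0x1000UL);
    model_xen_swiotlb_init(&s, 0x2000UL);
    return model_map_single(&s);
}
```

In this model, model_failed_early_init_boot() ends on the panic path, while
a state in which the early init never ran lets model_xen_swiotlb_init()
set up its own buffer and the mapping goes through.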


You are only seeing it with a 2G dom0 because swiotlb_init is only
called when:

  max_pfn > PFN_DOWN(arm64_dma_phys_limit ? : arm64_dma32_phys_limit)

See arch/arm64/mm/init.c:mem_init. So when dom0 is 512MB, swiotlb_init is
not called at all. swiotlb-xen does the allocation itself with
memblock_alloc and it succeeds.
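
To make the arithmetic concrete, here is a rough sketch of that comparison.
It is not the kernel's code, and it rests on assumptions that are not stated
in this thread: 4 KiB pages, dom0 RAM that is contiguous and starts at
physical address 0, and a 1 GiB DMA limit. The model_* names are
hypothetical, and the real check in mem_init may include other terms.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12  /* assumption: 4 KiB pages */

/* Stand-in for PFN_DOWN(): physical address to page frame number. */
static uint64_t model_pfn_down(uint64_t phys)
{
    return phys >> MODEL_PAGE_SHIFT;
}

/* Mirrors the shape of the condition quoted above.
 * ram_end:     end of RAM as a physical address (assumes RAM starts at 0)
 * dma_limit:   stand-in for arm64_dma_phys_limit (0 means "not set")
 * dma32_limit: stand-in for arm64_dma32_phys_limit */
static bool model_swiotlb_init_called(uint64_t ram_end, uint64_t dma_limit,
                                      uint64_t dma32_limit)
{
    uint64_t max_pfn = model_pfn_down(ram_end);
    /* "a ? : b" in the original is the GNU shorthand for "a ? a : b". */
    uint64_t limit = dma_limit ? dma_limit : dma32_limit;

    return max_pfn > model_pfn_down(limit);
}
```

Under those assumptions, a 512MB dom0 (ram_end = 512 << 20) stays below the
1 GiB limit, so the function returns false, while a 2G dom0 exceeds it and
the function returns true.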

Note that I tried to reproduce the issue on my end, but it works for me
with device tree. So the swiotlb_init memory allocation failure probably
only shows on ACPI, maybe because ACPI is reserving too much low memory.

In any case, I think the issue might be "fixed" by this patch:



diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index c19379fabd20..84e15e7d3929 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -231,6 +231,7 @@ int __init swiotlb_init_with_tbl(char *tlb, unsigned long nslabs, int verbose)
                io_tlb_orig_addr[i] = INVALID_PHYS_ADDR;
        }
        io_tlb_index = 0;
+       no_iotlb_memory = false;
 
        if (verbose)
                swiotlb_print_info();
@@ -263,8 +264,11 @@ swiotlb_init(int verbose)
                return;
 
        if (io_tlb_start)
+       {
                memblock_free_early(io_tlb_start,
                                    PAGE_ALIGN(io_tlb_nslabs << IO_TLB_SHIFT));
+               io_tlb_start = 0;
+       }
        pr_warn("Cannot allocate buffer");
        no_iotlb_memory = true;
 }
@@ -362,6 +366,7 @@ swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs)
                io_tlb_orig_addr[i] = INVALID_PHYS_ADDR;
        }
        io_tlb_index = 0;
+       no_iotlb_memory = false;
 
        swiotlb_print_info();
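
The intended effect of the patch can be sketched with a small toy model.
This is only a sketch of how the two changes appear to interact, not the
kernel code: the struct, the patched_* function names and the addresses
are hypothetical.

```c
#include <stdbool.h>

/* Toy model of the two globals the patch touches; NOT the kernel's code. */
struct patched_model {
    unsigned long io_tlb_start;
    bool no_iotlb_memory;
};

/* Models swiotlb_init_with_tbl() / swiotlb_late_init_with_tbl() with the
 * patch applied: a successful setup clears no_iotlb_memory. */
static void patched_init_with_tbl(struct patched_model *s, unsigned long tlb)
{
    s->io_tlb_start = tlb;
    s->no_iotlb_memory = false;
}

/* Models the failure path of swiotlb_init() with the patch applied:
 * io_tlb_start is reset to 0 after the buffer is freed. */
static void patched_early_init_failure(struct patched_model *s)
{
    if (s->io_tlb_start)
        s->io_tlb_start = 0;
    s->no_iotlb_memory = true;
}

/* Models xen_swiotlb_init(): it only sets up its own buffer when
 * io_tlb_start is 0. Returns true if it did so. */
static bool patched_xen_swiotlb_init(struct patched_model *s,
                                     unsigned long own_buffer)
{
    if (s->io_tlb_start != 0)
        return false;
    patched_init_with_tbl(s, own_buffer);
    return true;
}

/* Early init fails, then swiotlb-xen runs; returns true if a later
 * mapping request would NOT hit the no_iotlb_memory panic. */
static bool patched_boot_can_map(unsigned long leftover_start)
{
    struct patched_model s = { leftover_start, false };

    patched_early_init_failure(&s);
    patched_xen_swiotlb_init(&s, 0x2000UL);
    return !s.no_iotlb_memory;
}
```

In this model, the failed early init no longer leaves a stale io_tlb_start
behind, so swiotlb-xen sets up its own buffer and clears no_iotlb_memory
before any mapping is attempted.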



 

