
Re: xen-swiotlb issue when NVMe driver is enabled in Dom0 on ARM


  • To: Stefano Stabellini <sstabellini@xxxxxxxxxx>
  • From: Rahul Singh <Rahul.Singh@xxxxxxx>
  • Date: Thu, 21 Apr 2022 17:45:32 +0000
  • Cc: Christoph Hellwig <hch@xxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Bertrand Marquis <Bertrand.Marquis@xxxxxxx>, Julien Grall <julien@xxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, "jgross@xxxxxxxx" <jgross@xxxxxxxx>, "boris.ostrovsky@xxxxxxxxxx" <boris.ostrovsky@xxxxxxxxxx>
  • Delivery-date: Thu, 21 Apr 2022 17:46:04 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-topic: xen-swiotlb issue when NVMe driver is enabled in Dom0 on ARM

Hi Stefano,

> On 20 Apr 2022, at 11:46 pm, Stefano Stabellini <sstabellini@xxxxxxxxxx> wrote:
> 
> On Wed, 20 Apr 2022, Rahul Singh wrote:
>>> On 20 Apr 2022, at 3:36 am, Stefano Stabellini <sstabellini@xxxxxxxxxx> wrote:
>>>>> Then there is xen_swiotlb_init() which allocates some memory for
>>>>> swiotlb-xen at boot. It could lower the total amount of memory
>>>>> available, but if you disabled swiotlb-xen like I suggested,
>>>>> xen_swiotlb_init() still should get called and executed anyway at boot
>>>>> (it is called from arch/arm/xen/mm.c:xen_mm_init). So xen_swiotlb_init()
>>>>> shouldn't be the one causing problems.
>>>>> 
>>>>> That's it -- there is nothing else in swiotlb-xen that I can think of.
>>>>> 
>>>>> I don't have any good ideas, so I would only suggest to add more printks
>>>>> and report the results, for instance:
>>>> 
>>>> As suggested, I added more printks, but the only difference I see is the
>>>> size; apart from that, everything looks the same.
>>>> 
>>>> Please find the attached logs for xen and native linux boot.
>>> 
>>> One difference is that the order of the allocations is significantly
>>> different after the first 3 allocations. It is very unlikely but
>>> possible that this is an unrelated concurrency bug that only occurs on
>>> Xen. I doubt it.
>> 
>> I am not sure, but just to confirm with you: I see the logs below in every
>> scenario. The SWIOTLB memory is allocated by the Linux swiotlb and used by
>> xen-swiotlb. Is that okay, or can it cause an issue?
>> 
>> [    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
>> [    0.000000] software IO TLB: mapped [mem 0x00000000f4000000-0x00000000f8000000] (64MB)
>> 
>> Snippet from int __ref xen_swiotlb_init(int verbose, bool early):
>>
>>     /*
>>      * IO TLB memory already allocated. Just use it.
>>      */
>>     if (io_tlb_start != 0) {
>>         xen_io_tlb_start = phys_to_virt(io_tlb_start);
>>         goto end;
>>     }
> 
> Unfortunately there is nothing obvious in the logs. I think we need to
> look in detail at the executions of Linux on Xen with swiotlb-xen and
> Linux on Xen without swiotlb-xen. The comparison with Linux on native is
> not very interesting because the memory layout is a bit different.
> 
> The comparison between the two executions should be simple because
> swiotlb-xen should be transparent: in this simple case swiotlb-xen
> should end up calling always the same functions that would end up being
> called anyway without swiotlb-xen. Basically, it should only add a
> couple of extra steps in between, nothing else.
> 
> As we have already discussed:
> 
> - [no swiotlb-xen] dma_alloc_attrs --> dma_direct_alloc
> - [swiotlb-xen] dma_alloc_attrs --> xen_swiotlb_alloc_coherent --> dma_direct_alloc
> 
> The result should be identical. In xen_swiotlb_alloc_coherent the code
> path taken should be:
> 
> - xen_alloc_coherent_pages
> - if (((dev_addr + size - 1 <= dma_mask)) &&
>      !range_straddles_page_boundary(phys, size)) {
>      *dma_handle = dev_addr;
> - return ret
> 
> So basically, it should make zero difference. That is expected because
> swiotlb-xen really only comes into play for domU pages. For booting
> dom0, it should only be a "useless" indirection.
> 
> In the case of xen_swiotlb_map_page, it should be similar. The path
> taken should be:
> 
>       if (dma_capable(dev, dev_addr, size, true) &&
>           !range_straddles_page_boundary(phys, size) &&
>               !xen_arch_need_swiotlb(dev, phys, dev_addr) &&
>               swiotlb_force != SWIOTLB_FORCE)
>               goto done;
> 
> which I think should correspond to this print in your logs at line 400:
> 
>    DEBUG xen_swiotlb_map_page 400 phys=80003c4f000 dev_addr=80003c4f000
> 
> So that should be OK too. If different paths are taken, then we have a
> problem. If the paths above are taken there should be zero difference
> between the swiotlb-xen and the non-swiotlb-xen cases.
> 
> Which brings me to your question about xen_swiotlb_init and this
> message:
> 
>    software IO TLB: mapped [mem 0x00000000f4000000-0x00000000f8000000] (64MB)
> 
> The swiotlb-xen buffer should *not* be used if the code paths taken are
> the ones above. So it doesn't matter if it is allocated or not. You
> could comment out the code in xen_swiotlb_init and everything should
> still behave the same.
> 
> Finally, my suggestion. Considering all the above, I would look *very*
> closely at the execution of Linux on Xen with and without swiotlb-xen.
> The differences should be really minimal. Add prints to all the
> swiotlb-xen functions, but really only the following should matter:
> - xen_swiotlb_alloc_coherent
> - xen_swiotlb_map_page
> - xen_swiotlb_unmap_page
> 
> What are the differences between the two executions? From the logs:
> 
> - the allocation of the swiotlb-xen buffer, which leads to 64MB less
>  memory available, but actually if you compared Linux on Xen
>  with/without swiotlb-xen this difference would go away because
>  xen_swiotlb_init would be called in both cases anyway
> 
> - the size upgrade in xen_swiotlb_alloc_coherent: I can see several
>  instances of the allocation size being increased. Is that causing the
>  problem? It seems unlikely and you have already verified it is not the
>  case by removing the size increase in xen_swiotlb_alloc_coherent
> 
> - What else is different? There *must* be something, but it is not
>  showing in the logs so far.
> 
> 
> The only other observation that I have, but it doesn't help, is that the
> failure happens on the second 4MB allocation when there is another
> concurrent memory allocation of 4K. Neither the 4MB nor the 4K are
> size-upgrades by xen_swiotlb_alloc_coherent.
> 
> 4MB is a larger-than-usual size, but it shouldn't make that much of a
> difference. Is the problem that the 4MB have to be contiguous? I don't
> see how swiotlb-xen could have an impact in that regard, if not for the
> size increase in xen_swiotlb_alloc_coherent.
> 
> Please let me know what you find.
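
For reference while reading the logs, the fast-path condition quoted above
(dev_addr + size - 1 <= dma_mask and no straddling) can be modelled in user
space roughly as follows. This is only a sketch under assumptions, not the
kernel code: every sketch_ name is hypothetical, 4K pages are assumed, and the
pfn-to-bus-frame lookup is replaced by an identity function to mimic a 1:1
mapped dom0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12                        /* assumption: 4K pages */
#define SKETCH_PAGE_SIZE  (1ULL << SKETCH_PAGE_SHIFT)

/* Hypothetical stand-in for the pfn-to-bus-frame lookup.
 * Identity models a 1:1 mapped dom0; this is an assumption. */
static uint64_t sketch_pfn_to_bfn(uint64_t pfn)
{
    return pfn;
}

/* True if the bus frames backing [phys, phys + size) are not contiguous. */
static bool sketch_range_straddles(uint64_t phys, size_t size)
{
    uint64_t pfn = phys >> SKETCH_PAGE_SHIFT;
    uint64_t offset = phys & (SKETCH_PAGE_SIZE - 1);
    uint64_t nr_pages =
        (offset + size + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
    uint64_t next_bfn = sketch_pfn_to_bfn(pfn);

    for (uint64_t i = 1; i < nr_pages; i++)
        if (sketch_pfn_to_bfn(++pfn) != ++next_bfn)
            return true;
    return false;
}

/* The "no bounce buffer needed" condition from the quoted code path. */
static bool sketch_fast_path(uint64_t dev_addr, uint64_t phys,
                             size_t size, uint64_t dma_mask)
{
    assert(size > 0);
    return dev_addr + size - 1 <= dma_mask &&
           !sketch_range_straddles(phys, size);
}
```

With a 64-bit mask, the address from the DEBUG line above (0x80003c4f000)
passes this check for both a 4K and a 4MB request; with a 32-bit mask it would
not, since the address lies above 4GB.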

I debugged the issue further today and found that the only difference between
the dma_alloc_attrs() call from the NVMe driver [1] and the calls from other
drivers is the attribute DMA_ATTR_NO_KERNEL_MAPPING.

If I clear DMA_ATTR_NO_KERNEL_MAPPING before calling
xen_alloc_coherent_pages() (see the diff below), the NVMe DMA allocation
succeeds and the issue is no longer observed.

Do you have any idea why DMA_ATTR_NO_KERNEL_MAPPING is causing the issue with
xen-swiotlb?

diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 2b385c1b4a99..3c18395dd567 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -292,6 +292,8 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size,
        */
        flags &= ~(__GFP_DMA | __GFP_HIGHMEM);
 
+       attrs &= ~(DMA_ATTR_NO_KERNEL_MAPPING);
+
        /* Convert the size to actually allocated. */
        size = 1UL << (order + XEN_PAGE_SHIFT);
 
@@ -359,6 +361,8 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr,
            TestClearPageXenRemapped(page))
                xen_destroy_contiguous_region(phys, order);
 
+       attrs &= ~(DMA_ATTR_NO_KERNEL_MAPPING);
+
        xen_free_coherent_pages(hwdev, size, vaddr, phys_to_dma(hwdev, phys),
                                attrs);
 }
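
For anyone reading along, the effect of the two added lines can be sketched in
user space as below. The SKETCH_ names and the bit positions are assumptions
made for illustration, not copied from the kernel headers. The point is only
that the mask clears one attribute bit and leaves every other attribute the
caller passed untouched.

```c
#include <assert.h>

/* Assumed, illustrative bit positions; check include/linux/dma-mapping.h
 * in your tree for the real definitions. */
#define SKETCH_DMA_ATTR_NO_KERNEL_MAPPING (1UL << 4)
#define SKETCH_DMA_ATTR_NO_WARN           (1UL << 8)

/* Mirrors "attrs &= ~(DMA_ATTR_NO_KERNEL_MAPPING);" from the diff:
 * drop the no-kernel-mapping request, keep all other attribute bits. */
static unsigned long sketch_strip_no_kernel_mapping(unsigned long attrs)
{
    unsigned long out = attrs & ~SKETCH_DMA_ATTR_NO_KERNEL_MAPPING;

    /* Post-condition: the bit is gone regardless of the input. */
    assert((out & SKETCH_DMA_ATTR_NO_KERNEL_MAPPING) == 0);
    return out;
}
```

For example, passing SKETCH_DMA_ATTR_NO_KERNEL_MAPPING | SKETCH_DMA_ATTR_NO_WARN
returns just SKETCH_DMA_ATTR_NO_WARN, so after the mask the allocation is
requested like an ordinary coherent allocation with a kernel mapping.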

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/nvme/host/pci.c#n2053

Regards,
Rahul


 

