
Re: [Xen-devel] [PATCH] xen-swiotlb: exchange memory with Xen only when pages are contiguous



Hi all,

I just discussed this patch with Boris in private. His opinions (Boris,
please correct me if I have misunderstood anything) are:

1. With or without the check, the code is incorrect; he thinks we need
   to prevent freeing a region here that was never allocated.
2. On freeing, if the upper layer has already checked that the memory is
   DMA-able, the check here does not make sense and we can remove all
   checks.
3. xen_create_contiguous_region() and xen_destroy_contiguous_region()
   need to come in pairs.

For #1 and #3, I think we need something to track the association, like a
list: on allocation, add the address to it; on freeing, check whether the
address is in the list.
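
A minimal user-space sketch of that tracking idea follows. It is only an
illustration: struct tracked_region, track_region() and untrack_region()
are hypothetical names that do not exist in the kernel, and real code would
also need locking and would more likely use the kernel's own list helpers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical model: one entry per region that was exchanged with Xen. */
struct tracked_region {
	uint64_t phys;               /* start address of the region */
	unsigned int order;          /* allocation order of the region */
	struct tracked_region *next;
};

static struct tracked_region *tracked_head;

/* Alloc path: remember a region after a successful exchange. */
static bool track_region(uint64_t phys, unsigned int order)
{
	struct tracked_region *r = malloc(sizeof(*r));

	if (!r)
		return false;
	r->phys = phys;
	r->order = order;
	r->next = tracked_head;
	tracked_head = r;
	return true;
}

/*
 * Free path: returns true (and forgets the region) only if this exact
 * region was tracked, i.e. only then should it be handed back.
 */
static bool untrack_region(uint64_t phys, unsigned int order)
{
	struct tracked_region **pp = &tracked_head;

	while (*pp) {
		struct tracked_region *r = *pp;

		if (r->phys == phys && r->order == order) {
			*pp = r->next;
			free(r);
			return true;
		}
		pp = &r->next;
	}
	return false;
}
```

With this shape, the free path would call the destroy routine only when
untrack_region() returns true, so create and destroy stay paired.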

For #2, I could not find anywhere on the dma_free_coherent() call path
that validates the address, not just in xen-swiotlb.

From my side, I think the checks make sense: they prevent exchanging
non-contiguous memory with Xen, and they also make sure Xen has enough DMA
memory both for DMA and for guest creation. I'm not sure whether we can
merge this patch to avoid exchanging non-contiguous memory with Xen?
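
To make the candidate conditions from this thread easier to compare, here
is a small user-space sketch. It is a model only: struct region_state and
the function names are hypothetical, and the straddles field stands in for
the result of range_straddles_page_boundary(phys, size).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the inputs both paths look at. */
struct region_state {
	uint64_t dev_addr;  /* bus address as seen at the time of the check */
	uint64_t size;      /* allocated size in bytes */
	bool straddles;     /* models range_straddles_page_boundary() */
};

static bool fits_mask(const struct region_state *r, uint64_t dma_mask)
{
	return r->dev_addr + r->size - 1 <= dma_mask;
}

/* Alloc path: exchange unless already DMA-able and contiguous. */
static bool alloc_needs_exchange(const struct region_state *r,
				 uint64_t dma_mask)
{
	return !(fits_mask(r, dma_mask) && !r->straddles);
}

/* Free path before the patch. */
static bool free_destroys_original(const struct region_state *r,
				   uint64_t dma_mask)
{
	return fits_mask(r, dma_mask) || r->straddles;
}

/* Free path with the patch applied. */
static bool free_destroys_patched(const struct region_state *r,
				  uint64_t dma_mask)
{
	return fits_mask(r, dma_mask) && !r->straddles;
}

/* Free path as suggested in review. */
static bool free_destroys_suggested(const struct region_state *r,
				    uint64_t dma_mask)
{
	return !fits_mask(r, dma_mask) || r->straddles;
}
```

For a region at dev_addr 0 with size 4096 and straddles == false (the
"1 byte at address 0" case raised below), alloc_needs_exchange() is false
while free_destroys_patched() is true, which is the unpaired destroy Boris
pointed out. An exchanged region, assuming a successful exchange leaves it
below the mask and contiguous, presents the same inputs at free time, so
these inputs alone cannot tell the two cases apart.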

Any input will be appreciated.

Thanks,
Joe 

On 10/25/18 9:28 AM, Joe Jin wrote:
> On 10/25/18 9:10 AM, Boris Ostrovsky wrote:
>> On 10/25/18 10:23 AM, Joe Jin wrote:
>>> On 10/25/18 4:45 AM, Boris Ostrovsky wrote:
>>>> On 10/24/18 10:43 AM, Joe Jin wrote:
>>>>> On 10/24/18 6:57 AM, Boris Ostrovsky wrote:
>>>>>> On 10/24/18 9:02 AM, Konrad Rzeszutek Wilk wrote:
>>>>>>> On Tue, Oct 23, 2018 at 08:09:04PM -0700, Joe Jin wrote:
>>>>>>>> Commit 4855c92dbb7 "xen-swiotlb: fix the check condition for
>>>>>>>> xen_swiotlb_free_coherent" only fixed memory address check condition
>>>>>>>> on xen_swiotlb_free_coherent(), when memory was not physically
>>>>>>>> contiguous and tried to exchanged with Xen via 
>>>>>>>> xen_destroy_contiguous_region it will lead kernel panic.
>>>>>>> s/it will lead/which lead to/?
>>>>>>>
>>>>>>>> The correct check condition should be memory is in DMA area and
>>>>>>>> physically contiguous.
>>>>>>> "The correct check condition to make Xen hypercall to revert the
>>>>>>> memory back from its 32-bit pool is if it is:
>>>>>>>  1) Above its DMA bit mask (for example 32-bit devices can only address
>>>>>>> up to 4GB, and we may want 4GB+2K), and
>>>>>> Is this "and' or 'or'?
>>>>>>
>>>>>>>  2) If it is not physically contiguous
>>>>>>>
>>>>>>> N.B. The logic in the code is inverted, which leads to all sorts of
>>>>>>> confusions."
>>>>>> I would, in fact, suggest to make the logic the same in both
>>>>>> xen_swiotlb_alloc_coherent() and xen_swiotlb_free_coherent() to avoid
>>>>>> this. This will involve swapping if and else in the former.
>>>>>>
>>>>>>
>>>>>>> Does that sound correct?
>>>>>>>
>>>>>>>> Thank you Boris for pointing it out.
>>>>>>>>
>>>>>>> Fixes: 4855c92dbb7 ("xen-sw..") ?
>>>>>>>
>>>>>>>> Signed-off-by: Joe Jin <joe.jin@xxxxxxxxxx>
>>>>>>>> Cc: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
>>>>>>>> Cc: Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>
>>>>>>> Reported-by: Boris Ostrovs... ?
>>>>>>>> Cc: Christoph Helwig <hch@xxxxxx>
>>>>>>>> Cc: Dongli Zhang <dongli.zhang@xxxxxxxxxx>
>>>>>>>> Cc: John Sobecki <john.sobecki@xxxxxxxxxx>
>>>>>>>> ---
>>>>>>>>  drivers/xen/swiotlb-xen.c | 4 ++--
>>>>>>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
>>>>>>>> index f5c1af4ce9ab..aed92fa019f9 100644
>>>>>>>> --- a/drivers/xen/swiotlb-xen.c
>>>>>>>> +++ b/drivers/xen/swiotlb-xen.c
>>>>>>>> @@ -357,8 +357,8 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr,
>>>>>>>>        /* Convert the size to actually allocated. */
>>>>>>>>        size = 1UL << (order + XEN_PAGE_SHIFT);
>>>>>>>>  
>>>>>>>> -      if (((dev_addr + size - 1 <= dma_mask)) ||
>>>>>>>> -          range_straddles_page_boundary(phys, size))
>>>>>>>> +      if ((dev_addr + size - 1 <= dma_mask) &&
>>>>>>>> +          !range_straddles_page_boundary(phys, size))
>>>>>>>>                xen_destroy_contiguous_region(phys, order);
>>>>>> I don't think this is right.
>>>>>>
>>>>>> if ((dev_addr + size - 1 > dma_mask) || range_straddles_page_boundary(phys, size))
>>>>>>
>>>>>> No?
>>>>> No this is not correct.
>>>>>
>>>>> When allocating memory, it first tries to allocate from Dom0/guest, then
>>>>> checks whether the physical address is DMA-able and contiguous; if not, it
>>>>> exchanges the memory with the hypervisor. The code is as below:
>>>>>
>>>>>         phys = *dma_handle;
>>>>>         dev_addr = xen_phys_to_bus(phys);
>>>>>         if (((dev_addr + size - 1 <= dma_mask)) &&
>>>>>             !range_straddles_page_boundary(phys, size))
>>>>>                 *dma_handle = dev_addr;
>>>>>         else {
>>>>>                 if (xen_create_contiguous_region(phys, order,
>>>>>                                                  fls64(dma_mask), dma_handle) != 0) {
>>>>>                         xen_free_coherent_pages(hwdev, size, ret, (dma_addr_t)phys, attrs);
>>>>>                         return NULL;
>>>>>                 }
>>>>>         }
>>>>>
>>>>> On freeing, we need to return the memory to Xen, otherwise DMA memory will
>>>>> be used up (this is the issue the patch intends to fix). So when the memory
>>>>> is DMA-able and contiguous, call xen_destroy_contiguous_region() to return
>>>>> the DMA memory to Xen.
>>>> So if you want to allocate 1 byte at address 0 (and dev_addr=phys),
>>>> xen_create_contiguous_region() will not be called. And yet you will call
>>>> xen_destroy_contiguous_region() in the free path.
>>>>
>>>> Is this the expected behavior?
>>> I could not say it's expected behavior, but I think it's reasonable.
>>
>> I would expect xen_create_contiguous_region() and
>> xen_destroy_contiguous_region() to come in pairs. If a region is
>> created, it needs to be destroyed. And vice versa.
>>
>>
>>>
>>> On allocation, __get_free_pages() is used to allocate memory; if we are
>>> lucky and the memory is DMA-able, nothing is exchanged with the hypervisor,
>>> but obviously this is not guaranteed.
>>>
>>> And on freeing, it cannot be determined whether the memory came from
>>> Dom0/guest's own memory or from the hypervisor
>>
>>
>> I think it can be. if (!(dev_addr + size - 1 <= dma_mask) ||
>> range_straddles_page_boundary()) then it must have come from the
>> hypervisor, because that's the check we make in
>> xen_swiotlb_alloc_coherent().
> 
> This is not true.
> 
> dev_addr comes from dma_handle, and *dma_handle is changed after
> xen_create_contiguous_region() is called:
> 
> int xen_create_contiguous_region(phys_addr_t pstart, unsigned int order,
>                                  unsigned int address_bits,
>                                  dma_addr_t *dma_handle)
> {
> ......
>         success = xen_exchange_memory(1UL << order, 0, in_frames,
>                                       1, order, &out_frame,
>                                       address_bits);
>
>         /* 3. Map the new extent in place of old pages. */
>         if (success)
>                 xen_remap_exchanged_ptes(vstart, order, NULL, out_frame);
>         else
>                 xen_remap_exchanged_ptes(vstart, order, in_frames, 0);
>
>         spin_unlock_irqrestore(&xen_reservation_lock, flags);
>
>         *dma_handle = virt_to_machine(vstart).maddr;
>         return success ? 0 : -ENOMEM;
> }
> 
> 
> This means the dev_addr checked in xen_swiotlb_alloc_coherent() is not the
> same one checked in xen_swiotlb_free_coherent().
> 
> Thanks,
> Joe
> 
> 
>>
>>
>> -boris
>>
>>
>>> , and if we don't give the memory back to the hypervisor, the hypervisor's
>>> DMA memory will be used up; then DMA requests in Dom0/guest may fail, and
>>> worse, no new guest can be started.
>>>
>>> Thanks,
>>> Joe
>>>
>>>> -boris
>>>>
>>


-- 
Oracle <http://www.oracle.com>
Joe Jin | Software Development Director 
ORACLE | Linux and Virtualization
500 Oracle Parkway Redwood City, CA US 94065
