[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [XEN PATCH v1] xen/arm : Add support for SMMUv3 driver


  • To: Julien Grall <julien@xxxxxxx>
  • From: Bertrand Marquis <Bertrand.Marquis@xxxxxxx>
  • Date: Thu, 29 Oct 2020 14:22:44 +0000
  • Accept-language: en-GB, en-US
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass header.d=arm.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=E61SZG2vUV+Pu0oXHvXKigcaIzgjPeXzQD8dTqxzB9I=; b=dlGueqsXYNLNRMHcy/FXVw5qJH0pedYsIy0FVimW3lbbzk8NP8fYxqmpvJgymP14oy5D13YbSiF/YMjn6SFATI+KLfWIKgK3ANDqn+QaLGoaWlxmNdicuTn/TIbQtUrATzWPl8KQQkOgFyWO4iWxnM5JnVy6qPo94ex8G99wsd/rknCnqjCulwmF+5uSQ2dTJ+i+E0VCfTKAQaWul/ZQPmHIeMYeDGPQlLWuhJnLNppiqwo+Y7eLayZOZKiQX7poRJQ+mT2hqDlHJgBXj0Sso9uFzxqB2KwJ5YfOigD67bWnSEs1VRJ06rw4ui2OmwV7S84/V5Ety4Y4Tj6xDQkAng==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=BNJTbMm51qS3xPLWKGb6K0+1L6o4n2ZRgdAFjferw+Hq4A34Fh6dFB4cMo6rFkmqfXCPwr8zxHNwoLSQeoK5h2f9+R7n1qQrIXBT4iJNpnC7uIpdRcNutfCvAi9o+b8MehJ1cnPBbsaINUI6h4LJdP8e4VkztYcDwAO1uG5aS7q9wTKtVyFpJNrvmdIdn7HBt3K2ykY0sY90SA+PtgKi9a6Ixx8nYw3nqPOa2TRRWjjscBPfFJhnNu2zHSzMrW2gjrD7LZOCoKNOrdvPS64kQwO/00WQOrB0zfpiQzkmKk62mr6smnFaTYqxHXbg4aindbhMI6ZYw5JH3ed+85Wrug==
  • Authentication-results-original: xen.org; dkim=none (message not signed) header.d=none;xen.org; dmarc=none action=none header.from=arm.com;
  • Cc: Rahul Singh <Rahul.Singh@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Paul Durrant <paul@xxxxxxx>, Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>
  • Delivery-date: Thu, 29 Oct 2020 14:23:33 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Nodisclaimer: true
  • Original-authentication-results: xen.org; dkim=none (message not signed) header.d=none;xen.org; dmarc=none action=none header.from=arm.com;
  • Thread-index: AQHWoVvyGh0jWzCHpk6w/U7XN+08x6mgw0AAgAEzsQCAAWItgIABA+GAgADBXQCAABfAAIAAGI8AgAAORwCABG97AIADrU2AgAFBfQA=
  • Thread-topic: [XEN PATCH v1] xen/arm : Add support for SMMUv3 driver

Hi Julien,

> On 28 Oct 2020, at 19:12, Julien Grall <julien@xxxxxxx> wrote:
> 
> 
> 
> On 26/10/2020 11:03, Rahul Singh wrote:
>> Hello Julien,
> 
> Hi Rahul,
> 
>>> On 23 Oct 2020, at 4:19 pm, Julien Grall <julien@xxxxxxx> wrote:
>>> 
>>> 
>>> 
>>> On 23/10/2020 15:27, Rahul Singh wrote:
>>>> Hello Julien,
>>>>> On 23 Oct 2020, at 2:00 pm, Julien Grall <julien@xxxxxxx> wrote:
>>>>> 
>>>>> 
>>>>> 
>>>>> On 23/10/2020 12:35, Rahul Singh wrote:
>>>>>> Hello,
>>>>>>> On 23 Oct 2020, at 1:02 am, Stefano Stabellini <sstabellini@xxxxxxxxxx> 
>>>>>>> wrote:
>>>>>>> 
>>>>>>> On Thu, 22 Oct 2020, Julien Grall wrote:
>>>>>>>>>> On 20/10/2020 16:25, Rahul Singh wrote:
>>>>>>>>>>> Add support for ARM architected SMMUv3 implementations. It is based 
>>>>>>>>>>> on
>>>>>>>>>>> the Linux SMMUv3 driver.
>>>>>>>>>>> Major differences between the Linux driver are as follows:
>>>>>>>>>>> 1. Only Stage-2 translation is supported as compared to the Linux 
>>>>>>>>>>> driver
>>>>>>>>>>>    that supports both Stage-1 and Stage-2 translations.
>>>>>>>>>>> 2. Use P2M  page table instead of creating one as SMMUv3 has the
>>>>>>>>>>>    capability to share the page tables with the CPU.
>>>>>>>>>>> 3. Tasklets is used in place of threaded IRQ's in Linux for event 
>>>>>>>>>>> queue
>>>>>>>>>>>    and priority queue IRQ handling.
>>>>>>>>>> 
>>>>>>>>>> Tasklets are not a replacement for threaded IRQ. In particular, they 
>>>>>>>>>> will
>>>>>>>>>> have priority over anything else (IOW nothing will run on the pCPU 
>>>>>>>>>> until
>>>>>>>>>> they are done).
>>>>>>>>>> 
>>>>>>>>>> Do you know why Linux is using thread. Is it because of long running
>>>>>>>>>> operations?
>>>>>>>>> 
>>>>>>>>> Yes you are right because of long running operations Linux is using 
>>>>>>>>> the
>>>>>>>>> threaded IRQs.
>>>>>>>>> 
>>>>>>>>> SMMUv3 reports fault/events bases on memory-based circular buffer 
>>>>>>>>> queues not
>>>>>>>>> based on the register. As per my understanding, it is time-consuming 
>>>>>>>>> to
>>>>>>>>> process the memory based queues in interrupt context because of that 
>>>>>>>>> Linux
>>>>>>>>> is using threaded IRQ to process the faults/events from SMMU.
>>>>>>>>> 
>>>>>>>>> I didn’t find any other solution in XEN in place of tasklet to defer 
>>>>>>>>> the
>>>>>>>>> work, that’s why I used tasklet in XEN in replacement of threaded 
>>>>>>>>> IRQs. If
>>>>>>>>> we do all work in interrupt context we will make XEN less responsive.
>>>>>>>> 
>>>>>>>> So we need to make sure that Xen continue to receives interrupts, but 
>>>>>>>> we also
>>>>>>>> need to make sure that a vCPU bound to the pCPU is also responsive.
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> If you know another solution in XEN that will be used to defer the 
>>>>>>>>> work in
>>>>>>>>> the interrupt please let me know I will try to use that.
>>>>>>>> 
>>>>>>>> One of my work colleague encountered a similar problem recently. He 
>>>>>>>> had a long
>>>>>>>> running tasklet and wanted to be broken down in smaller chunk.
>>>>>>>> 
>>>>>>>> We decided to use a timer to reschedule the taslket in the future. 
>>>>>>>> This allows
>>>>>>>> the scheduler to run other loads (e.g. vCPU) for some time.
>>>>>>>> 
>>>>>>>> This is pretty hackish but I couldn't find a better solution as 
>>>>>>>> tasklet have
>>>>>>>> high priority.
>>>>>>>> 
>>>>>>>> Maybe the other will have a better idea.
>>>>>>> 
>>>>>>> Julien's suggestion is a good one.
>>>>>>> 
>>>>>>> But I think tasklets can be configured to be called from the idle_loop,
>>>>>>> in which case they are not run in interrupt context?
>>>>>>> 
>>>>>>  Yes you are right tasklet will be scheduled from the idle_loop that is 
>>>>>> not interrupt conext.
>>>>> 
>>>>> This depends on your tasklet. Some will run from the softirq context 
>>>>> which is usually (for Arm) on the return of an exception.
>>>>> 
>>>> Thanks for the info. I will check and will get better understanding of the 
>>>> tasklet how it will run in XEN.
>>>>>>> 
>>>>>>>>>>> 4. Latest version of the Linux SMMUv3 code implements the commands 
>>>>>>>>>>> queue
>>>>>>>>>>>    access functions based on atomic operations implemented in Linux.
>>>>>>>>>> 
>>>>>>>>>> Can you provide more details?
>>>>>>>>> 
>>>>>>>>> I tried to port the latest version of the SMMUv3 code than I observed 
>>>>>>>>> that
>>>>>>>>> in order to port that code I have to also port atomic operation 
>>>>>>>>> implemented
>>>>>>>>> in Linux to XEN. As latest Linux code uses atomic operation to 
>>>>>>>>> process the
>>>>>>>>> command queues 
>>>>>>>>> (atomic_cond_read_relaxed(),atomic_long_cond_read_relaxed() ,
>>>>>>>>> atomic_fetch_andnot_relaxed()) .
>>>>>>>> 
>>>>>>>> Thank you for the explanation. I think it would be best to import the 
>>>>>>>> atomic
>>>>>>>> helpers and use the latest code.
>>>>>>>> 
>>>>>>>> This will ensure that we don't re-introduce bugs and also buy us some 
>>>>>>>> time
>>>>>>>> before the Linux and Xen driver diverge again too much.
>>>>>>>> 
>>>>>>>> Stefano, what do you think?
>>>>>>> 
>>>>>>> I think you are right.
>>>>>> Yes, I agree with you to have XEN code in sync with Linux code that's 
>>>>>> why I started with to port the Linux atomic operations to XEN  then I 
>>>>>> realised that it is not straightforward to port atomic operations and it 
>>>>>> requires lots of effort and testing. Therefore I decided to port the 
>>>>>> code before the atomic operation is introduced in Linux.
>>>>> 
>>>>> Hmmm... I would not have expected a lot of effort required to add the 3 
>>>>> atomics operations above. Are you trying to also port the LSE support at 
>>>>> the same time?
>>>> There are other atomic operations used in the SMMUv3 code apart from the 3 
>>>> atomic operation I mention. I just mention 3 operation as an example.
>>> 
>>> Ok. Do you have a list you could share?
>>> 
>> Yes. Please find the list that we have to port to the XEN in order to merge 
>> the latest SMMUv3 code.
> 
> Thanks!
> 
>> If we start to port the below list we might have to port another atomic 
>> operation based on which below atomic operations are implemented. I have not 
>> spent time on how these atomic operations are implemented in detail but as 
>> per my understanding, it required an effort to port them to XEN and required 
>> a lot of testing.
> 
> For the beginning, I think it is fine to implement them with a stronger 
> memory barrier than necessary or in a less efficient. This can be refined 
> afterwards.
> 
> Those helpers can directly be defined in the SMMUv3 drivers so we know they 
> are not for general purpose :).
> 
>> 1. atomic_set_release
> 
> This could be implemented as:
> 
> smp_mb();
> atomic_set();
> 
>> 2. atomic_fetch_andnot_relaxed
> 
> This would need to be imported.
> 
>> 3. atomic_cond_read_relaxed
> 
> This would need to be imported. The simplest version seems to be the generic 
> version provided by include/asm-generic/barrier.h (see smp_cond_load_relaxed).
> 
>> 4. atomic_long_cond_read_relaxed
>> 5. atomic_long_xor
> 
> The two would require the implementation of atomic64. Volodymyr also required 
> a version. I offered my help, however I didn't find enough time to do it yet 
> :(.
> 
> For Arm64, it would be possible to do a copy/paste of the existing helpers 
> and replace anything related to a 32-bit register with a 64-bit one.
> 
> For Arm32, they are a bit more complex because you now need to work with 2 
> registers.
> 
> However, for your purpose, you would be using atomic_long_t. So the the Arm64 
> implementation should be sufficient.
> 
>> 6. atomic_set_release
> 
> Same as 1.
> 
>> 7. atomic_cmpxchg_relaxed might be we can use atomic_cmpxchg that is 
>> implemented in XEN need to check.
> atomic_cmpxchg() is strongly ordered. So it would be fine to use it for 
> implementing the helper. Although, more inefficient :).
> 
>> 8. atomic_dec_return_release
> 
> Xen implements a stronger version atomic_dec_return(). You can re-use it here.
> 
>> 9. atomic_fetch_inc_relaxed
> 
> This would need to be imported.

We do agree that this would be the work required but some of the things to be 
imported have dependencies and this is not
a simple change on the patch done by Rahul and it would require to almost 
restart the implementation and testing from the
very beginning.

In the meantime could we process with the review of this SMMUv3 driver and 
include it in Xen as is ?

We can set me and Rahul as maintainers and flag the driver as experimental in 
Support.md (it is already
protected by the EXPERT configuration in Kconfig).


Cheers
Bertrand


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.