[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2] vpci: introduce per-domain lock to protect vpci structure


  • To: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • From: Oleksandr Andrushchenko <Oleksandr_Andrushchenko@xxxxxxxx>
  • Date: Tue, 15 Feb 2022 11:45:11 +0000
  • Accept-language: en-US
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=none; dmarc=none; dkim=none; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=6guLOEroULM9IKHWLhN0puwOFHN5Z3jUhE3Y1oJbSyM=; b=U+H0HxojelU1ZIIPt0wpvjhHc7Qe8dAY6QNfSyVbKBKRFDcvceOVNKoLvJMwsquQ4Dqyhr63PWQqHGqP/ycxb/4ljGo0f6wc40N+UNHZnfKUJPwA24uFowU7yrRyLSOIoolmiJjE8Ul9fRAmODP+IA9FIMSxvmYRw6xswpFZGBuyAZwBxiJJ4Ns5g6SAKwfqsdEIk+7zVQd31YtwWBiG+lcZBKZ56653VJBlSDTF29XeZIQRzMTkdpChXi1dbnjepX+L/2C3oT9uBr3XSK9ETeLpqTd6rn+61wJnTAhuBlJMhGbizKALtDriSAaHfmFHW1a1e5c1LWV7E3zqrcbVFw==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=J2Rgt8WY0KM4Y/5kVd5Hd5q1vONmD5/0c92w4bYjwEgVsQ/5FQyr44mXc1YzJr4o3usm3CwGzTvSYVESbBjltICFCu7h0JgF8D+skW8BFTb+qHkE2Z2EYZHTK2KFOiNhOEpry8zWLB/gYT/n845oauj8B+IWOek6ZUxfXV9DW7HriZkMHFVYsASWaqHotRAAY1oLC7B376urmBoAJ4vxAn4vy+p0pmT+p2LoDLUOcSJ9N6Hv4JP7aFc3zLXIz94zqg43l1gC3LhQFSlTE5BO+N7XnyBBKi/02eNmJvacQ1llr2S+zx+/rRp5g96o3DrGqdD+yX40jh7D2zq3Z7NJ/Q==
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, "jbeulich@xxxxxxxx" <jbeulich@xxxxxxxx>, "julien@xxxxxxx" <julien@xxxxxxx>, "sstabellini@xxxxxxxxxx" <sstabellini@xxxxxxxxxx>, Oleksandr Tyshchenko <Oleksandr_Tyshchenko@xxxxxxxx>, Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>, Artem Mygaiev <Artem_Mygaiev@xxxxxxxx>, Bertrand Marquis <bertrand.marquis@xxxxxxx>, Oleksandr Andrushchenko <Oleksandr_Andrushchenko@xxxxxxxx>, Rahul Singh <rahul.singh@xxxxxxx>
  • Delivery-date: Tue, 15 Feb 2022 11:45:28 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-index: AQHYIkOs9pmsZb0l5kefj1m3qdTVCayUboeAgAAGwYCAAAeVgIAAAZQA
  • Thread-topic: [PATCH v2] vpci: introduce per-domain lock to protect vpci structure


On 15.02.22 13:39, Roger Pau Monné wrote:
> On Tue, Feb 15, 2022 at 11:12:23AM +0000, Oleksandr Andrushchenko wrote:
>>
>> On 15.02.22 12:48, Roger Pau Monné wrote:
>>> On Tue, Feb 15, 2022 at 10:11:35AM +0200, Oleksandr Andrushchenko wrote:
>>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx>
>>>>
>>>> Introduce a per-domain read/write lock to check whether vpci is present,
>>>> so we are sure there are no accesses to the contents of the vpci struct
>>>> if not. This lock can be used (and in a few cases is used right away)
>>>> so that vpci removal can be performed while holding the lock in write
>>>> mode. Previously such removal could race with vpci_read for example.
>>>>
>>>> 1. Per-domain's vpci_rwlock is used to protect pdev->vpci structure
>>>> from being removed.
>>>>
>>>> 2. Writing the command register and ROM BAR register may trigger
>>>> modify_bars to run, which in turn may access multiple pdevs while
>>>> checking for the existing BAR's overlap. The overlapping check, if done
>>>> under the read lock, requires vpci->lock to be acquired on both devices
>>>> being compared, which may produce a deadlock. It is not possible to
>>>> upgrade read lock to write lock in such a case. So, in order to prevent
>>>> the deadlock, check which registers are going to be written and acquire
>>>> the lock in the appropriate mode from the beginning.
>>>>
>>>> All other code, which doesn't lead to pdev->vpci destruction and does not
>>>> access multiple pdevs at the same time, can still use a combination of the
>>>> read lock and pdev->vpci->lock.
>>>>
>>>> 3. Optimize if ROM BAR write lock required detection by caching offset
>>>> of the ROM BAR register in vpci->header->rom_reg which depends on
>>>> header's type.
>>>>
>>>> 4. Reduce locked region in vpci_remove_device as it is now possible
>>>> to set pdev->vpci to NULL early right after the write lock is acquired.
>>>>
>>>> 5. Reduce locked region in vpci_add_handlers as it is possible to
>>>> initialize many more fields of the struct vpci before assigning it to
>>>> pdev->vpci.
>>>>
>>>> 6. vpci_{add|remove}_register are required to be called with the write lock
>>>> held, but it is not feasible to add an assert there as it requires
>>>> struct domain to be passed for that. So, add a comment about this 
>>>> requirement
>>>> to these and other functions with the equivalent constraints.
>>>>
>>>> 7. Drop const qualifier where the new rwlock is used and this is 
>>>> appropriate.
>>>>
>>>> 8. Do not call process_pending_softirqs with any locks held. For that 
>>>> unlock
>>>> prior the call and re-acquire the locks after. After re-acquiring the
>>>> lock there is no need to check if pdev->vpci exists:
>>>>    - in apply_map because of the context it is called (no race condition
>>>>      possible)
>>>>    - for MSI/MSI-X debug code because it is called at the end of
>>>>      pdev->vpci access and no further access to pdev->vpci is made
>>>>
>>>> 9. Check for !pdev->vpci in vpci_{read|write} after acquiring the lock
>>>> and if so, allow reading or writing the hardware register directly. This is
>>>> acceptable as we only deal with Dom0 as of now. Once DomU support is
>>>> added the write will need to be ignored and read return all 0's for the
>>>> guests, while Dom0 can still access the registers directly.
>>>>
>>>> 10. Introduce pcidevs_trylock, so there is a possibility to try locking
>>>> the pcidev's lock.
>>>>
>>>> 11. Use pcidev's lock around for_each_pdev and pci_get_pdev_by_domain
>>>> while accessing pdevs in vpci code.
>>> So if you use the pcidevs_lock then it's impossible for the pdev or
>>> pdev->vpci to be removed or recreated, as the pcidevs lock protects
>>> any device operations (add, remove, assign, deassign).
>>>
>>> It's however not OK to use the pcidevs lock in vpci_{read,write}
>>> as-is, as the introduced contention is IMO not acceptable.
>>>
>>> The only viable option I see here is to:
>>>
>>>    1. Make the pcidevs lock a rwlock: switch current callers to take the
>>>       lock in write mode, detect and fixup any issues that could arise
>>>       from the lock not being recursive anymore.
>>>    2. Take the lock in read mode around vpci_{read,write} sections that
>>>       rely on pdev (including the handlers).
>>>
>>> These items should be at least two separate patches. Let's not mix the
>>> conversion of pcidevs locks with the addition of vPCI support.
>>>
>>> I think with that we could get away without requiring a per-domain
>>> rwlock? Just doing lock ordering in modify_bars regarding
>>> tmp->vpci->lock vs pdev->vpci->lock. Neither pdev or vpci can go away
>>> while holding the pcidevs lock.
>>>
>>> Sorting the situation in modify_bars should also be done as a separate
>>> patch on top of 1. and 2.
>> So, to make it crystal clear: we can do with the locking as in this
>> patch and instead we need to convert pcidevs lock into rwlock.
>> Meaning that I need to drop this patch.
>>
>> Then, 3 patches to follow:
>> 1. pcidevs as rwlock
>> 2. vpci_{read|write} and the rest using new pcidevs rwlock
>> 3. lock ordering in modify_bars
>>
>> Is it what we want?
> Likely? The current approach of the per-domain rwlock still leaves us
> with a window between pci_get_pdev_by_domain and taking such lock
> where the device could be removed.
>
> We also need a safe way to use pci_get_pdev_by_domain without the
> devices being removed while using them, so it would seem we need the
> pcidevs lock anyway, in which case it seems possible to avoid having
> to introduce a per-domain rwlock.
>
> I'm happy with any approach that solves the issues we have at hand,
> but this proposal has a fundamental flaw of leaving a window after
> pci_get_pdev_by_domain where the device could be removed. I'm OK to
> have this fixed in a different way if there's one, but if the pcidevs
> lock is used in vpci_{read,write} it needs to be converted into a
> rwlock.
I'm on your side, I just want to hear that we all agree pcidevs
needs to be converted into rwlock according with the plan you
suggested and at least now it seems to be an acceptable solution.

This is going to be the 3rd attempt to make up the locking and
it all takes quite some time, unfortunately...
>
> Thanks, Roger.
>
Thank you,
Oleksandr

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.