
Re: [PATCH v6 03/13] vpci: move lock outside of struct vpci


  • To: Oleksandr Andrushchenko <Oleksandr_Andrushchenko@xxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Fri, 4 Feb 2022 14:06:49 +0100
  • Cc: Jan Beulich <jbeulich@xxxxxxxx>, "julien@xxxxxxx" <julien@xxxxxxx>, "sstabellini@xxxxxxxxxx" <sstabellini@xxxxxxxxxx>, Oleksandr Tyshchenko <Oleksandr_Tyshchenko@xxxxxxxx>, Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>, Artem Mygaiev <Artem_Mygaiev@xxxxxxxx>, "andrew.cooper3@xxxxxxxxxx" <andrew.cooper3@xxxxxxxxxx>, "george.dunlap@xxxxxxxxxx" <george.dunlap@xxxxxxxxxx>, "paul@xxxxxxx" <paul@xxxxxxx>, Bertrand Marquis <bertrand.marquis@xxxxxxx>, Rahul Singh <rahul.singh@xxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Fri, 04 Feb 2022 13:07:07 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Fri, Feb 04, 2022 at 12:53:20PM +0000, Oleksandr Andrushchenko wrote:
> 
> 
> On 04.02.22 14:47, Jan Beulich wrote:
> > On 04.02.2022 13:37, Oleksandr Andrushchenko wrote:
> >>
> >> On 04.02.22 13:37, Jan Beulich wrote:
> >>> On 04.02.2022 12:13, Roger Pau Monné wrote:
> >>>> On Fri, Feb 04, 2022 at 11:49:18AM +0100, Jan Beulich wrote:
> >>>>> On 04.02.2022 11:12, Oleksandr Andrushchenko wrote:
> >>>>>> On 04.02.22 11:15, Jan Beulich wrote:
> >>>>>>> On 04.02.2022 09:58, Oleksandr Andrushchenko wrote:
> >>>>>>>> On 04.02.22 09:52, Jan Beulich wrote:
> >>>>>>>>> On 04.02.2022 07:34, Oleksandr Andrushchenko wrote:
> >>>>>>>>>> @@ -285,6 +286,12 @@ static int modify_bars(const struct pci_dev *pdev, uint16_t cmd, bool rom_only)
> >>>>>>>>>>                      continue;
> >>>>>>>>>>              }
> >>>>>>>>>>      
> >>>>>>>>>> +        spin_lock(&tmp->vpci_lock);
> >>>>>>>>>> +        if ( !tmp->vpci )
> >>>>>>>>>> +        {
> >>>>>>>>>> +            spin_unlock(&tmp->vpci_lock);
> >>>>>>>>>> +            continue;
> >>>>>>>>>> +        }
> >>>>>>>>>>              for ( i = 0; i < ARRAY_SIZE(tmp->vpci->header.bars); i++ )
> >>>>>>>>>>              {
> >>>>>>>>>>                  const struct vpci_bar *bar = &tmp->vpci->header.bars[i];
> >>>>>>>>>> @@ -303,12 +310,14 @@ static int modify_bars(const struct pci_dev *pdev, uint16_t cmd, bool rom_only)
> >>>>>>>>>>                  rc = rangeset_remove_range(mem, start, end);
> >>>>>>>>>>                  if ( rc )
> >>>>>>>>>>                  {
> >>>>>>>>>> +                spin_unlock(&tmp->vpci_lock);
> >>>>>>>>>>                      printk(XENLOG_G_WARNING "Failed to remove [%lx, %lx]: %d\n",
> >>>>>>>>>>                             start, end, rc);
> >>>>>>>>>>                      rangeset_destroy(mem);
> >>>>>>>>>>                      return rc;
> >>>>>>>>>>                  }
> >>>>>>>>>>              }
> >>>>>>>>>> +        spin_unlock(&tmp->vpci_lock);
> >>>>>>>>>>          }
> >>>>>>>>> At first glance this simply looks like another unjustified (in the
> >>>>>>>>> description) change, as you're not converting anything here but you
> >>>>>>>>> actually add locking (and I realize this was there before, so I'm sorry
> >>>>>>>>> for not pointing this out earlier).
> >>>>>>>> Well, I thought that the description already has "...the lock can be
> >>>>>>>> used (and in a few cases is used right away) to check whether vpci
> >>>>>>>> is present" and this is enough for such uses as here.
> >>>>>>>>>      But then I wonder whether you
> >>>>>>>>> actually tested this, since I can't help getting the impression that
> >>>>>>>>> you're introducing a live-lock: The function is called from cmd_write()
> >>>>>>>>> and rom_write(), which in turn are called out of vpci_write(). Yet that
> >>>>>>>>> function already holds the lock, and the lock is not (currently)
> >>>>>>>>> recursive. (For the 3rd caller of the function - init_bars() - otoh
> >>>>>>>>> the locking looks to be entirely unnecessary.)
> >>>>>>>> Well, you are correct: if tmp != pdev then it is correct to acquire
> >>>>>>>> the lock. But if tmp == pdev and rom_only == true
> >>>>>>>> then we'll deadlock.
> >>>>>>>>
> >>>>>>>> It seems we need to make the locking conditional, e.g. only lock
> >>>>>>>> if tmp != pdev.
> >>>>>>> Which will address the live-lock, but introduce ABBA deadlock potential
> >>>>>>> between the two locks.
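
To make that hazard concrete, here is a sketch of the problematic interleaving
with such conditional locking (devA/devB being made up names, the calls are the
ones from the patch):

CPU0: vpci_write(devA)               CPU1: vpci_write(devB)
  spin_lock(&devA->vpci_lock)          spin_lock(&devB->vpci_lock)
  modify_bars(devA, ...)               modify_bars(devB, ...)
    tmp == devB (tmp != pdev):           tmp == devA (tmp != pdev):
      spin_lock(&devB->vpci_lock)          spin_lock(&devA->vpci_lock)
      -> spins waiting for CPU1            -> spins waiting for CPU0
                                              (ABBA deadlock)
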
> >>>>>> I am not sure I can suggest a better solution here
> >>>>>> @Roger, @Jan, could you please help here?
> >>>>> Well, first of all I'd like to mention that while it may have been okay to
> >>>>> not hold pcidevs_lock here for Dom0, it surely needs acquiring when dealing
> >>>>> with DomU-s' lists of PCI devices. The requirement really applies to the
> >>>>> other use of for_each_pdev() as well (in vpci_dump_msi()), except that
> >>>>> there it probably wants to be a try-lock.
> >>>>>
> >>>>> Next I'd like to point out that here we have the still pending issue of
> >>>>> how to deal with hidden devices, which Dom0 can access. See my RFC patch
> >>>>> "vPCI: account for hidden devices in modify_bars()". Whatever the solution
> >>>>> here, I think it wants to at least account for the extra need there.
> >>>> Yes, sorry, I should take care of that.
> >>>>
> >>>>> Now it is quite clear that pcidevs_lock isn't going to help with avoiding
> >>>>> the deadlock, as it's imo not an option at all to acquire that lock
> >>>>> everywhere else you access ->vpci (or else the vpci lock itself would be
> >>>>> pointless). But a per-domain auxiliary r/w lock may help: Other paths
> >>>>> would acquire it in read mode, and here you'd acquire it in write mode (in
> >>>>> the former case around the vpci lock, while in the latter case there may
> >>>>> then not be any need to acquire the individual vpci locks at all). FTAOD:
> >>>>> I haven't fully thought through all implications (and hence whether this is
> >>>>> viable in the first place); I expect you will, documenting what you've
> >>>>> found in the resulting patch description. Of course the double lock
> >>>>> acquire/release would then likely want hiding in helper functions.
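
FWIW, such helpers could look something along these lines (just a sketch to
illustrate the idea; d->vpci_rwlock and the helper names are made up, none of
this exists today):

/* Readers ("other paths", i.e. the vpci_{read,write} handlers): take the
 * per-domain lock in read mode around the individual device's vpci lock. */
static void vpci_lock_device(struct pci_dev *pdev)
{
    read_lock(&pdev->domain->vpci_rwlock);
    spin_lock(&pdev->vpci_lock);
}

static void vpci_unlock_device(struct pci_dev *pdev)
{
    spin_unlock(&pdev->vpci_lock);
    read_unlock(&pdev->domain->vpci_rwlock);
}

/* Writer (modify_bars()): write mode excludes all readers of the same domain,
 * so the per-device locks of the devices iterated over wouldn't need to be
 * taken at all, which also avoids the ABBA ordering issue. */
static void vpci_lock_domain(struct domain *d)
{
    write_lock(&d->vpci_rwlock);
}

static void vpci_unlock_domain(struct domain *d)
{
    write_unlock(&d->vpci_rwlock);
}
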
> >>>> I've also been thinking about this, and whether it's really worth having
> >>>> a per-device lock rather than a per-domain one that protects all
> >>>> vpci regions of the devices assigned to the domain.
> >>>>
> >>>> The OS is likely to serialize accesses to the PCI config space anyway,
> >>>> and the only place I could see a benefit of having per-device locks is
> >>>> in the handling of MSI-X tables, as the handling of the mask bit is
> >>>> likely very performance sensitive, so adding a per-domain lock there
> >>>> could be a bottleneck.
> >>> Hmm, with method 1 accesses, serializing globally is basically
> >>> unavoidable, but with MMCFG I see no reason why OSes may not (move
> >>> to) permit(ting) parallel accesses, with serialization perhaps done
> >>> only at device level. See our own pci_config_lock, which applies to
> >>> only method 1 accesses; we don't look to be serializing MMCFG
> >>> accesses at all.
> >>>
> >>>> We could alternatively do a per-domain rwlock for vpci and special case
> >>>> the MSI-X area to also have a per-device specific lock. At which point
> >>>> it becomes fairly similar to what you propose.
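
As a rough sketch of that alternative (field names are illustrative only;
d->vpci_rwlock and the msix lock below don't exist today), the mask bit
handling would then only contend on the device's own lock:

static void msix_entry_mask(struct domain *d, struct vpci_msix *msix,
                            unsigned int entry, bool mask)
{
    /* Per-domain lock in read mode: keeps the vpci data from being freed. */
    read_lock(&d->vpci_rwlock);

    /* Per-device MSI-X lock: frequent mask/unmask updates of one device
     * don't serialize against accesses to other devices of the domain. */
    spin_lock(&msix->lock);
    msix->entries[entry].masked = mask;
    spin_unlock(&msix->lock);

    read_unlock(&d->vpci_rwlock);
}
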
> >> @Jan, @Roger
> >>
> >> 1. d->vpci_lock - rwlock <- this protects vpci
> >> 2. pdev->vpci->msix_tbl_lock - rwlock <- this protects MSI-X tables
> >> or should it rather be pdev->msix_tbl_lock, as MSI-X tables don't
> >> really depend on vPCI?
> > If so, perhaps indeed better the latter. But as said in reply to Roger,
> > I'm not convinced (yet) that doing away with the per-device lock is a
> > good move. As said there - we're ourselves doing fully parallel MMCFG
> > accesses, so OSes ought to be fine to do so, too.
> But with pdev->vpci_lock we face ABBA...

I think it would be easier to start with a per-domain rwlock that
guarantees pdev->vpci cannot be removed under our feet. This would be
taken in read mode in vpci_{read,write} and in write mode when
removing a device from a domain.
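
Roughly (just a sketch; d->vpci_rwlock is a made up name for the per-domain
lock):

uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size)
{
    struct domain *d = current->domain;
    const struct pci_dev *pdev;
    uint32_t data = ~(uint32_t)0;

    /* Read mode: pdev->vpci cannot be removed while this is held. */
    read_lock(&d->vpci_rwlock);

    pdev = pci_get_pdev_by_domain(d, sbdf.seg, sbdf.bus, sbdf.devfn);
    if ( pdev && pdev->vpci )
    {
        /* ... dispatch to the registered handlers, as done today ... */
    }

    read_unlock(&d->vpci_rwlock);

    return data;
}

/* Device removal: write mode guarantees no CPU is inside a vpci handler of
 * this domain while pdev->vpci is torn down. */
void vpci_remove_device(struct pci_dev *pdev)
{
    write_lock(&pdev->domain->vpci_rwlock);
    /* ... free the handlers and pdev->vpci, as done today ... */
    write_unlock(&pdev->domain->vpci_rwlock);
}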

Then there are also other issues regarding vPCI locking that need to
be fixed, but that lock would likely be a start.

Thanks, Roger.