
Re: [PATCH v3 10/11] vpci: Add initial support for virtual PCI bus topology


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Oleksandr Andrushchenko <Oleksandr_Andrushchenko@xxxxxxxx>
  • Date: Wed, 3 Nov 2021 08:57:59 +0000
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, "julien@xxxxxxx" <julien@xxxxxxx>, "sstabellini@xxxxxxxxxx" <sstabellini@xxxxxxxxxx>, Oleksandr Tyshchenko <Oleksandr_Tyshchenko@xxxxxxxx>, Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>, Artem Mygaiev <Artem_Mygaiev@xxxxxxxx>, Bertrand Marquis <bertrand.marquis@xxxxxxx>, Rahul Singh <rahul.singh@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Delivery-date: Wed, 03 Nov 2021 08:58:16 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-topic: [PATCH v3 10/11] vpci: Add initial support for virtual PCI bus topology


On 03.11.21 10:41, Jan Beulich wrote:
> On 03.11.2021 07:34, Oleksandr Andrushchenko wrote:
>> Hi, Roger
>>
>> On 26.10.21 14:33, Roger Pau Monné wrote:
>>> On Thu, Sep 30, 2021 at 10:52:22AM +0300, Oleksandr Andrushchenko wrote:
>>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx>
>>>>
>>>> Assign an SBDF with bus 0 to the PCI devices being passed through.
>>>> The resulting topology is one where PCIe devices reside on bus 0 of the
>>>> root complex itself (embedded endpoints).
>>>> This implementation is limited to the 32 devices allowed on
>>>> a single PCI bus.
>>>>
>>>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx>
>>>>
>>>> ---
>>>> Since v2:
>>>>    - remove casts that are (a) malformed and (b) unnecessary
>>>>    - add new line for better readability
>>>>    - remove CONFIG_HAS_VPCI_GUEST_SUPPORT ifdef's as the relevant vPCI
>>>>       functions are now completely gated with this config
>>>>    - gate common code with CONFIG_HAS_VPCI_GUEST_SUPPORT
>>>> New in v2
>>>> ---
>>>>    xen/common/domain.c           |  3 ++
>>>>    xen/drivers/passthrough/pci.c | 60 +++++++++++++++++++++++++++++++++++
>>>>    xen/drivers/vpci/vpci.c       | 14 +++++++-
>>>>    xen/include/xen/pci.h         | 22 +++++++++++++
>>>>    xen/include/xen/sched.h       |  8 +++++
>>>>    5 files changed, 106 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/xen/common/domain.c b/xen/common/domain.c
>>>> index 40d67ec34232..e0170087612d 100644
>>>> --- a/xen/common/domain.c
>>>> +++ b/xen/common/domain.c
>>>> @@ -601,6 +601,9 @@ struct domain *domain_create(domid_t domid,
>>>>    
>>>>    #ifdef CONFIG_HAS_PCI
>>>>        INIT_LIST_HEAD(&d->pdev_list);
>>>> +#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT
>>>> +    INIT_LIST_HEAD(&d->vdev_list);
>>>> +#endif
>>>>    #endif
>>>>    
>>>>        /* All error paths can depend on the above setup. */
>>>> diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
>>>> index 805ab86ed555..5b963d75d1ba 100644
>>>> --- a/xen/drivers/passthrough/pci.c
>>>> +++ b/xen/drivers/passthrough/pci.c
>>>> @@ -831,6 +831,66 @@ int pci_remove_device(u16 seg, u8 bus, u8 devfn)
>>>>        return ret;
>>>>    }
>>>>    
>>>> +#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT
>>>> +static struct vpci_dev *pci_find_virtual_device(const struct domain *d,
>>>> +                                                const struct pci_dev *pdev)
>>>> +{
>>>> +    struct vpci_dev *vdev;
>>>> +
>>>> +    list_for_each_entry ( vdev, &d->vdev_list, list )
>>>> +        if ( vdev->pdev == pdev )
>>>> +            return vdev;
>>>> +    return NULL;
>>>> +}
>>>> +
>>>> +int pci_add_virtual_device(struct domain *d, const struct pci_dev *pdev)
>>>> +{
>>>> +    struct vpci_dev *vdev;
>>>> +
>>>> +    ASSERT(!pci_find_virtual_device(d, pdev));
>>>> +
>>>> +    /* Each PCI bus supports 32 devices/slots at max. */
>>>> +    if ( d->vpci_dev_next > 31 )
>>>> +        return -ENOSPC;
>>>> +
>>>> +    vdev = xzalloc(struct vpci_dev);
>>>> +    if ( !vdev )
>>>> +        return -ENOMEM;
>>>> +
>>>> +    /* We emulate a single host bridge for the guest, so segment is always 0. */
>>>> +    vdev->seg = 0;
>>>> +
>>>> +    /*
>>>> +     * The bus number is set to 0, so virtual devices are seen
>>>> +     * as embedded endpoints behind the root complex.
>>>> +     */
>>>> +    vdev->bus = 0;
>>>> +    vdev->devfn = PCI_DEVFN(d->vpci_dev_next++, 0);
>>> This would likely be better as a bitmap where you set the bits of
>>> in-use slots. Then you can use find_first_bit or similar to get a free
>>> slot.
>>>
>>> Long term you might want to allow the caller to provide a pre-selected
>>> slot, as it's possible for users to request the device to appear at a
>>> specific slot on the emulated bus.
>>>
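A minimal sketch of the bitmap approach suggested above, dropped into
pci_add_virtual_device() and assuming a hypothetical d->vpci_dev_assigned_map
field that would replace d->vpci_dev_next (name and placement are illustrative
only, not part of this series), could look like:

    /* Sketch only: allocate the first free slot on the emulated bus 0. */
    unsigned long slot = find_first_zero_bit(&d->vpci_dev_assigned_map, 32);

    /* Each PCI bus supports at most 32 devices/slots. */
    if ( slot >= 32 )
        return -ENOSPC;

    __set_bit(slot, &d->vpci_dev_assigned_map);
    vdev->devfn = PCI_DEVFN(slot, 0);

    /* A caller-provided slot could later be honoured by checking it with
       test_bit() before setting it. */
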
>>>> +
>>>> +    vdev->pdev = pdev;
>>>> +    vdev->domain = d;
>>>> +
>>>> +    pcidevs_lock();
>>>> +    list_add_tail(&vdev->list, &d->vdev_list);
>>>> +    pcidevs_unlock();
>>>> +
>>>> +    return 0;
>>>> +}
>>>> +
>>>> +int pci_remove_virtual_device(struct domain *d, const struct pci_dev *pdev)
>>>> +{
>>>> +    struct vpci_dev *vdev;
>>>> +
>>>> +    pcidevs_lock();
>>>> +    vdev = pci_find_virtual_device(d, pdev);
>>>> +    if ( vdev )
>>>> +        list_del(&vdev->list);
>>>> +    pcidevs_unlock();
>>>> +    xfree(vdev);
>>>> +    return 0;
>>>> +}
>>>> +#endif /* CONFIG_HAS_VPCI_GUEST_SUPPORT */
>>>> +
>>>>    /* Caller should hold the pcidevs_lock */
>>>>    static int deassign_device(struct domain *d, uint16_t seg, uint8_t bus,
>>>>                               uint8_t devfn)
>>>> diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
>>>> index 702f7b5d5dda..d787f13e679e 100644
>>>> --- a/xen/drivers/vpci/vpci.c
>>>> +++ b/xen/drivers/vpci/vpci.c
>>>> @@ -91,20 +91,32 @@ int __hwdom_init vpci_add_handlers(struct pci_dev *pdev)
>>>>    /* Notify vPCI that device is assigned to guest. */
>>>>    int vpci_assign_device(struct domain *d, const struct pci_dev *dev)
>>>>    {
>>>> +    int rc;
>>>> +
>>>>        /* It only makes sense to assign for hwdom or guest domain. */
>>>>        if ( is_system_domain(d) || !has_vpci(d) )
>>>>            return 0;
>>>>    
>>>> -    return vpci_bar_add_handlers(d, dev);
>>>> +    rc = vpci_bar_add_handlers(d, dev);
>>>> +    if ( rc )
>>>> +        return rc;
>>>> +
>>>> +    return pci_add_virtual_device(d, dev);
>>>>    }
>>>>    
>>>>    /* Notify vPCI that device is de-assigned from guest. */
>>>>    int vpci_deassign_device(struct domain *d, const struct pci_dev *dev)
>>>>    {
>>>> +    int rc;
>>>> +
>>>>        /* It only makes sense to de-assign from hwdom or guest domain. */
>>>>        if ( is_system_domain(d) || !has_vpci(d) )
>>>>            return 0;
>>>>    
>>>> +    rc = pci_remove_virtual_device(d, dev);
>>>> +    if ( rc )
>>>> +        return rc;
>>>> +
>>>>        return vpci_bar_remove_handlers(d, dev);
>>>>    }
>>>>    #endif /* CONFIG_HAS_VPCI_GUEST_SUPPORT */
>>>> diff --git a/xen/include/xen/pci.h b/xen/include/xen/pci.h
>>>> index 43b8a0817076..33033a3a8f8d 100644
>>>> --- a/xen/include/xen/pci.h
>>>> +++ b/xen/include/xen/pci.h
>>>> @@ -137,6 +137,24 @@ struct pci_dev {
>>>>        struct vpci *vpci;
>>>>    };
>>>>    
>>>> +#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT
>>>> +struct vpci_dev {
>>>> +    struct list_head list;
>>>> +    /* Physical PCI device this virtual device is connected to. */
>>>> +    const struct pci_dev *pdev;
>>>> +    /* Virtual SBDF of the device. */
>>>> +    union {
>>>> +        struct {
>>>> +            uint8_t devfn;
>>>> +            uint8_t bus;
>>>> +            uint16_t seg;
>>>> +        };
>>>> +        pci_sbdf_t sbdf;
>>>> +    };
>>>> +    struct domain *domain;
>>>> +};
>>>> +#endif
>>> I wonder whether this is strictly needed. Won't it be enough to store
>>> the virtual (ie: guest) sbdf inside the existing vpci struct?
>>>
>>> It would avoid the overhead of the translation you do from pdev ->
>>> vdev, and there doesn't seem to be anything relevant stored in
>>> vpci_dev apart from the virtual sbdf.
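For illustration, a minimal sketch of that idea, assuming a hypothetical
guest_sbdf field added to the existing struct vpci (not part of this series),
could look like:

    /* Sketch only: resolve a guest SBDF without a separate vpci_dev list. */
    static struct pci_dev *vpci_find_by_guest_sbdf(struct domain *d,
                                                   pci_sbdf_t guest_sbdf)
    {
        struct pci_dev *pdev;

        list_for_each_entry ( pdev, &d->pdev_list, domain_list )
            if ( pdev->vpci && pdev->vpci->guest_sbdf.sbdf == guest_sbdf.sbdf )
                return pdev;

        return NULL;
    }

Such a lookup would still need some lock around d->pdev_list, which is what
the discussion below is about.
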
>> TL;DR It seems it might be needed from a performance POV. If it is not
>> implemented, then for every MMIO trap we use a global PCI lock, i.e.
>> pcidevs_{lock|unlock}.
>> Note: the pcidevs lock is a recursive lock.
>>
>> There are two sources of access to virtual devices:
>> 1. During initialization, when we add, assign or de-assign a PCI device
>> 2. At run time, when we trap configuration space accesses and need to
>> translate a virtual SBDF into a physical SBDF
>> Note that at least de-assignment can run concurrently with MMIO handlers.
>>
>> Now let's see which locks are in use while doing that.
>>
>> 1. No struct vpci_dev is used.
>> 1.1. We remove the structure and just add pdev->vpci->guest_sbdf as you suggest
>> 1.2. To protect virtual devices we use pcidevs_{lock|unlock}
>> 1.3. Locking happens at the system level
>>
>> 2. struct vpci_dev is used
>> 2.1. We have a per-domain lock, vdev_lock
>> 2.2. Locking happens at the per-domain level
>>
>> To compare the two:
>>
>> 1. Without vpci_dev
>> pros: much simpler code
>> pros/cons: a global lock is used during MMIO handling, but it is a recursive lock
> Could you point out to me in which way the recursive nature of the lock
> is relevant here? Afaict that aspect is of no interest when considering
> the performance effects of using a global lock vs one with more narrow
> scope.
I was just trying to find some excuses to defend the pcidevs global lock,
so even the lock's recursion could be an argument here. Weak.
Besides that, I do agree that this is still a global lock.
>
> Jan
>
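
For completeness, a minimal sketch of the per-domain locking described in
2.1/2.2 above, assuming a hypothetical d->vdev_lock spinlock protecting
d->vdev_list (not part of this series):

    /* Sketch only: MMIO traps take the per-domain lock, not pcidevs_lock(). */
    static struct vpci_dev *vpci_find_virtual_device(struct domain *d,
                                                     pci_sbdf_t sbdf)
    {
        struct vpci_dev *vdev, *found = NULL;

        spin_lock(&d->vdev_lock);
        list_for_each_entry ( vdev, &d->vdev_list, list )
            if ( vdev->sbdf.sbdf == sbdf.sbdf )
            {
                found = vdev;
                break;
            }
        spin_unlock(&d->vdev_lock);

        return found;
    }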
