[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v6 05/13] vpci: add hooks for PCI device assign/de-assign




On 08.02.22 13:25, Oleksandr Andrushchenko wrote:
>
> On 08.02.22 13:00, Jan Beulich wrote:
>> On 08.02.2022 11:52, Oleksandr Andrushchenko wrote:
>>> This smells like we first need to fix the existing code, so
>>> pdev->domain is not assigned by specific IOMMU implementations,
>>> but instead controlled by the code which relies on that, assign_device.
>> Feel free to come up with proposals how to cleanly do so. Moving the
>> assignment to pdev->domain may even be possible now, but if you go
>> back you may find that the code was quite different earlier on.
> I do understand that as the code evolves new use cases bring
> new issues.
>>> I can have something like:
>>>
>>> diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
>>> index 88836aab6baf..cc7790709a50 100644
>>> --- a/xen/drivers/passthrough/pci.c
>>> +++ b/xen/drivers/passthrough/pci.c
>>> @@ -1475,6 +1475,7 @@ static int device_assigned(u16 seg, u8 bus, u8 devfn)
>>>     static int assign_device(struct domain *d, u16 seg, u8 bus, u8 devfn, 
>>> u32 flag)
>>>     {
>>>         const struct domain_iommu *hd = dom_iommu(d);
>>> +    struct domain *old_owner;
>>>         struct pci_dev *pdev;
>>>         int rc = 0;
>>>
>>> @@ -1490,6 +1491,9 @@ static int assign_device(struct domain *d, u16 seg, 
>>> u8 bus, u8 devfn, u32 flag)
>>>         ASSERT(pdev && (pdev->domain == hardware_domain ||
>>>                         pdev->domain == dom_io));
>>>
>>> +    /* We need to restore the old owner in case of an error. */
>>> +    old_owner = pdev->domain;
>>> +
>>>         vpci_deassign_device(pdev->domain, pdev);
>>>
>>>         rc = pdev_msix_assign(d, pdev);
>>> @@ -1515,8 +1519,12 @@ static int assign_device(struct domain *d, u16 seg, 
>>> u8 bus, u8 devfn, u32 flag)
>>>
>>>      done:
>>>         if ( rc )
>>> +    {
>>>             printk(XENLOG_G_WARNING "%pd: assign (%pp) failed (%d)\n",
>>>                    d, &PCI_SBDF3(seg, bus, devfn), rc);
>>> +        /* We failed to assign, so restore the previous owner. */
>>> +        pdev->domain = old_owner;
>>> +    }
>>>         /* The device is assigned to dom_io so mark it as quarantined */
>>>         else if ( d == dom_io )
>>>             pdev->quarantine = true;
>>>
>>> But I do not think this belongs to this patch
>> Indeed. Plus I'm sure you understand that it's not that simple. Assigning
>> to pdev->domain is only the last step of assignment. Restoring the original
>> owner would entail putting in place the original IOMMU table entries as
>> well, which in turn can fail. Hence why you'll find a number of uses of
>> domain_crash() in places where rolling back is far from easy.
> So, why don't we just rely on the toolstack to do the roll back then?
> This way we won't add new domain_crash() calls.
> I do understand though that we may live Xen in a wrong state though.
> So, do you think it is possible if we just call deassign_device from
> assign_device on the error path? This is just like I do in vpci_assign_device:
> I call vpci_deassign_device if the former fails.
With the following addition:

diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index c4ae22aeefcd..d6c00449193c 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -1511,6 +1511,12 @@ static int assign_device(struct domain *d, u16 seg, u8 
bus, u8 devfn, u32 flag)
      }

      rc = vpci_assign_device(pdev);
+    if ( rc )
+        /*
+         * Ignore the return code as we want to preserve the one from the
+         * failed assign operation.
+         */
+        deassign_device(d, seg, bus, devfn);

   done:
      if ( rc )

I see the following logs (PV Dom0):

(XEN) assign_device seg 0 bus 3 devfn 0
(XEN) [VT-D]d[IO]:PCIe: unmap 0000:03:00.0
(XEN) [VT-D]d4:PCIe: map 0000:03:00.0
(XEN) assign_device vpci_assign rc -22 from d[IO] to d4
(XEN) deassign_device current d4 to d[IO]
(XEN) [VT-D]d4:PCIe: unmap 0000:03:00.0
(XEN) [VT-D]d[IO]:PCIe: map 0000:03:00.0
(XEN) deassign_device ret 0
(XEN) d4: assign (0000:03:00.0) failed (-22)
libxl: error: libxl_pci.c:1498:pci_add_dm_done: Domain 4:xc_assign_device 
failed: Invalid argument
libxl: error: libxl_pci.c:1781:device_pci_add_done: Domain 
4:libxl__device_pci_add failed for PCI device 0:3:0.0 (rc -3)
libxl: error: libxl_create.c:1895:domcreate_attach_devices: Domain 4:unable to 
add pci devices
libxl: error: libxl_domain.c:1183:libxl__destroy_domid: Domain 4:Non-existant 
domain
libxl: error: libxl_domain.c:1137:domain_destroy_callback: Domain 4:Unable to 
destroy guest
libxl: error: libxl_domain.c:1064:domain_destroy_cb: Domain 4:Destruction of 
domain failed

So, it seems to properly solve the issue with pdev->domain left
set to the domain we couldn't create.

@Jan, will this address your concern?

Thank you,
Oleksandr

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.