
Re: [PATCH 9/9] IOMMU/PCI: don't let domain cleanup continue when device de-assignment failed



On 09/06/2021 10:30, Jan Beulich wrote:
Failure here could in principle mean the device may still be issuing DMA
requests, which would continue to be translated by the page tables the
device entry currently points at. With this we cannot allow the
subsequent cleanup step of freeing the page tables to occur, to prevent
use-after-free issues. We would need to accept, for the time being, that
in such a case the remaining domain resources will all be leaked, and
the domain will continue to exist as a zombie.

However, with flushes no longer timing out (and with proper timeout
detection for device I/O TLB flushing yet to be implemented), there's no
way anymore for failures to occur, except due to bugs elsewhere. Hence
the change here is merely a "just in case" one.

In order to continue the loop in spite of an error, we can't use
pci_get_pdev_by_domain() anymore. I have no idea why it was used here in
the first place, instead of the cheaper list iteration.

Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>

Reviewed-by: Paul Durrant <paul@xxxxxxx>

---
A first step beyond this could be to have the backing functions of
deassign_device() allow the caller to tell whether the failure was from
removing the device from the domain being cleaned up, or from re-setup
in wherever the device was supposed to get moved to. In the latter case
we could allow domain cleanup to continue. I wonder whether we could
simply make those functions return "success" anyway, overriding their
returning of an error when ->is_dying is set.

A next step then might be to figure out whether there's any "emergency"
adjustment that could be done instead of the full-fledged (and failed)
de-assign, to allow at least recovering all the memory from the guest.
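The first idea above could be sketched roughly as follows. This is only an illustration, not code from the patch or from Xen: struct fake_domain, enum deassign_phase and filter_deassign_error() are all hypothetical names, and it assumes the backing functions can somehow report which phase failed, which they cannot do today.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct domain; only the flag used here. */
struct fake_domain {
    bool is_dying;
};

/* Hypothetical: the phase of a de-assign in which the error occurred. */
enum deassign_phase {
    PHASE_NONE,     /* no failure */
    PHASE_REMOVE,   /* removing the device from the domain being cleaned up */
    PHASE_RESETUP   /* re-setup in wherever the device was moved to */
};

/*
 * Decide which error, if any, should block domain cleanup.
 *
 * A failure while removing the device from the dying domain must be
 * reported: the device may still reference that domain's page tables.
 * A failure during re-setup elsewhere is overridden to "success" when the
 * domain is dying, since its page tables are no longer in use by the device.
 */
static int filter_deassign_error(const struct fake_domain *d,
                                 enum deassign_phase failed, int rc)
{
    if ( rc && d->is_dying && failed == PHASE_RESETUP )
        return 0;

    return rc;
}
```

Whether overriding the error this way is actually safe depends on the IOMMU-specific backing functions, so this should be read as a statement of the idea rather than a proposal.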

--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -894,7 +894,7 @@ static int deassign_device(struct domain
  int pci_release_devices(struct domain *d)
  {
-    struct pci_dev *pdev;
+    struct pci_dev *pdev, *tmp;
      u8 bus, devfn;
      int ret;
@@ -905,15 +905,15 @@ int pci_release_devices(struct domain *d
          pcidevs_unlock();
          return ret;
      }
-    while ( (pdev = pci_get_pdev_by_domain(d, -1, -1, -1)) )
+    list_for_each_entry_safe ( pdev, tmp, &d->pdev_list, domain_list )
      {
          bus = pdev->bus;
          devfn = pdev->devfn;
-        deassign_device(d, pdev->seg, bus, devfn);
+        ret = deassign_device(d, pdev->seg, bus, devfn) ?: ret;
      }
      pcidevs_unlock();
-    return 0;
+    return ret;
  }
  #define PCI_CLASS_BRIDGE_HOST 0x0600
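
For readers unfamiliar with the pattern, here is a minimal standalone sketch of what the reworked loop does: it keeps iterating after a failure, tolerates entries being unlinked during the walk, and returns the most recent non-zero error. Everything in it (struct fake_dev, fake_deassign(), fake_release_all()) is a hypothetical stand-in, using a plain singly linked list rather than Xen's list helpers. The patch's "x ?: ret" is the GNU C shorthand for the portable "x ? x : ret" written out below.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct pci_dev; not the Xen definition. */
struct fake_dev {
    int id;
    int fail_with;          /* 0: de-assign succeeds, else the error code */
    struct fake_dev *next;
};

/* Hypothetical stand-in for deassign_device(): unlinks only on success. */
static int fake_deassign(struct fake_dev **head, struct fake_dev *dev)
{
    struct fake_dev **pp;

    if ( dev->fail_with )
        return dev->fail_with;  /* device stays on the list */

    for ( pp = head; *pp; pp = &(*pp)->next )
        if ( *pp == dev )
        {
            *pp = dev->next;
            dev->next = NULL;
            return 0;
        }

    return -1;                  /* not on the list */
}

/* Mirrors the shape of the patched loop in pci_release_devices(). */
static int fake_release_all(struct fake_dev **head)
{
    struct fake_dev *dev, *tmp;
    int ret = 0;

    for ( dev = *head; dev; dev = tmp )
    {
        int rc;

        tmp = dev->next;        /* saved first: dev may get unlinked */
        rc = fake_deassign(head, dev);
        ret = rc ? rc : ret;    /* keep the latest non-zero error */
    }

    return ret;
}
```

With three devices of which only the second fails, the first and third are removed, the second remains on the list, and the caller sees the second device's error instead of an unconditional 0.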





 

