[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v4 02/11] vpci: cancel pending map/unmap on vpci removal
On 17.11.21 10:28, Jan Beulich wrote: > On 05.11.2021 07:56, Oleksandr Andrushchenko wrote: >> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx> >> >> When a vPCI is removed for a PCI device it is possible that we have >> scheduled a delayed work for map/unmap operations for that device. >> For example, the following scenario can illustrate the problem: >> >> pci_physdev_op >> pci_add_device >> init_bars -> modify_bars -> defer_map -> >> raise_softirq(SCHEDULE_SOFTIRQ) >> iommu_add_device <- FAILS >> vpci_remove_device -> xfree(pdev->vpci) >> >> leave_hypervisor_to_guest >> vpci_process_pending: v->vpci.mem != NULL; v->vpci.pdev->vpci == NULL >> >> For the hardware domain we continue execution as the worse that >> could happen is that MMIO mappings are left in place when the >> device has been deassigned >> >> For unprivileged domains that get a failure in the middle of a vPCI >> {un}map operation we need to destroy them, as we don't know in which >> state the p2m is. This can only happen in vpci_process_pending for >> DomUs as they won't be allowed to call pci_add_device. >> >> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx> > Thinking about it some more, I'm not convinced any of this is really > needed in the presented form. The intention of this patch was to handle error conditions which are abnormal, e.g. when iommu_add_device fails and we are in the middle of initialization. So, I am trying to cancel all pending work which might already be there and not to crash. > Removal of a vPCI device is the analogue > of hot-unplug on baremetal. That's not a "behind the backs of > everything" operation. Instead the host admin has to prepare the > device for removal, which will result in it being quiescent (which in > particular means no BAR adjustments anymore). The act of removing the > device from the system has as its virtual counterpart "xl pci-detach". > I think it ought to be in this context when pending requests get > drained, and an indicator be set that no further changes to that > device are permitted. This would mean invoking from > vpci_deassign_device() as added by patch 4, not from > vpci_remove_device(). This would yield removal of a device from the > host being independent of removal of a device from a guest. > > The need for vpci_remove_device() seems questionable in the first > place: Even for hot-unplug on the host it may be better to require a > pci-detach from (PVH) Dom0 before the actual device removal. This > would involve an adjustment to the de-assignment logic for the case > of no quarantining: We'd need to make sure explicit de-assignment > from Dom0 actually removes the device from there; right now > de-assignment assumes "from DomU" and "to Dom0 or DomIO" (depending > on quarantining mode). Please see above. What you wrote might be perfectly fine for the "expected" removals, but what about the errors which are out of administrator's control? > > Thoughts? > > Jan > Thank you, Oleksandr
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |