
Re: [PATCH v4 02/11] vpci: cancel pending map/unmap on vpci removal



On 16.11.2021 08:32, Oleksandr Andrushchenko wrote:
> On 15.11.21 18:56, Jan Beulich wrote:
>> On 05.11.2021 07:56, Oleksandr Andrushchenko wrote:
>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx>
>>>
>>> When a vPCI is removed for a PCI device it is possible that we have
>>> scheduled delayed work for map/unmap operations for that device.
>>> For example, the following scenario can illustrate the problem:
>>>
>>> pci_physdev_op
>>>     pci_add_device
>>>         init_bars -> modify_bars -> defer_map -> raise_softirq(SCHEDULE_SOFTIRQ)
>>>     iommu_add_device <- FAILS
>>>     vpci_remove_device -> xfree(pdev->vpci)
>>>
>>> leave_hypervisor_to_guest
>>>     vpci_process_pending: v->vpci.mem != NULL; v->vpci.pdev->vpci == NULL
>>>
>>> For the hardware domain we continue execution, as the worst that
>>> could happen is that MMIO mappings are left in place when the
>>> device has been deassigned.
>> Is continuing safe in this case? I.e. isn't there the risk of a NULL
>> deref?
> I think it is safe to continue

And why do you think so? I.e. why is there no race for Dom0 when there
is one for DomU?
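
Purely as an illustrative sketch (assuming vpci_process_pending() goes on
to dereference v->vpci.pdev->vpci; this is not a proposal), a guard along
the lines of

    if ( v->vpci.mem && !v->vpci.pdev->vpci )
    {
        /* The device's vPCI state is already gone - drop the pending work. */
        rangeset_destroy(v->vpci.mem);
        v->vpci.mem = NULL;
        return false;
    }

would avoid dereferencing the freed state, but it still wouldn't close the
underlying race.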

>>> For unprivileged domains that get a failure in the middle of a vPCI
>>> {un}map operation we need to destroy them, as we don't know in which
>>> state the p2m is. This can only happen in vpci_process_pending for
>>> DomUs as they won't be allowed to call pci_add_device.
>> You saying "we need to destroy them" made me look for a new domain_crash()
>> that you add, but there is none. What is this about?
> Yes, I guess we need to implicitly destroy the domain,

What do you mean by "implicitly"?
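
To make "destroy" concrete: an explicit variant (sketched for illustration
only, where rc stands for the hypothetical result of the failed {un}map
inside vpci_process_pending()) would be something like

    if ( rc && !is_hardware_domain(v->domain) )
        /* DomU p2m state is unknown after a failed {un}map - give up. */
        domain_crash(v->domain);

i.e. the crash would need to appear explicitly somewhere in the patch
rather than happen "implicitly".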

>>> @@ -165,6 +164,18 @@ bool vpci_process_pending(struct vcpu *v)
>>>       return false;
>>>   }
>>>   
>>> +void vpci_cancel_pending(const struct pci_dev *pdev)
>>> +{
>>> +    struct vcpu *v = current;
>>> +
>>> +    /* Cancel any pending work now. */
>> Doesn't "any" include pending work on all vCPU-s of the guest, not
>> just current? Is current even relevant (as in: part of the correct
>> domain), considering ...
>>
>>> --- a/xen/drivers/vpci/vpci.c
>>> +++ b/xen/drivers/vpci/vpci.c
>>> @@ -51,6 +51,8 @@ void vpci_remove_device(struct pci_dev *pdev)
>>>           xfree(r);
>>>       }
>>>       spin_unlock(&pdev->vpci->lock);
>>> +
>>> +    vpci_cancel_pending(pdev);
>> ... this code path, when coming here from pci_{add,remove}_device()?
>>
>> I can agree that there's a problem here, but I think you need to
>> properly (i.e. in a race free manner) drain pending work.
> Yes, the code is inconsistent in this respect. I am thinking about:
> 
> void vpci_cancel_pending(const struct pci_dev *pdev)
> {
>      struct domain *d = pdev->domain;
>      struct vcpu *v;
> 
>      /* Cancel any pending work now. */
>      domain_lock(d);
>      for_each_vcpu ( d, v )
>      {
>          vcpu_pause(v);
>          if ( v->vpci.mem && v->vpci.pdev == pdev)

Nit: Same style issue as in the original patch.

>          {
>              rangeset_destroy(v->vpci.mem);
>              v->vpci.mem = NULL;
>          }
>          vcpu_unpause(v);
>      }
>      domain_unlock(d);
> }
> 
> which seems to solve all the concerns. Is this what you mean?

Something along these lines. I expect you will want to make use of
domain_pause_except_self(), and I don't understand the purpose of
acquiring the domain lock.
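
A minimal sketch of that shape (untested, re-using the fields from your
draft above, and glossing over error handling that would still be needed)
might be:

    void vpci_cancel_pending(const struct pci_dev *pdev)
    {
        struct domain *d = pdev->domain;
        struct vcpu *v;

        /*
         * Pause all other vCPUs so none of them can be in the middle of
         * the pending work being cancelled.  Note that
         * domain_pause_except_self() may fail, which a real implementation
         * would need to handle rather than bail out silently.
         */
        if ( domain_pause_except_self(d) )
            return;

        for_each_vcpu ( d, v )
        {
            if ( v->vpci.mem && v->vpci.pdev == pdev )
            {
                rangeset_destroy(v->vpci.mem);
                v->vpci.mem = NULL;
            }
        }

        domain_unpause_except_self(d);
    }

With the other vCPUs paused there shouldn't be a need for the domain lock.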

Jan
