
Re: [PATCH v2 07/18] IOMMU/x86: perform PV Dom0 mappings in batches


  • To: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Fri, 10 Dec 2021 12:41:31 +0100
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Paul Durrant <paul@xxxxxxx>, Wei Liu <wl@xxxxxxx>
  • Delivery-date: Fri, 10 Dec 2021 11:41:46 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 10.12.2021 10:36, Roger Pau Monné wrote:
> On Fri, Dec 03, 2021 at 01:38:48PM +0100, Jan Beulich wrote:
>> On 02.12.2021 15:10, Roger Pau Monné wrote:
>>> On Fri, Sep 24, 2021 at 11:47:41AM +0200, Jan Beulich wrote:
>>>> @@ -720,6 +795,17 @@ int __init dom0_construct_pv(struct doma
>>>>      /* Pages that are part of page tables must be read only. */
>>>>      mark_pv_pt_pages_rdonly(d, l4start, vpt_start, nr_pt_pages);
>>>>  
>>>> +    /*
>>>> +     * This needs to come after all potentially excess
>>>> +     * get_page_and_type(..., PGT_writable_page) invocations; see the loop a
>>>> +     * few lines further up, where the effect of calling that function in an
>>>> +     * earlier loop iteration may get overwritten by a later one.
>>>> +     */
>>>> +    if ( need_iommu_pt_sync(d) &&
>>>> +         iommu_unmap(d, _dfn(PFN_DOWN(mpt_alloc) - nr_pt_pages), nr_pt_pages,
>>>> +                     &flush_flags) )
>>>> +        BUG();
>>>
>>> Wouldn't such unmap better happen as part of changing the types of the
>>> pages that become part of the guest page tables?
>>
>> Not sure - it's a single call here, but it would be a call per page if
>> e.g. moved into mark_pv_pt_pages_rdonly().
> 
> I see. So this would result in multiple calls when moved, plus all the
> involved page shattering and aggregation logic. Overall it would be
> less error prone, as the iommu unmap would happen next to the type
> change, but I'm not going to insist if you think it's not worth it.
> The page table structure pages shouldn't be that many anyway?

Typically it wouldn't be that many, true. I'm not sure about "less
error prone", though: we'd have bigger problems if the range unmapped
here didn't properly represent the set of page tables used.
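
For illustration only (not something I'd plan to submit as-is), the per-page
variant inside mark_pv_pt_pages_rdonly() might look roughly like this, with
"page" standing for the page-table page being processed and the flush_flags
plumbing into that function hand-waved:

    /*
     * Sketch only: per-page unmap next to the type change; assumes
     * flush_flags gets passed down, which the posted patch doesn't do.
     */
    if ( need_iommu_pt_sync(d) &&
         iommu_unmap(d, _dfn(mfn_x(page_to_mfn(page))), 1, &flush_flags) )
        BUG();

Besides the extra per-page calls, the need to pass flush_flags down is part
of why I'd prefer the single batched unmap above.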

>>>> @@ -840,22 +928,41 @@ int __init dom0_construct_pv(struct doma
>>>>  #endif
>>>>      while ( pfn < nr_pages )
>>>>      {
>>>> -        if ( (page = alloc_chunk(d, nr_pages - domain_tot_pages(d))) == NULL )
>>>> +        count = domain_tot_pages(d);
>>>> +        if ( (page = alloc_chunk(d, nr_pages - count)) == NULL )
>>>>              panic("Not enough RAM for DOM0 reservation\n");
>>>> +        mfn = mfn_x(page_to_mfn(page));
>>>> +
>>>> +        if ( need_iommu_pt_sync(d) )
>>>> +        {
>>>> +            rc = iommu_map(d, _dfn(mfn), _mfn(mfn), domain_tot_pages(d) - count,
>>>> +                           IOMMUF_readable | IOMMUF_writable, &flush_flags);
>>>> +            if ( rc )
>>>> +                printk(XENLOG_ERR
>>>> +                       "pre-mapping MFN %lx (PFN %lx) into IOMMU failed: %d\n",
>>>> +                       mfn, pfn, rc);
>>>> +        }
>>>> +
>>>>          while ( pfn < domain_tot_pages(d) )
>>>>          {
>>>> -            mfn = mfn_x(page_to_mfn(page));
>>>> +            if ( !rc )
>>>> +                make_pages_writable(page, 1);
>>>
>>> There's quite a lot of repetition of the pattern: allocate, iommu_map,
>>> set as writable. Would it be possible to abstract this into some
>>> kind of helper?
>>>
>>> I've realized some of the allocations use alloc_chunk while others use
>>> alloc_domheap_pages, so it might require some work.
>>
>> Right, I'd leave the allocation part aside for the moment. I had actually
>> considered folding iommu_map() and make_pages_writable() into a common
>> helper (or really renaming make_pages_writable() and folding iommu_map()
>> into it). What I lacked was a reasonable, not overly long name for such a
>> function.
> 
> I'm not overly good at naming, but I think we need to somehow find a
> way to place those together into a single helper.
> 
> I would be fine with naming this iommu_memory_{setup,add} or some
> such. Marking the pages as writable is a result (or a requirement
> might be a better way to express it?) of adding them to the IOMMU.
> Would you be OK with one of those names?

I'll use the suggestion as a basis and see how it ends up looking /
feeling.
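
To give a rough idea of the shape I have in mind (a sketch only, using the
suggested name; exact signature, placement, and error handling still to be
settled):

    /*
     * Sketch: map a physically contiguous range into the IOMMU and, on
     * success, also make the pages writable for the device's benefit.
     */
    static int __init iommu_memory_setup(struct domain *d,
                                         struct page_info *page,
                                         unsigned long nr,
                                         unsigned int *flush_flags)
    {
        int rc;
        mfn_t mfn = page_to_mfn(page);

        if ( !need_iommu_pt_sync(d) )
            return 0;

        rc = iommu_map(d, _dfn(mfn_x(mfn)), mfn, nr,
                       IOMMUF_readable | IOMMUF_writable, flush_flags);
        if ( rc )
        {
            printk(XENLOG_ERR
                   "pre-mapping MFN %lx (%lu pages) into IOMMU failed: %d\n",
                   mfn_x(mfn), nr, rc);
            return rc;
        }

        make_pages_writable(page, nr);

        return 0;
    }

The call sites in dom0_construct_pv() would then shrink to a single call
each, right after the respective allocation.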

>>>> @@ -372,16 +372,30 @@ void __hwdom_init arch_iommu_hwdom_init(
>>>>                                          perms & IOMMUF_writable ? p2m_access_rw
>>>>                                                                  : p2m_access_r,
>>>>                                          0);
>>>> +        else if ( pfn != start + count || perms != start_perms )
>>>> +        {
>>>> +        commit:
>>>> +            rc = iommu_map(d, _dfn(start), _mfn(start), count,
>>>> +                           start_perms, &flush_flags);
>>>> +            SWAP(start, pfn);
>>>> +            start_perms = perms;
>>>> +            count = 1;
>>>> +        }
>>>>          else
>>>> -            rc = iommu_map(d, _dfn(pfn), _mfn(pfn), 1ul << PAGE_ORDER_4K,
>>>> -                           perms, &flush_flags);
>>>> +        {
>>>> +            ++count;
>>>> +            rc = 0;
>>>> +        }
>>>>  
>>>>          if ( rc )
>>>>              printk(XENLOG_WARNING "%pd: identity %smapping of %lx failed: %d\n",
>>>>                     d, !paging_mode_translate(d) ? "IOMMU " : "", pfn, rc);
>>>
>>> Would be nice to print the count (or end pfn) in case it's a range.
>>
>> I can do so if you think it's worth the extra code. I can't use
>> "count" here in particular, as that was updated already (in context
>> above). The most reasonable change towards this would perhaps be to
>> duplicate the printk() into both the "if()" and the "else if()" bodies.
> 
> Maybe. The current message gives the impression that a single pfn has
> been added and failed, but without printing the range that failed the
> message will not be that helpful in diagnosing further issues that
> might arise due to the mapping failure.

I guess I'll make the change then. I'm still not really convinced though,
as the presence of the message should be far more concerning than whether
it's a single page or a range. As middle ground, would

             printk(XENLOG_WARNING "%pd: identity %smapping of %lx... failed: %d\n",

be indicative enough of this perhaps not having been just a single page?
Otoh splitting (and moving) the message would allow dropping the separate
paging_mode_translate() check.
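
I.e. something along these lines for the "else if()" body (a sketch only;
exact wording certainly open to bikeshedding):

        else if ( pfn != start + count || perms != start_perms )
        {
        commit:
            rc = iommu_map(d, _dfn(start), _mfn(start), count,
                           start_perms, &flush_flags);
            if ( rc )
                printk(XENLOG_WARNING
                       "%pd: IOMMU identity mapping of [%lx,%lx) failed: %d\n",
                       d, start, start + count, rc);
            SWAP(start, pfn);
            start_perms = perms;
            count = 1;
        }

with the translated-guest path keeping a single-PFN message of its own, at
which point the paging_mode_translate() conditional in the format string can
indeed go away.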

Jan
