[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v5 02/15] IOMMU/x86: perform PV Dom0 mappings in batches


  • To: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Wed, 1 Jun 2022 09:30:07 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=dELefG2bSYXeyZV9CN/nMhdgDni/uZK8NPZC86J7h1c=; b=Lo8nD2mh/zho30wG1xMZ90+i1u38DGBZWM/zLl0j5EZbd0UqjTG9G1o7q17L4wk1MYZ0Or23lMzbnQEKqDABW4jLw43rCzV1WN5zleGj0R25oB3v2GjShm3SjP34CcWlS0rPDyc9zlLg6DGBycybPxDZqp36RixB2UjAECRtns3A7Y8+LNsM5uxfAZXwxoyYtaTDHJ8aEJ/gWN5uv/tI4sgLNbdxV0YhQ7ahIL3IM+s0oR5VOisOGijTxLKYdLbRnIepmNFaS6FyM1dgQJOplnci+v/5rG2KNsmfIe83GBUYUhDS0xWKYNx2kMNYL8DjPnydgDpagu+R71cresfNIw==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=Ef2+BMHYaTIonF1OHWt1+Gznr354YUShykNly5hgjYP+9YfMBKmOQvvDRdHltoydSZKPlBtskNhVNR1zNxURW8DtSDImtviQ7c7TYLlhRY0wP+aYeI7m/YguN5kMJSHWtEa+5zoPDhKKgrfEhdUqi7t39Ys71aTSpRw/w7Etr6cN2zSzNim0ULp+WFD2WI3/Vjr+1xDD6nWoCn5+2ql5hHtaVXd28LLmGiZDWiqzv1wXBA/XJizUHl2VkxGGkmcCczPoV/qw7bezVbzYHCTxETu8yUWLMifsmU1ip6aIEjPL4SrNcAlBXuBEbbE2+4TFL3a1H3PcZuwmdbWWtGWd+A==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=suse.com;
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Paul Durrant <paul@xxxxxxx>, Wei Liu <wl@xxxxxxx>
  • Delivery-date: Wed, 01 Jun 2022 07:30:20 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 31.05.2022 18:01, Roger Pau Monné wrote:
> On Fri, May 27, 2022 at 01:12:48PM +0200, Jan Beulich wrote:
>> For large page mappings to be easily usable (i.e. in particular without
>> un-shattering of smaller page mappings) and for mapping operations to
>> then also be more efficient, pass batches of Dom0 memory to iommu_map().
>> In dom0_construct_pv() and its helpers (covering strict mode) this
>> additionally requires establishing the type of those pages (albeit with
>> zero type references).
>>
>> The earlier establishing of PGT_writable_page | PGT_validated requires
>> the existing places where this gets done (through get_page_and_type())
>> to be updated: For pages which actually have a mapping, the type
>> refcount needs to be 1.
>>
>> There is actually a related bug that gets fixed here as a side effect:
>> Typically the last L1 table would get marked as such only after
>> get_page_and_type(..., PGT_writable_page). While this is fine as far as
>> refcounting goes, the page did remain mapped in the IOMMU in this case
>> (when "iommu=dom0-strict").
>>
>> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> 
> Acked-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>

Thanks.

>> ---
>> Subsequently p2m_add_identity_entry() may want to also gain an order
>> parameter, for arch_iommu_hwdom_init() to use. While this only affects
>> non-RAM regions, systems typically have 2-16Mb of reserved space
>> immediately below 4Gb, which hence could be mapped more efficiently.
>>
>> Eventually we may want to overhaul this logic to use a rangeset based
>> approach instead, punching holes into originally uniformly large-page-
>> mapped regions. Doing so right here would first and foremost be yet more
>> of a change.
>>
>> The installing of zero-ref writable types has in fact shown (observed
>> while putting together the change) that despite the intention by the
>> XSA-288 changes (affecting DomU-s only) for Dom0 a number of
>> sufficiently ordinary pages (at the very least initrd and P2M ones as
>> well as pages that are part of the initial allocation but not part of
>> the initial mapping) still have been starting out as PGT_none, meaning
>> that they would have gained IOMMU mappings only the first time these
>> pages would get mapped writably. Consequently an open question is
>> whether iommu_memory_setup() should set the pages to PGT_writable_page
>> independent of need_iommu_pt_sync().
> 
> Hm, I see, non strict PV dom0s won't get the pages set to
> PGT_writable_page even when accessible by devices by virtue of such
> domain having all RAM mapped in the IOMMU page-tables.
> 
> I guess it does make sense to also have the pages set as
> PGT_writable_page by default in that case, as tthe pages _are_
> writable by the IOMMU.  Do pages added during runtime (ie: ballooned
> in) also get PGT_writable_page set?

Yes, by virtue of going through guest_physmap_add_page().

>> @@ -406,20 +406,41 @@ void __hwdom_init arch_iommu_hwdom_init(
>>          if ( !perms )
>>              rc = 0;
>>          else if ( paging_mode_translate(d) )
>> +        {
>>              rc = p2m_add_identity_entry(d, pfn,
>>                                          perms & IOMMUF_writable ? 
>> p2m_access_rw
>>                                                                  : 
>> p2m_access_r,
>>                                          0);
>> +            if ( rc )
>> +                printk(XENLOG_WARNING
>> +                       "%pd: identity mapping of %lx failed: %d\n",
>> +                       d, pfn, rc);
>> +        }
>> +        else if ( pfn != start + count || perms != start_perms )
>> +        {
>> +        commit:
>> +            rc = iommu_map(d, _dfn(start), _mfn(start), count, start_perms,
>> +                           &flush_flags);
>> +            if ( rc )
>> +                printk(XENLOG_WARNING
>> +                       "%pd: IOMMU identity mapping of [%lx,%lx) failed: 
>> %d\n",
>> +                       d, pfn, pfn + count, rc);
>> +            SWAP(start, pfn);
>> +            start_perms = perms;
>> +            count = 1;
>> +        }
>>          else
>> -            rc = iommu_map(d, _dfn(pfn), _mfn(pfn), 1ul << PAGE_ORDER_4K,
>> -                           perms, &flush_flags);
>> +        {
>> +            ++count;
>> +            rc = 0;
>> +        }
>>  
>> -        if ( rc )
>> -            printk(XENLOG_WARNING "%pd: identity %smapping of %lx failed: 
>> %d\n",
>> -                   d, !paging_mode_translate(d) ? "IOMMU " : "", pfn, rc);
>>  
>> -        if (!(i & 0xfffff))
>> +        if ( !(++i & 0xfffff) )
>>              process_pending_softirqs();
>> +
>> +        if ( i == top && count )
> 
> Nit: do you really need to check for count != 0? AFAICT this is only
> possible in the first iteration.

Yes, to avoid taking the PV path for PVH on the last iteration (count
remains zero for PVH throughout the entire loop).

Jan




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.