[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 06/18] IOMMU/x86: restrict IO-APIC mappings for PV Dom0


  • To: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Wed, 1 Dec 2021 12:45:12 +0100
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=j4IVWEtRk6K7m+zrtbeKk+dr4fXqtN0FFcTY7TgJ9+w=; b=bqwMwrez+nWoEEIOxeNhi1BuZZw7iyS7d53SJcfKEAA2TUi4kR9y89sn/1ncvcb5KXpUXCSFk6mIxp7KqyytUWrZEl+ABZJMxZFrOyoRCqq+WX2zJ9JlAxf9HKHy6OsAlZEH1hX/h4jC7Bo/WOH8bRGROhHZoDpkuDr33lQ6NFBw6mcWKxJEh0Ap3rvH0KFvUNAidgj7rmpGloVg/4pNlx1vek1PwfATkM3fdiJR4kZPB2LnEEpmVElITkYWfwqB5nEijl9BHptroJU32alwJoNey4hz2cmVyMOEFdjL+NBi6i90LPskiS/8yV//1Xev7lqB1WKVoeDJ4t40rLUerQ==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=HCkxJcOgo22WD8K9dSfifIEBuNMYDhiPh8wg7IOpNTknmXC5b6vttTx+B3hU7+wWfvMP8UUWJg8zWfN5ztc/jOG3OvprQbyaozXtm3WiMDdAG5Tn0Si48HT9hTsP6HppoXhHs/wm7v1K7hwfRHp1BD6mTRgsqGfg6yM86oVHjoRlHu25/VFF9OH2VOTZKyi/zhYxwkgPjedyTlmZLsYdlmBMdCbX0MU1PkmNiIAql7NyHF2K+teEyGwex1AvvxdgQVKOld1t9IbIYiqVezbKpD5NXNhx4Qk3zm8J7UQ06dplqQolkE8cOQl2I53YaKpYrLBYZafxrqrcfs5/2hTBOA==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=suse.com;
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Paul Durrant <paul@xxxxxxx>
  • Delivery-date: Wed, 01 Dec 2021 11:45:47 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 01.12.2021 11:32, Roger Pau Monné wrote:
> On Wed, Dec 01, 2021 at 10:27:21AM +0100, Jan Beulich wrote:
>> On 01.12.2021 10:09, Roger Pau Monné wrote:
>>> On Fri, Sep 24, 2021 at 11:46:57AM +0200, Jan Beulich wrote:
>>>> @@ -267,44 +267,60 @@ static bool __hwdom_init hwdom_iommu_map
>>>>       * that fall in unusable ranges for PV Dom0.
>>>>       */
>>>>      if ( (pfn > max_pfn && !mfn_valid(mfn)) || xen_in_range(pfn) )
>>>> -        return false;
>>>> +        return 0;
>>>>  
>>>>      switch ( type = page_get_ram_type(mfn) )
>>>>      {
>>>>      case RAM_TYPE_UNUSABLE:
>>>> -        return false;
>>>> +        return 0;
>>>>  
>>>>      case RAM_TYPE_CONVENTIONAL:
>>>>          if ( iommu_hwdom_strict )
>>>> -            return false;
>>>> +            return 0;
>>>>          break;
>>>>  
>>>>      default:
>>>>          if ( type & RAM_TYPE_RESERVED )
>>>>          {
>>>>              if ( !iommu_hwdom_inclusive && !iommu_hwdom_reserved )
>>>> -                return false;
>>>> +                perms = 0;
>>>>          }
>>>> -        else if ( is_hvm_domain(d) || !iommu_hwdom_inclusive || pfn > 
>>>> max_pfn )
>>>> -            return false;
>>>> +        else if ( is_hvm_domain(d) )
>>>> +            return 0;
>>>> +        else if ( !iommu_hwdom_inclusive || pfn > max_pfn )
>>>> +            perms = 0;
>>>
>>> I'm confused about the reason to set perms = 0 instead of just
>>> returning here. AFAICT perms won't be set to any other value below,
>>> so you might as well just return 0.
>>
>> This is so that ...
>>
>>>>      }
>>>>  
>>>>      /* Check that it doesn't overlap with the Interrupt Address Range. */
>>>>      if ( pfn >= 0xfee00 && pfn <= 0xfeeff )
>>>> -        return false;
>>>> +        return 0;
>>>>      /* ... or the IO-APIC */
>>>> -    for ( i = 0; has_vioapic(d) && i < d->arch.hvm.nr_vioapics; i++ )
>>>> -        if ( pfn == PFN_DOWN(domain_vioapic(d, i)->base_address) )
>>>> -            return false;
>>>> +    if ( has_vioapic(d) )
>>>> +    {
>>>> +        for ( i = 0; i < d->arch.hvm.nr_vioapics; i++ )
>>>> +            if ( pfn == PFN_DOWN(domain_vioapic(d, i)->base_address) )
>>>> +                return 0;
>>>> +    }
>>>> +    else if ( is_pv_domain(d) )
>>>> +    {
>>>> +        /*
>>>> +         * Be consistent with CPU mappings: Dom0 is permitted to 
>>>> establish r/o
>>>> +         * ones there, so it should also have such established for IOMMUs.
>>>> +         */
>>>> +        for ( i = 0; i < nr_ioapics; i++ )
>>>> +            if ( pfn == PFN_DOWN(mp_ioapics[i].mpc_apicaddr) )
>>>> +                return rangeset_contains_singleton(mmio_ro_ranges, pfn)
>>>> +                       ? IOMMUF_readable : 0;
>>>> +    }
>>
>> ... this return, as per the comment, takes precedence over returning
>> zero.
> 
> I see. This is because you want to map those in the IOMMU page tables
> even if the IO-APIC ranges are outside of a reserved region.
> 
> I have to admit this is kind of weird, because the purpose of this
> function is to add mappings for all memory below 4G, and/or for all
> reserved regions.

Well, that was what it started out as. The purpose here is to be consistent
about IO-APICs: Either have them all mapped, or none of them. Since we map
them in the CPU page tables and since Andrew asked for the two mappings to
be consistent, this is the only way to satisfy the requests. Personally I'd
be okay with not mapping IO-APICs here (but then regardless of whether they
are covered by a reserved region).

> I also wonder whether we should kind of generalize the handling of RO
> regions in the IOMMU for PV dom0 by using mmio_ro_ranges instead? Ie:
> we could loop around the RO ranges in PV dom0 build and map them.

We shouldn't, for example because of ...

> FWIW MSI-X tables are also RO, but adding and removing those to the
> IOMMU might be quite complex as we have to track the memory decoding
> and MSI-X enable bits.

... these: Dom0 shouldn't have a need for mappings of these tables. It's
bad enough that we need to map them in the CPU page tables.

But yes, if the goal is to map stuff uniformly in CPU and IOMMU, then
what you suggest would look to be a reasonable approach.

> And we are likely missing a check for iomem_access_permitted in
> hwdom_iommu_map?

This would be a documentation-only check: The pages have permissions
removed when not in mmio_ro_ranges (see dom0_setup_permissions()).
IOW their presence there is an indication of permissions having been
granted.

>>>> @@ -346,15 +362,19 @@ void __hwdom_init arch_iommu_hwdom_init(
>>>>      for ( ; i < top; i++ )
>>>>      {
>>>>          unsigned long pfn = pdx_to_pfn(i);
>>>> +        unsigned int perms = hwdom_iommu_map(d, pfn, max_pfn);
>>>>          int rc;
>>>>  
>>>> -        if ( !hwdom_iommu_map(d, pfn, max_pfn) )
>>>> +        if ( !perms )
>>>>              rc = 0;
>>>>          else if ( paging_mode_translate(d) )
>>>> -            rc = set_identity_p2m_entry(d, pfn, p2m_access_rw, 0);
>>>> +            rc = set_identity_p2m_entry(d, pfn,
>>>> +                                        perms & IOMMUF_writable ? 
>>>> p2m_access_rw
>>>> +                                                                : 
>>>> p2m_access_r,
>>>> +                                        0);
>>>>          else
>>>>              rc = iommu_map(d, _dfn(pfn), _mfn(pfn), 1ul << PAGE_ORDER_4K,
>>>> -                           IOMMUF_readable | IOMMUF_writable, 
>>>> &flush_flags);
>>>> +                           perms, &flush_flags);
>>>
>>> You could just call set_identity_p2m_entry uniformly here. It will
>>> DTRT for non-translated guests also, and then hwdom_iommu_map could
>>> perhaps return a p2m_access_t?
>>
>> That's an orthogonal change imo, i.e. could be done as a prereq change,
>> but I'd prefer to leave it as is for now. Furthermore see "x86/mm: split
>> set_identity_p2m_entry() into PV and HVM parts": In v2 I'm now also
>> adjusting the code here 
> 
> I would rather adjust the code here to just call
> set_identity_p2m_entry instead of differentiating between PV and
> HVM.

I'm a little hesitant, in particular considering your suggestion to
then have hwdom_iommu_map() return p2m_access_t. Andrew has made quite
clear that the use of p2m_access_* here and in a number of other places
is actually an abuse.

Plus - forgot about this in my earlier reply - there would also be a
conflict with the next patch in this series, where larger orders will
get passed to iommu_map(), while set_identity_p2m_entry() has no
respective parameter yet (and imo it isn't urgent for it to gain one).

Jan




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.