[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v4 05/21] IOMMU/x86: restrict IO-APIC mappings for PV Dom0


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Wed, 4 May 2022 15:46:03 +0200
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Paul Durrant <paul@xxxxxxx>
  • Delivery-date: Wed, 04 May 2022 13:46:17 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Wed, May 04, 2022 at 03:19:16PM +0200, Jan Beulich wrote:
> On 04.05.2022 15:00, Roger Pau Monné wrote:
> > On Wed, May 04, 2022 at 02:12:58PM +0200, Jan Beulich wrote:
> >> On 04.05.2022 14:01, Roger Pau Monné wrote:
> >>> On Wed, May 04, 2022 at 12:51:25PM +0200, Jan Beulich wrote:
> >>>> On 04.05.2022 12:30, Roger Pau Monné wrote:
> >>>>> Right, ->iomem_caps is indeed too wide for our purpose.  What
> >>>>> about using something like:
> >>>>>
> >>>>> else if ( is_pv_domain(d) )
> >>>>> {
> >>>>>     if ( !iomem_access_permitted(d, pfn, pfn) )
> >>>>>         return 0;
> >>>>
> >>>> We can't return 0 here (as RAM pages also make it here when
> >>>> !iommu_hwdom_strict), so I can at best take this as a vague outline
> >>>> of what you really mean. And I don't want to rely on RAM pages being
> >>>> (imo wrongly) represented by set bits in Dom0's iomem_caps.
> >>>
> >>> Well, yes, my suggestion was taking into account that ->iomem_caps for
> >>> dom0 has mostly holes for things that shouldn't be mapped, but
> >>> otherwise contains everything else as allowed (including RAM).
> >>>
> >>> We could instead do:
> >>>
> >>> else if ( is_pv_domain(d) && type != RAM_TYPE_CONVENTIONAL )
> >>> {
> >>>     ...
> >>>
> >>> So that we don't rely on RAM being 'allowed' in ->iomem_caps?
> >>
> >> This would feel to me like excess special casing.
> > 
> > What about placing this in the 'default:' label on the type switch a
> > bit above?
> 
> I'd really like to stick to the present layout of where the special
> casing is done, with PV and PVH logic at least next to each other. I
> continue to think the construct I suggested (still visible below)
> would do.
> 
> >>>>>     if ( rangeset_contains_singleton(mmio_ro_ranges, pfn) )
> >>>>>         return IOMMUF_readable;
> >>>>> }
> >>>>>
> >>>>> That would get us a bit closer to allowed CPU side mappings, and we
> >>>>> don't need to special case IO-APIC or HPET addresses as those are
> >>>>> already added to ->iomem_caps or mmio_ro_ranges respectively by
> >>>>> dom0_setup_permissions().
> >>>>
> >>>> This won't fit in a region of code framed by a (split) comment
> >>>> saying "Check that it doesn't overlap with ...". Hence if anything
> >>>> I could put something like this further down. Yet even then the
> >>>> question remains what to do with ranges which pass
> >>>> iomem_access_permitted() but
> >>>> - aren't really MMIO,
> >>>> - are inside MMCFG,
> >>>> - are otherwise special.
> >>>>
> >>>> Or did you perhaps mean to suggest something like
> >>>>
> >>>> else if ( is_pv_domain(d) && iomem_access_permitted(d, pfn, pfn) &&
> >>>>           rangeset_contains_singleton(mmio_ro_ranges, pfn) )
> >>>>     return IOMMUF_readable;
> >>>
> >>> I don't think this would be fully correct, as we would still allow
> >>> mappings of IO-APIC pages explicitly banned in ->iomem_caps by not
> >>> handling those?
> >>
> >> CPU side mappings don't deal with the IO-APICs specifically. They only
> >> care about iomem_caps and mmio_ro_ranges. Hence explicitly banned
> >> IO-APIC pages cannot be mapped there either. (Of course we only do
> >> such banning if IO-APIC pages weren't possible to represent in
> >> mmio_ro_ranges, which should effectively be never.)
> > 
> > I think I haven't expressed myself correctly.
> > 
> > This construct won't return 0 for pfns not in iomem_caps, and hence
> > could allow mapping of addresses not in iomem_caps?
> 
> I'm afraid I don't understand: There's an iomem_access_permitted()
> in the conditional. How would this allow mapping pages outside of
> iomem_caps? The default case higher up has already forced perms to
> zero for any non-RAM page (unless iommu_hwdom_inclusive).

It was my understanding that when using iommu_hwdom_inclusive (or
iommu_hwdom_reserved if the IO-APIC page is a reserved region) we
still want to deny access to the IO-APIC page if it's not in
iomem_caps, and the proposed conditional won't do that.

So I guess the discussion is really whether
iommu_hwdom_{inclusive,reserved} take precedence over ->iomem_caps?

It seems a bit inconsistent IMO to enforce mmio_ro_ranges but not
->iomem_caps when using iommu_hwdom_{inclusive,reserved}.

> >>>> ? Then there would only remain the question of whether mapping r/o
> >>>> MMCFG pages is okay (I don't think it is), but that could then be
> >>>> special-cased similar to what's done further down for vPCI (by not
> >>>> returning in the "else if", but merely updating "perms").
> >>>
> >>> Well part of the point of this is to make CPU and Device mappings
> >>> more similar.  I don't think devices have any business in poking at
> >>> the MMCFG range, so it's fine to explicitly ban that range.  But I
> >>> would have also said the same for IO-APIC pages, so I'm unsure why
> >>> IO-APIC pages are fine to be mapped RO, but not the MMCFG range.
> >>
> >> I wouldn't have wanted to allow r/o mappings of the IO-APICs, but
> >> Linux plus the ACPI tables of certain vendors require us to permit
> >> this. If we didn't, Dom0 would crash there during boot.
> > 
> > Right, but those are required for the CPU only.  I think it's a fine
> > goal to try to have similar mappings for CPU and Devices, and then
> > that would also cover MMCFG in the PV case.  Or else it's fine to
> > assume CPU vs Device mappings will be slightly different, and then
> > not add any mappings for IO-APIC, HPET or MMCFG to the Device page
> > tables (likely there's more that could be added here).
> 
> It being different is what Andrew looks to strongly dislike. And I agree
> with this up to a certain point, i.e. I'm having a hard time seeing why
> we should put in MMCFG mappings just for this reason. But if consensus
> was that consistency across all types of MMIO is the goal, then I could
> live with also making MMCFG mappings ...

For HVM/PVH I think we want to be consistent as long as it's doable (we
can't provide devices access to the emulated MMCFG there, for example).

For PV I guess it's also a worthy goal if it makes the code easier.
PV (and PV dom0 especially) is already a very custom platform with
weird properties (like mapping the IO-APIC and HPET regions RO, or
not mapping them at all).

Thanks, Roger.
