[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

RE: [PATCH v4] IOMMU: make DMA containment of quarantined devices optional



> -----Original Message-----
> From: Jan Beulich <jbeulich@xxxxxxxx>
> Sent: 27 November 2020 16:46
> To: xen-devel@xxxxxxxxxxxxxxxxxxxx
> Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>; Paul Durrant <paul@xxxxxxx>; 
> Kevin Tian
> <kevin.tian@xxxxxxxxx>
> Subject: [PATCH v4] IOMMU: make DMA containment of quarantined devices 
> optional
> 
> Containing still in flight DMA was introduced to work around certain
> devices / systems hanging hard upon hitting a "not-present" IOMMU fault.
> Passing through (such) devices (on such systems) is inherently insecure
> (as guests could easily arrange for IOMMU faults of any kind to occur).
> Defaulting to a mode where admins may not even become aware of issues
> with devices can be considered undesirable. Therefore convert this mode
> of operation to an optional one, not one enabled by default.
> 
> This involves resurrecting code commit ea38867831da ("x86 / iommu: set
> up a scratch page in the quarantine domain") did remove, in a slightly
> extended and abstracted fashion. Here, instead of reintroducing a pretty
> pointless use of "goto" in domain_context_unmap(), and instead of making
> the function (at least temporarily) inconsistent, take the opportunity
> and replace the other similarly pointless "goto" as well.
> 
> In order to key the re-instated bypasses off of there (not) being a root
> page table this further requires moving the allocate_domain_resources()
> invocation from reassign_device() to amd_iommu_setup_domain_device() (or
> else reassign_device() would allocate a root page table anyway); this is
> benign to the second caller of the latter function.
> 
> Take the opportunity and also limit the control to builds supporting
> PCI.
> 
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> ---
> v4: "full" -> "scratch_page". Duplicate Kconfig help text into command
>     line doc. Re-base.
> v3: IOMMU_quarantine_basic -> IOMMU_quarantine_fault,
>     IOMMU_quarantine_full -> IOMMU_quarantine_write_fault. Kconfig
>     option (choice) to select default. Limit to HAS_PCI.
> v2: Don't use true/false. Introduce QUARANTINE_SKIP() (albeit I'm not
>     really convinced this is an improvement). Add comment.
> 
> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -1278,7 +1278,7 @@ detection of systems known to misbehave
>  > Default: `new` unless directed-EOI is supported
> 
>  ### iommu
> -    = List of [ <bool>, verbose, debug, force, required, quarantine,
> +    = List of [ <bool>, verbose, debug, force, required, 
> quarantine[=scratch-page],
>                  sharept, intremap, intpost, crash-disable,
>                  snoop, qinval, igfx, amd-iommu-perdev-intremap,
>                  dom0-{passthrough,strict} ]
> @@ -1316,11 +1316,32 @@ boolean (e.g. `iommu=no`) can override t
>      will prevent Xen from booting if IOMMUs aren't discovered and enabled
>      successfully.
> 
> -*   The `quarantine` boolean can be used to control Xen's behavior when
> -    de-assigning devices from guests.  If enabled (the default), Xen always
> +*   The `quarantine` option can be used to control Xen's behavior when
> +    de-assigning devices from guests.
> +
> +    When a PCI device is assigned to an untrusted domain, it is possible
> +    for that domain to program the device to DMA to an arbitrary address.
> +    The IOMMU is used to protect the host from malicious DMA by making
> +    sure that the device addresses can only target memory assigned to the
> +    guest.  However, when the guest domain is torn down, assigning the
> +    device back to the hardware domain would allow any in-flight DMA to
> +    potentially target critical host data.  To avoid this, quarantining
> +    should be enabled.  Quarantining can be done in two ways: In its basic
> +    form, all in-flight DMA will simply be forced to encounter IOMMU
> +    faults.  Since there are systems where doing so can cause host lockup,
> +    an alternative form is available where writes to memory will be made
> +    fault, but reads will be directed to a dummy page.  The implication
> +    here is that such reads will go unnoticed, i.e. an admin may not
> +    become aware of the underlying problem.
> +
> +    Therefore, if this option is set to true (the default), Xen always
>      quarantines such devices; they must be explicitly assigned back to Dom0
> -    before they can be used there again.  If disabled, Xen will only
> -    quarantine devices the toolstack hass arranged for getting quarantined.
> +    before they can be used there again.  If set to "scratch-page", still
> +    active DMA reads will additionally be directed to a "scratch" page.  If

There's inconsistency of terms here. We should choose either 'dummy page' or 
'scratch page' (and my vote goes for the latter). Also, rather than true or 
false, shouldn't we have 'off', 'basic', and 'scratch-page'?

  Paul




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.