
Re: [Xen-devel] [PATCH v8 6/6] x86/iommu: add map-reserved dom0-iommu option to map reserved memory ranges



> -----Original Message-----
> From: Roger Pau Monne [mailto:roger.pau@xxxxxxxxxx]
> Sent: 07 September 2018 10:08
> To: xen-devel@xxxxxxxxxxxxxxxxxxxx
> Cc: Roger Pau Monne <roger.pau@xxxxxxxxxx>; Andrew Cooper
> <Andrew.Cooper3@xxxxxxxxxx>; George Dunlap
> <George.Dunlap@xxxxxxxxxx>; Ian Jackson <Ian.Jackson@xxxxxxxxxx>; Jan
> Beulich <jbeulich@xxxxxxxx>; Julien Grall <julien.grall@xxxxxxx>; Konrad
> Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>; Stefano Stabellini
> <sstabellini@xxxxxxxxxx>; Tim (Xen.org) <tim@xxxxxxx>; Wei Liu
> <wei.liu2@xxxxxxxxxx>; Paul Durrant <Paul.Durrant@xxxxxxxxxx>; Suravee
> Suthikulpanit <suravee.suthikulpanit@xxxxxxx>; Brian Woods
> <brian.woods@xxxxxxx>; Kevin Tian <kevin.tian@xxxxxxxxx>
> Subject: [PATCH v8 6/6] x86/iommu: add map-reserved dom0-iommu option
> to map reserved memory ranges
> 
> Several people have reported hardware issues (malfunctioning USB
> controllers) due to iommu page faults on Intel hardware. Those faults
> are caused by missing RMRR (VTd) entries in the ACPI tables. Those can
> be worked around on VTd hardware by manually adding RMRR entries on
> the command line, but this is limited to Intel hardware and quite
> cumbersome to do.
> 
> In order to solve those issues add a new dom0-iommu=map-reserved
> option that identity maps all regions marked as reserved in the memory
> map. Note that regions used by devices emulated by Xen (LAPIC, IO-APIC
> or PCIe MCFG regions) are specifically avoided. Note that this option
> is available to all Dom0 modes (as opposed to the inclusive option
> which only works for PV Dom0).
> 
> Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> Reviewed-by: Kevin Tian <kevin.tian@xxxxxxxxx>
> Reviewed-by: Wei Liu <wei.liu2@xxxxxxxxxx>
> Acked-by: Jan Beulich <jbeulich@xxxxxxxx>

Reviewed-by: Paul Durrant <paul.durrant@xxxxxxxxxx>

> ---
> Changes since v7:
>  - Don't use true/false with int8_t.
>  - Print a warning message if map-reserved is set on ARM.
> 
> Changes since v6:
>  - Reword the map-reserved help to make it clear it's available to
>    both PV and PVH Dom0.
>  - Assign type inside of the switch expression.
>  - Remove the comment about IO-APIC MMIO relocation, this is not
>    supported ATM.
> 
> Changes since v5:
>  - Merge with the vpci MMCFG helper patch.
>  - Add a TODO item about the issues with relocating the LAPIC or
>    IOAPIC MMIO regions.
>  - Use the newly introduced page_get_ram_type that returns all the
>    types that fall between a page.
>  - Use paging_mode_translate instead of iommu_use_hap_pt when deciding
>    whether to use set_identity_p2m_entry or iommu_map_page.
> 
> Changes since v4:
>  - Use pfn_to_paddr.
>  - Rebase on top of previous changes.
>  - Change the default option setting to use if instead of a ternary
>    operator.
>  - Rename to map-reserved.
> 
> Changes since v3:
>  - Add mappings if the iommu page tables are shared.
> 
> Changes since v2:
>  - Fix comment regarding dom0-strict.
>  - Change documentation style of xen command line.
>  - Rename iommu_map to hwdom_iommu_map.
>  - Move all the checks to hwdom_iommu_map.
> 
> Changes since v1:
>  - Introduce a new reserved option instead of abusing the inclusive
>    option.
>  - Use the same helper function for PV and PVH in order to decide if a
>    page should be added to the domain page tables.
>  - Use the data inside of the domain struct to detect overlaps with
>    emulated MMIO regions.
> ---
> Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> Cc: George Dunlap <George.Dunlap@xxxxxxxxxxxxx>
> Cc: Ian Jackson <ian.jackson@xxxxxxxxxxxxx>
> Cc: Jan Beulich <jbeulich@xxxxxxxx>
> Cc: Julien Grall <julien.grall@xxxxxxx>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
> Cc: Stefano Stabellini <sstabellini@xxxxxxxxxx>
> Cc: Tim Deegan <tim@xxxxxxx>
> Cc: Wei Liu <wei.liu2@xxxxxxxxxx>
> Cc: Paul Durrant <paul.durrant@xxxxxxxxxx>
> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@xxxxxxx>
> Cc: Brian Woods <brian.woods@xxxxxxx>
> Cc: Kevin Tian <kevin.tian@xxxxxxxxx>
> ---
>  docs/misc/xen-command-line.markdown         |  9 ++++
>  xen/arch/x86/hvm/io.c                       |  5 ++
>  xen/drivers/passthrough/amd/pci_amd_iommu.c |  3 ++
>  xen/drivers/passthrough/arm/smmu.c          |  4 ++
>  xen/drivers/passthrough/iommu.c             |  5 +-
>  xen/drivers/passthrough/vtd/iommu.c         |  3 ++
>  xen/drivers/passthrough/x86/iommu.c         | 52 ++++++++++++++++++---
>  xen/include/asm-x86/hvm/io.h                |  3 ++
>  xen/include/xen/iommu.h                     |  2 +-
>  9 files changed, 78 insertions(+), 8 deletions(-)
> 
> diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
> index 98f0f3b68b..1ffd586224 100644
> --- a/docs/misc/xen-command-line.markdown
> +++ b/docs/misc/xen-command-line.markdown
> @@ -704,6 +704,15 @@ This list of booleans controls the iommu usage by Dom0:
>    option is only applicable to a PV Dom0 and is enabled by default on Intel
>    hardware.
> 
> +* `map-reserved`: sets up DMA remapping for all the reserved regions in the
> +  memory map for Dom0. Use this to work around firmware issues providing
> +  incorrect RMRR/IVMD entries. Rather than only mapping RAM pages for IOMMU
> +  accesses for Dom0, all memory regions marked as reserved in the memory map
> +  that don't overlap with any MMIO region from emulated devices will be
> +  identity mapped. This option maps a subset of the memory that would be
> +  mapped when using the `map-inclusive` option. This option is available to all
> +  Dom0 modes and is enabled by default on Intel hardware.
> +
>  ### dom0\_ioports\_disable (x86)
>  > `= List of <hex>-<hex>`
> 
> diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
> index 47d6c850ca..a5b0a23f06 100644
> --- a/xen/arch/x86/hvm/io.c
> +++ b/xen/arch/x86/hvm/io.c
> @@ -404,6 +404,11 @@ static const struct hvm_mmcfg *vpci_mmcfg_find(const struct domain *d,
>      return NULL;
>  }
> 
> +bool vpci_is_mmcfg_address(const struct domain *d, paddr_t addr)
> +{
> +    return vpci_mmcfg_find(d, addr);
> +}
> +
>  static unsigned int vpci_mmcfg_decode_addr(const struct hvm_mmcfg *mmcfg,
>                                             paddr_t addr, pci_sbdf_t *sbdf)
>  {
> diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c b/xen/drivers/passthrough/amd/pci_amd_iommu.c
> index 073d18bd10..330f9ce386 100644
> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
> @@ -256,6 +256,9 @@ static void __hwdom_init amd_iommu_hwdom_init(struct domain *d)
>      /* Inclusive IOMMU mappings are disabled by default on AMD hardware. */
>      if ( iommu_hwdom_inclusive == -1 )
>          iommu_hwdom_inclusive = 0;
> +    /* Reserved IOMMU mappings are disabled by default on AMD hardware. */
> +    if ( iommu_hwdom_reserved == -1 )
> +        iommu_hwdom_reserved = 0;
> 
>      if ( allocate_domain_resources(dom_iommu(d)) )
>          BUG();
> diff --git a/xen/drivers/passthrough/arm/smmu.c b/xen/drivers/passthrough/arm/smmu.c
> index a5158b0bdf..43ece42a50 100644
> --- a/xen/drivers/passthrough/arm/smmu.c
> +++ b/xen/drivers/passthrough/arm/smmu.c
> @@ -2732,6 +2732,10 @@ static void __hwdom_init arm_smmu_iommu_hwdom_init(struct domain *d)
>               printk(XENLOG_WARNING
>               "map-inclusive dom0-iommu option is not supported on ARM\n");
>       iommu_hwdom_inclusive = 0;
> +     if ( iommu_hwdom_reserved == 1 )
> +             printk(XENLOG_WARNING
> +             "map-reserved dom0-iommu option is not supported on ARM\n");
> +     iommu_hwdom_reserved = 0;
>  }
> 
>  static void arm_smmu_iommu_domain_teardown(struct domain *d)
> diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
> index 9552464bdc..a29bc13f8a 100644
> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -62,6 +62,7 @@ bool_t __read_mostly iommu_intremap = 1;
>  bool __hwdom_initdata iommu_hwdom_strict;
>  bool __read_mostly iommu_hwdom_passthrough;
>  int8_t __hwdom_initdata iommu_hwdom_inclusive = -1;
> +int8_t __hwdom_initdata iommu_hwdom_reserved = -1;
> 
>  /*
>   * In the current implementation of VT-d posted interrupts, in some extreme
> @@ -155,6 +156,8 @@ static int __init parse_dom0_iommu_param(const char *s)
>              iommu_hwdom_strict = val;
>          else if ( (val = parse_boolean("map-inclusive", s, ss)) >= 0 )
>              iommu_hwdom_inclusive = val;
> +        else if ( (val = parse_boolean("map-reserved", s, ss)) >= 0 )
> +            iommu_hwdom_reserved = val;
>          else
>              rc = -EINVAL;
> 
> @@ -236,7 +239,7 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
> 
>      hd->platform_ops->hwdom_init(d);
> 
> -    ASSERT(iommu_hwdom_inclusive != -1);
> +    ASSERT(iommu_hwdom_inclusive != -1 && iommu_hwdom_reserved != -1);
>      if ( iommu_hwdom_inclusive && !is_pv_domain(d) )
>      {
>          printk(XENLOG_WARNING
> diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
> index a09e02c8db..1121f5ff5b 100644
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -1307,6 +1307,9 @@ static void __hwdom_init intel_iommu_hwdom_init(struct domain *d)
>      /* Inclusive mappings are enabled by default on Intel hardware for PV. */
>      if ( iommu_hwdom_inclusive == -1 )
>          iommu_hwdom_inclusive = is_pv_domain(d);
> +    /* Reserved IOMMU mappings are enabled by default on Intel hardware. */
> +    if ( iommu_hwdom_reserved == -1 )
> +        iommu_hwdom_reserved = 1;
> 
>      setup_hwdom_pci_devices(d, setup_hwdom_device);
>      setup_hwdom_rmrr(d);
> diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c
> index 5809027573..47a078272a 100644
> --- a/xen/drivers/passthrough/x86/iommu.c
> +++ b/xen/drivers/passthrough/x86/iommu.c
> @@ -20,6 +20,7 @@
>  #include <xen/softirq.h>
>  #include <xsm/xsm.h>
> 
> +#include <asm/hvm/io.h>
>  #include <asm/setup.h>
> 
>  void iommu_update_ire_from_apic(
> @@ -139,17 +140,23 @@ static bool __hwdom_init hwdom_iommu_map(const struct domain *d,
>                                           unsigned long max_pfn)
>  {
>      mfn_t mfn = _mfn(pfn);
> +    unsigned int i, type;
> 
>      /*
>       * Set up 1:1 mapping for dom0. Default to include only conventional RAM
>       * areas and let RMRRs include needed reserved regions. When set, the
>       * inclusive mapping additionally maps in every pfn up to 4GB except those
> -     * that fall in unusable ranges.
> +     * that fall in unusable ranges for PV Dom0.
>       */
> -    if ( (pfn > max_pfn && !mfn_valid(mfn)) || xen_in_range(pfn) )
> +    if ( (pfn > max_pfn && !mfn_valid(mfn)) || xen_in_range(pfn) ||
> +         /*
> +          * Ignore any address below 1MB, that's already identity mapped by the
> +          * Dom0 builder for HVM.
> +          */
> +         (!d->domain_id && is_hvm_domain(d) && pfn < PFN_DOWN(MB(1))) )
>          return false;
> 
> -    switch ( page_get_ram_type(mfn) )
> +    switch ( type = page_get_ram_type(mfn) )
>      {
>      case RAM_TYPE_UNUSABLE:
>          return false;
> @@ -160,10 +167,40 @@ static bool __hwdom_init hwdom_iommu_map(const struct domain *d,
>          break;
> 
>      default:
> -        if ( !iommu_hwdom_inclusive || pfn > max_pfn )
> +        if ( type & RAM_TYPE_RESERVED )
> +        {
> +            if ( !iommu_hwdom_inclusive && !iommu_hwdom_reserved )
> +                return false;
> +        }
> +        else if ( is_hvm_domain(d) || !iommu_hwdom_inclusive || pfn > max_pfn )
>              return false;
>      }
> 
> +    /*
> +     * Check that it doesn't overlap with the LAPIC
> +     * TODO: if the guest relocates the MMIO area of the LAPIC Xen should make
> +     * sure there's nothing in the new address that would prevent trapping.
> +     */
> +    if ( has_vlapic(d) )
> +    {
> +        const struct vcpu *v;
> +
> +        for_each_vcpu(d, v)
> +            if ( pfn == PFN_DOWN(vlapic_base_address(vcpu_vlapic(v))) )
> +                return false;
> +    }
> +    /* ... or the IO-APIC */
> +    for ( i = 0; has_vioapic(d) && i < d->arch.hvm.nr_vioapics; i++ )
> +        if ( pfn == PFN_DOWN(domain_vioapic(d, i)->base_address) )
> +            return false;
> +    /*
> +     * ... or the PCIe MCFG regions.
> +     * TODO: runtime added MMCFG regions are not checked to make sure they
> +     * don't overlap with already mapped regions, thus preventing trapping.
> +     */
> +    if ( has_vpci(d) && vpci_is_mmcfg_address(d, pfn_to_paddr(pfn)) )
> +        return false;
> +
>      return true;
>  }
> 
> @@ -173,7 +210,7 @@ void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
> 
>      BUG_ON(!is_hardware_domain(d));
> 
> -    if ( iommu_hwdom_passthrough || !is_pv_domain(d) )
> +    if ( iommu_hwdom_passthrough )
>          return;
> 
>      max_pfn = (GB(4) >> PAGE_SHIFT) - 1;
> @@ -187,7 +224,10 @@ void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
>          if ( !hwdom_iommu_map(d, pfn, max_pfn) )
>              continue;
> 
> -        rc = iommu_map_page(d, pfn, pfn, IOMMUF_readable|IOMMUF_writable);
> +        if ( paging_mode_translate(d) )
> +            rc = set_identity_p2m_entry(d, pfn, p2m_access_rw, 0);
> +        else
> +            rc = iommu_map_page(d, pfn, pfn, IOMMUF_readable|IOMMUF_writable);
>          if ( rc )
>              printk(XENLOG_WARNING " d%d: IOMMU mapping failed: %d\n",
>                     d->domain_id, rc);
> diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
> index 8c83fd0c8b..7ceb119b64 100644
> --- a/xen/include/asm-x86/hvm/io.h
> +++ b/xen/include/asm-x86/hvm/io.h
> @@ -185,6 +185,9 @@ int register_vpci_mmcfg_handler(struct domain *d, paddr_t addr,
>  /* Destroy tracked MMCFG areas. */
>  void destroy_vpci_mmcfg(struct domain *d);
> 
> +/* Check if an address is within an MMCFG region for a domain. */
> +bool vpci_is_mmcfg_address(const struct domain *d, paddr_t addr);
> +
>  #endif /* __ASM_X86_HVM_IO_H__ */
> 
> 
> diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
> index 89c6830689..57c4e81ec6 100644
> --- a/xen/include/xen/iommu.h
> +++ b/xen/include/xen/iommu.h
> @@ -37,7 +37,7 @@ extern bool_t iommu_debug;
>  extern bool_t amd_iommu_perdev_intremap;
> 
>  extern bool iommu_hwdom_strict, iommu_hwdom_passthrough;
> -extern int8_t iommu_hwdom_inclusive;
> +extern int8_t iommu_hwdom_inclusive, iommu_hwdom_reserved;
> 
>  extern unsigned int iommu_dev_iotlb_timeout;
> 
> --
> 2.18.0

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 

