
Re: [Xen-devel] [PATCH v7 for-next 04/12] x86/mmcfg: add handlers for the PVH Dom0 MMCFG areas



> -----Original Message-----
> From: Roger Pau Monne [mailto:roger.pau@xxxxxxxxxx]
> Sent: 18 October 2017 12:40
> To: xen-devel@xxxxxxxxxxxxxxxxxxxx
> Cc: konrad.wilk@xxxxxxxxxx; boris.ostrovsky@xxxxxxxxxx; Roger Pau Monne
> <roger.pau@xxxxxxxxxx>; Jan Beulich <jbeulich@xxxxxxxx>; Andrew Cooper
> <Andrew.Cooper3@xxxxxxxxxx>; Paul Durrant <Paul.Durrant@xxxxxxxxxx>
> Subject: [PATCH v7 for-next 04/12] x86/mmcfg: add handlers for the PVH
> Dom0 MMCFG areas
> 
> Introduce a set of handlers for accesses to the MMCFG areas. Those
> areas are set up based on the contents of the hardware MMCFG tables,
> and the list of handled MMCFG areas is stored inside the hvm_domain
> struct.
> 
> Reads and writes are forwarded to the generic vpci handlers once the
> address has been decoded to obtain the device and register the
> guest is trying to access.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>

Reviewed-by: Paul Durrant <paul.durrant@xxxxxxxxxx>
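
For readers following the decode step described in the commit message, here
is a minimal standalone sketch of how an MMCFG (ECAM) address splits into
bus/device/function and register. It is not the patch code: the struct
decoded type, the mmcfg_decode() helper and CFG_SPACE_EXP_SIZE are
hypothetical names for illustration, with CFG_SPACE_EXP_SIZE standing in for
PCI_CFG_SPACE_EXP_SIZE. Only the MMCFG_BDF() mask is taken from the patch,
and "base" is assumed to be the first address of the tracked region (that
is, already offset by start_bus << 20, as register_vpci_mmcfg_handler does).

```c
#include <assert.h>
#include <stdint.h>

/* Same mask as the MMCFG_BDF() macro added by the patch. */
#define MMCFG_BDF(addr)  (((addr) & 0x0ffff000) >> 12)

/* Stand-in for PCI_CFG_SPACE_EXP_SIZE: 4KiB of config space per function. */
#define CFG_SPACE_EXP_SIZE 4096u

/* Hypothetical result type, used here instead of Xen's pci_sbdf_t. */
struct decoded {
    uint16_t seg;
    uint8_t bus, dev, fn;
    unsigned int reg;
};

/*
 * Decode a guest physical address that falls inside an MMCFG region.
 * Each bus takes 1MiB, each device 32KiB and each function 4KiB.
 */
static struct decoded mmcfg_decode(uint64_t base, uint8_t start_bus,
                                   uint16_t seg, uint64_t addr)
{
    struct decoded d;
    uint64_t off;
    unsigned int bdf;

    assert(addr >= base);          /* caller must have matched the region */

    off = addr - base;
    bdf = (unsigned int)MMCFG_BDF(off);

    d.seg = seg;
    d.bus = (uint8_t)((bdf >> 8) + start_bus); /* bus is region relative */
    d.dev = (bdf >> 3) & 0x1f;
    d.fn  = bdf & 0x7;
    d.reg = (unsigned int)(off & (CFG_SPACE_EXP_SIZE - 1));

    return d;
}
```

With a region based at 0xe0000000 and start_bus 0, address 0xe0113010 would
decode to bus 1, device 2, function 3, register 0x10 under these assumptions.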

> ---
> Cc: Jan Beulich <jbeulich@xxxxxxxx>
> Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> Cc: Paul Durrant <paul.durrant@xxxxxxxxxx>
> ---
> Changes since v6:
>  - Move allocation of mmcfg outside of the locked region.
>  - Do proper overlap checks when adding mmcfg regions.
>  - Return _RETRY if the mcfg region cannot be found in the read/write
>    handlers. This means the mcfg area has been removed between the
>    accept and the read/write calls.
> 
> Changes since v5:
>  - Switch to use pci_sbdf_t.
>  - Switch to the new per vpci locks.
>  - Move the mmcfg related external definitions to asm-x86/pci.h.
> 
> Changes since v4:
>  - Change the attribute of pvh_setup_mmcfg to __hwdom_init.
>  - Try to add as many MMCFG regions as possible, even if one fails to
>    add.
>  - Change some fields of the hvm_mmcfg struct: turn size into an
>    unsigned int, segment into uint16_t and bus into uint8_t.
>  - Convert some address parameters from unsigned long to paddr_t for
>    consistency.
>  - Make vpci_mmcfg_decode_addr return the decoded register in the
>    return of the function.
>  - Introduce a new macro to convert a MMCFG address into a BDF, and
>    use it in vpci_mmcfg_decode_addr to clarify the logic.
>  - In vpci_mmcfg_{read/write} unify the logic for 8B accesses and
>    smaller ones.
>  - Add the __hwdom_init attribute to register_vpci_mmcfg_handler.
>  - Test that reg + size doesn't cross a device boundary.
> 
> Changes since v3:
>  - Propagate changes from previous patches: drop xen_ prefix for vpci
>    functions, pass slot and func instead of devfn and fix the error
>    paths of the MMCFG handlers.
>  - s/ecam/mmcfg/.
>  - Move the destroy code to a separate function, so the hvm_mmcfg
>    struct can be private to hvm/io.c.
>  - Constify the return of vpci_mmcfg_find.
>  - Use d instead of v->domain in vpci_mmcfg_accept.
>  - Allow 8byte accesses to the mmcfg.
> 
> Changes since v1:
>  - Added locking.
> ---
>  xen/arch/x86/hvm/dom0_build.c    |  21 +++++
>  xen/arch/x86/hvm/hvm.c           |   4 +
>  xen/arch/x86/hvm/io.c            | 174 ++++++++++++++++++++++++++++++++++++++-
>  xen/arch/x86/x86_64/mmconfig.h   |   4 -
>  xen/include/asm-x86/hvm/domain.h |   4 +
>  xen/include/asm-x86/hvm/io.h     |   7 ++
>  xen/include/asm-x86/pci.h        |   6 ++
>  7 files changed, 215 insertions(+), 5 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c
> index a67071c739..9e841c103d 100644
> --- a/xen/arch/x86/hvm/dom0_build.c
> +++ b/xen/arch/x86/hvm/dom0_build.c
> @@ -22,6 +22,7 @@
>  #include <xen/init.h>
>  #include <xen/libelf.h>
>  #include <xen/multiboot.h>
> +#include <xen/pci.h>
>  #include <xen/softirq.h>
> 
>  #include <acpi/actables.h>
> @@ -1049,6 +1050,24 @@ static int __init pvh_setup_acpi(struct domain *d, paddr_t start_info)
>      return 0;
>  }
> 
> +static void __hwdom_init pvh_setup_mmcfg(struct domain *d)
> +{
> +    unsigned int i;
> +    int rc;
> +
> +    for ( i = 0; i < pci_mmcfg_config_num; i++ )
> +    {
> +        rc = register_vpci_mmcfg_handler(d, pci_mmcfg_config[i].address,
> +                                         pci_mmcfg_config[i].start_bus_number,
> +                                         pci_mmcfg_config[i].end_bus_number,
> +                                         pci_mmcfg_config[i].pci_segment);
> +        if ( rc )
> +            printk("Unable to setup MMCFG handler at %#lx for segment %u\n",
> +                   pci_mmcfg_config[i].address,
> +                   pci_mmcfg_config[i].pci_segment);
> +    }
> +}
> +
>  int __init dom0_construct_pvh(struct domain *d, const module_t *image,
>                                unsigned long image_headroom,
>                                module_t *initrd,
> @@ -1091,6 +1110,8 @@ int __init dom0_construct_pvh(struct domain *d, const module_t *image,
>          return rc;
>      }
> 
> +    pvh_setup_mmcfg(d);
> +
>      panic("Building a PVHv2 Dom0 is not yet supported.");
>      return 0;
>  }
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index 8ed6718bf6..fd16d9c06f 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -581,8 +581,10 @@ int hvm_domain_initialise(struct domain *d, unsigned long domcr_flags,
>      spin_lock_init(&d->arch.hvm_domain.irq_lock);
>      spin_lock_init(&d->arch.hvm_domain.uc_lock);
>      spin_lock_init(&d->arch.hvm_domain.write_map.lock);
> +    rwlock_init(&d->arch.hvm_domain.mmcfg_lock);
>      INIT_LIST_HEAD(&d->arch.hvm_domain.write_map.list);
>      INIT_LIST_HEAD(&d->arch.hvm_domain.g2m_ioport_list);
> +    INIT_LIST_HEAD(&d->arch.hvm_domain.mmcfg_regions);
> 
>      rc = create_perdomain_mapping(d, PERDOMAIN_VIRT_START, 0, NULL, NULL);
>      if ( rc )
> @@ -728,6 +730,8 @@ void hvm_domain_destroy(struct domain *d)
>          list_del(&ioport->list);
>          xfree(ioport);
>      }
> +
> +    destroy_vpci_mmcfg(&d->arch.hvm_domain.mmcfg_regions);
>  }
> 
>  static int hvm_save_tsc_adjust(struct domain *d, hvm_domain_context_t *h)
> diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
> index 6c12cf5d22..f853739c7d 100644
> --- a/xen/arch/x86/hvm/io.c
> +++ b/xen/arch/x86/hvm/io.c
> @@ -283,7 +283,7 @@ unsigned int hvm_pci_decode_addr(unsigned int cf8, unsigned int addr,
>  static bool vpci_access_allowed(unsigned int reg, unsigned int len)
>  {
>      /* Check access size. */
> -    if ( len != 1 && len != 2 && len != 4 )
> +    if ( len != 1 && len != 2 && len != 4 && len != 8 )
>          return false;
> 
>      /* Check that access is size aligned. */
> @@ -381,6 +381,178 @@ void register_vpci_portio_handler(struct domain *d)
>      handler->ops = &vpci_portio_ops;
>  }
> 
> +struct hvm_mmcfg {
> +    struct list_head next;
> +    paddr_t addr;
> +    unsigned int size;
> +    uint16_t segment;
> +    uint8_t start_bus;
> +};
> +
> +/* Handlers to trap PCI MMCFG config accesses. */
> +static const struct hvm_mmcfg *vpci_mmcfg_find(const struct domain *d,
> +                                               paddr_t addr)
> +{
> +    const struct hvm_mmcfg *mmcfg;
> +
> +    list_for_each_entry ( mmcfg, &d->arch.hvm_domain.mmcfg_regions, next )
> +        if ( addr >= mmcfg->addr && addr < mmcfg->addr + mmcfg->size )
> +            return mmcfg;
> +
> +    return NULL;
> +}
> +
> +static unsigned int vpci_mmcfg_decode_addr(const struct hvm_mmcfg *mmcfg,
> +                                           paddr_t addr, pci_sbdf_t *sbdf)
> +{
> +    addr -= mmcfg->addr;
> +    sbdf->bdf = MMCFG_BDF(addr);
> +    sbdf->bus += mmcfg->start_bus;
> +    sbdf->seg = mmcfg->segment;
> +
> +    return addr & (PCI_CFG_SPACE_EXP_SIZE - 1);
> +}
> +
> +static int vpci_mmcfg_accept(struct vcpu *v, unsigned long addr)
> +{
> +    struct domain *d = v->domain;
> +    bool found;
> +
> +    read_lock(&d->arch.hvm_domain.mmcfg_lock);
> +    found = vpci_mmcfg_find(d, addr);
> +    read_unlock(&d->arch.hvm_domain.mmcfg_lock);
> +
> +    return found;
> +}
> +
> +static int vpci_mmcfg_read(struct vcpu *v, unsigned long addr,
> +                           unsigned int len, unsigned long *data)
> +{
> +    struct domain *d = v->domain;
> +    const struct hvm_mmcfg *mmcfg;
> +    unsigned int reg;
> +    pci_sbdf_t sbdf;
> +
> +    *data = ~0ul;
> +
> +    read_lock(&d->arch.hvm_domain.mmcfg_lock);
> +    mmcfg = vpci_mmcfg_find(d, addr);
> +    if ( !mmcfg )
> +    {
> +        read_unlock(&d->arch.hvm_domain.mmcfg_lock);
> +        return X86EMUL_RETRY;
> +    }
> +
> +    reg = vpci_mmcfg_decode_addr(mmcfg, addr, &sbdf);
> +    read_unlock(&d->arch.hvm_domain.mmcfg_lock);
> +
> +    if ( !vpci_access_allowed(reg, len) ||
> +         (reg + len) > PCI_CFG_SPACE_EXP_SIZE )
> +        return X86EMUL_OKAY;
> +
> +    /*
> +     * According to the PCIe 3.1A specification:
> +     *  - Configuration Reads and Writes must usually be DWORD or smaller
> +     *    in size.
> +     *  - Because Root Complex implementations are not required to support
> +     *    accesses to a RCRB that cross DW boundaries [...] software
> +     *    should take care not to cause the generation of such accesses
> +     *    when accessing a RCRB unless the Root Complex will support the
> +     *    access.
> +     *  Xen however supports 8byte accesses by splitting them into two
> +     *  4byte accesses.
> +     */
> +    *data = vpci_read(sbdf, reg, min(4u, len));
> +    if ( len == 8 )
> +        *data |= (uint64_t)vpci_read(sbdf, reg + 4, 4) << 32;
> +
> +    return X86EMUL_OKAY;
> +}
> +
> +static int vpci_mmcfg_write(struct vcpu *v, unsigned long addr,
> +                            unsigned int len, unsigned long data)
> +{
> +    struct domain *d = v->domain;
> +    const struct hvm_mmcfg *mmcfg;
> +    unsigned int reg;
> +    pci_sbdf_t sbdf;
> +
> +    read_lock(&d->arch.hvm_domain.mmcfg_lock);
> +    mmcfg = vpci_mmcfg_find(d, addr);
> +    if ( !mmcfg )
> +    {
> +        read_unlock(&d->arch.hvm_domain.mmcfg_lock);
> +        return X86EMUL_RETRY;
> +    }
> +
> +    reg = vpci_mmcfg_decode_addr(mmcfg, addr, &sbdf);
> +    read_unlock(&d->arch.hvm_domain.mmcfg_lock);
> +
> +    if ( !vpci_access_allowed(reg, len) ||
> +         (reg + len) > PCI_CFG_SPACE_EXP_SIZE )
> +        return X86EMUL_OKAY;
> +
> +    vpci_write(sbdf, reg, min(4u, len), data);
> +    if ( len == 8 )
> +        vpci_write(sbdf, reg + 4, 4, data >> 32);
> +
> +    return X86EMUL_OKAY;
> +}
> +
> +static const struct hvm_mmio_ops vpci_mmcfg_ops = {
> +    .check = vpci_mmcfg_accept,
> +    .read = vpci_mmcfg_read,
> +    .write = vpci_mmcfg_write,
> +};
> +
> +int __hwdom_init register_vpci_mmcfg_handler(struct domain *d, paddr_t addr,
> +                                             unsigned int start_bus,
> +                                             unsigned int end_bus,
> +                                             unsigned int seg)
> +{
> +    struct hvm_mmcfg *mmcfg, *new = xmalloc(struct hvm_mmcfg);
> +
> +    ASSERT(is_hardware_domain(d));
> +
> +    if ( !new )
> +        return -ENOMEM;
> +
> +    new->addr = addr + (start_bus << 20);
> +    new->start_bus = start_bus;
> +    new->segment = seg;
> +    new->size = (end_bus - start_bus + 1) << 20;
> +
> +    write_lock(&d->arch.hvm_domain.mmcfg_lock);
> +    list_for_each_entry ( mmcfg, &d->arch.hvm_domain.mmcfg_regions, next )
> +        if ( new->addr < mmcfg->addr + mmcfg->size &&
> +             mmcfg->addr < new->addr + new->size )
> +        {
> +            write_unlock(&d->arch.hvm_domain.mmcfg_lock);
> +            xfree(new);
> +            return -EEXIST;
> +        }
> +
> +    if ( list_empty(&d->arch.hvm_domain.mmcfg_regions) )
> +        register_mmio_handler(d, &vpci_mmcfg_ops);
> +
> +    list_add(&new->next, &d->arch.hvm_domain.mmcfg_regions);
> +    write_unlock(&d->arch.hvm_domain.mmcfg_lock);
> +
> +    return 0;
> +}
> +
> +void destroy_vpci_mmcfg(struct list_head *domain_mmcfg)
> +{
> +    while ( !list_empty(domain_mmcfg) )
> +    {
> +        struct hvm_mmcfg *mmcfg = list_first_entry(domain_mmcfg,
> +                                                   struct hvm_mmcfg, next);
> +
> +        list_del(&mmcfg->next);
> +        xfree(mmcfg);
> +    }
> +}
> +
>  /*
>   * Local variables:
>   * mode: C
> diff --git a/xen/arch/x86/x86_64/mmconfig.h b/xen/arch/x86/x86_64/mmconfig.h
> index 7537519414..2e836848ad 100644
> --- a/xen/arch/x86/x86_64/mmconfig.h
> +++ b/xen/arch/x86/x86_64/mmconfig.h
> @@ -74,10 +74,6 @@ static inline void mmio_config_writel(void __iomem *pos, u32 val)
>      asm volatile("movl %%eax,(%1)" :: "a" (val), "r" (pos) : "memory");
>  }
> 
> -/* external variable defines */
> -extern int pci_mmcfg_config_num;
> -extern struct acpi_mcfg_allocation *pci_mmcfg_config;
> -
>  /* function prototypes */
>  int acpi_parse_mcfg(struct acpi_table_header *header);
>  int pci_mmcfg_reserved(uint64_t address, unsigned int segment,
> diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-x86/hvm/domain.h
> index 7f128c05ff..d1d933d791 100644
> --- a/xen/include/asm-x86/hvm/domain.h
> +++ b/xen/include/asm-x86/hvm/domain.h
> @@ -184,6 +184,10 @@ struct hvm_domain {
>      /* List of guest to machine IO ports mapping. */
>      struct list_head g2m_ioport_list;
> 
> +    /* List of MMCFG regions trapped by Xen. */
> +    struct list_head mmcfg_regions;
> +    rwlock_t mmcfg_lock;
> +
>      /* List of permanently write-mapped pages. */
>      struct {
>          spinlock_t lock;
> diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
> index ff0bea5d53..55a0a67754 100644
> --- a/xen/include/asm-x86/hvm/io.h
> +++ b/xen/include/asm-x86/hvm/io.h
> @@ -163,6 +163,13 @@ void register_g2m_portio_handler(struct domain *d);
>  /* HVM port IO handler for vPCI accesses. */
>  void register_vpci_portio_handler(struct domain *d);
> 
> +/* HVM MMIO handler for PCI MMCFG accesses. */
> +int register_vpci_mmcfg_handler(struct domain *d, paddr_t addr,
> +                                unsigned int start_bus, unsigned int end_bus,
> +                                unsigned int seg);
> +/* Destroy tracked MMCFG areas. */
> +void destroy_vpci_mmcfg(struct list_head *domain_mmcfg);
> +
>  #endif /* __ASM_X86_HVM_IO_H__ */
> 
> 
> diff --git a/xen/include/asm-x86/pci.h b/xen/include/asm-x86/pci.h
> index 36801d317b..cc05045e9c 100644
> --- a/xen/include/asm-x86/pci.h
> +++ b/xen/include/asm-x86/pci.h
> @@ -6,6 +6,8 @@
>  #define CF8_ADDR_HI(cf8) (  ((cf8) & 0x0f000000) >> 16)
>  #define CF8_ENABLED(cf8) (!!((cf8) & 0x80000000))
> 
> +#define MMCFG_BDF(addr)  ( ((addr) & 0x0ffff000) >> 12)
> +
>  #define IS_SNB_GFX(id) (id == 0x01068086 || id == 0x01168086 \
>                          || id == 0x01268086 || id == 0x01028086 \
>                          || id == 0x01128086 || id == 0x01228086 \
> @@ -26,4 +28,8 @@ bool_t pci_mmcfg_decode(unsigned long mfn, unsigned int *seg,
>  bool_t pci_ro_mmcfg_decode(unsigned long mfn, unsigned int *seg,
>                             unsigned int *bdf);
> 
> +/* MMCFG external variable defines */
> +extern int pci_mmcfg_config_num;
> +extern struct acpi_mcfg_allocation *pci_mmcfg_config;
> +
>  #endif /* __X86_PCI_H__ */
> --
> 2.13.5 (Apple Git-94)
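
As a side note on the 8-byte handling in vpci_mmcfg_{read,write}: the idea
of serving one 8-byte access as two 4-byte accesses can be sketched in
isolation as below. This is only an illustration under stated assumptions,
not the patch code. The cfg[] array and the cfg_read()/cfg_write() helpers
are hypothetical stand-ins for vpci_read()/vpci_write() operating on a
single 4KiB little-endian config space, and access_allowed() folds the
size, alignment and boundary checks into one function.

```c
#include <stdint.h>

/* Hypothetical backing store for one function's 4KiB config space. */
static uint8_t cfg[4096];

/* Stand-in for vpci_read(): little-endian read of at most 4 bytes. */
static uint32_t cfg_read(unsigned int reg, unsigned int size)
{
    uint32_t val = 0;
    unsigned int i;

    for ( i = 0; i < size; i++ )
        val |= (uint32_t)cfg[reg + i] << (8 * i);

    return val;
}

/* Stand-in for vpci_write(): little-endian write of at most 4 bytes. */
static void cfg_write(unsigned int reg, unsigned int size, uint32_t data)
{
    unsigned int i;

    for ( i = 0; i < size; i++ )
        cfg[reg + i] = (uint8_t)(data >> (8 * i));
}

/* Accept 1, 2, 4 or 8 byte accesses that are size aligned and in range. */
static int access_allowed(unsigned int reg, unsigned int len)
{
    if ( len != 1 && len != 2 && len != 4 && len != 8 )
        return 0;
    if ( reg & (len - 1) )
        return 0;
    return reg + len <= sizeof(cfg);
}

/* Rejected reads return all ones, as the patch does with *data = ~0ul. */
static uint64_t mmcfg_read(unsigned int reg, unsigned int len)
{
    uint64_t data;

    if ( !access_allowed(reg, len) )
        return ~(uint64_t)0;

    data = cfg_read(reg, len < 4 ? len : 4);
    if ( len == 8 )
        data |= (uint64_t)cfg_read(reg + 4, 4) << 32;

    return data;
}

/* Rejected writes are silently dropped. */
static void mmcfg_write(unsigned int reg, unsigned int len, uint64_t data)
{
    if ( !access_allowed(reg, len) )
        return;

    cfg_write(reg, len < 4 ? len : 4, (uint32_t)data);
    if ( len == 8 )
        cfg_write(reg + 4, 4, (uint32_t)(data >> 32));
}
```

The low dword is always handled first, so an 8-byte access behaves like two
back-to-back dword accesses at reg and reg + 4.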

