WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
Xen 
 
Home Products Support Community News
 
   
 

xen-devel

Re: [Xen-devel] [PATCH V3 10/10] Introduce Xen PCI Passthrough, MSI (3/3

To: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
Subject: Re: [Xen-devel] [PATCH V3 10/10] Introduce Xen PCI Passthrough, MSI (3/3)
From: Anthony PERARD <anthony.perard@xxxxxxxxxx>
Date: Fri, 11 Nov 2011 19:18:42 +0000
Cc: Xen Devel <xen-devel@xxxxxxxxxxxxxxxxxxx>, Jiang, Stefano Stabellini <Stefano.Stabellini@xxxxxxxxxxxxx>, Yunhong <yunhong.jiang@xxxxxxxxx>, Shan Haitao <haitao.shan@xxxxxxxxx>, QEMU-devel <qemu-devel@xxxxxxxxxx>
Delivery-date: Fri, 11 Nov 2011 11:19:45 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <20111110221027.GB23837@xxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <1319814456-8158-1-git-send-email-anthony.perard@xxxxxxxxxx> <1319814456-8158-11-git-send-email-anthony.perard@xxxxxxxxxx> <20111110221027.GB23837@xxxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Alpine 2.00 (DEB 1167 2008-08-23)
On Thu, 10 Nov 2011, Konrad Rzeszutek Wilk wrote:

> On Fri, Oct 28, 2011 at 04:07:36PM +0100, Anthony PERARD wrote:
> > From: Jiang Yunhong <yunhong.jiang@xxxxxxxxx>
> >
> > Signed-off-by: Jiang Yunhong <yunhong.jiang@xxxxxxxxx>
> > Signed-off-by: Shan Haitao <haitao.shan@xxxxxxxxx>
> > Signed-off-by: Anthony PERARD <anthony.perard@xxxxxxxxxx>
> > ---
> >  Makefile.target                      |    1 +
> >  hw/apic-msidef.h                     |    2 +
> >  hw/xen_pci_passthrough.c             |   27 ++-
> >  hw/xen_pci_passthrough.h             |   55 +++
> >  hw/xen_pci_passthrough_config_init.c |  495 +++++++++++++++++++++++++-
> >  hw/xen_pci_passthrough_msi.c         |  667 
> > ++++++++++++++++++++++++++++++++++
> >  6 files changed, 1240 insertions(+), 7 deletions(-)
> >  create mode 100644 hw/xen_pci_passthrough_msi.c
> >

[...]

> > +/* write Message Upper Address register */
> > +static int pt_msgaddr64_reg_write(XenPCIPassthroughState *s,
> > +                                  XenPTReg *cfg_entry, uint32_t *value,
> > +                                  uint32_t dev_value, uint32_t valid_mask)
> > +{
> > +    XenPTRegInfo *reg = cfg_entry->reg;
> > +    uint32_t writable_mask = 0;
> > +    uint32_t throughable_mask = 0;
> > +    uint32_t old_addr = cfg_entry->data;
> > +
> > +    /* check whether the type is 64 bit or not */
> > +    if (!(s->msi->flags & PCI_MSI_FLAGS_64BIT)) {
> > +        /* exit I/O emulator */
> > +        PT_LOG("Error: why comes to Upper Address without 64 bit 
> > support??\n");
>
> Um, not sure what that means.

This is probably unprobable.

I'll change the comment for "write to the Upper Address without 64 bit
support"

> > +        return -1;
> > +    }
> > +
> > +    /* modify emulate register */
> > +    writable_mask = reg->emu_mask & ~reg->ro_mask & valid_mask;
> > +    cfg_entry->data = PT_MERGE_VALUE(*value, cfg_entry->data, 
> > writable_mask);
> > +    /* update the msi_info too */
> > +    s->msi->addr_hi = cfg_entry->data;
> > +
> > +    /* create value for writing to I/O device register */
> > +    throughable_mask = ~reg->emu_mask & valid_mask;
> > +    *value = PT_MERGE_VALUE(*value, dev_value, throughable_mask);
> > +
> > +    /* update MSI */
> > +    if (cfg_entry->data != old_addr) {
> > +        if (s->msi->flags & PT_MSI_FLAG_MAPPED) {
> > +            pt_msi_update(s);
> > +        }
> > +    }
> > +
> > +    return 0;
> > +}
> > +
> > +
> > +/* this function will be called twice (for 32 bit and 64 bit type) */
> > +/* write Message Data register */
> > +static int pt_msgdata_reg_write(XenPCIPassthroughState *s, XenPTReg 
> > *cfg_entry,
> > +                                uint16_t *value, uint16_t dev_value,
> > +                                uint16_t valid_mask)
> > +{
> > +    XenPTRegInfo *reg = cfg_entry->reg;
> > +    uint16_t writable_mask = 0;
> > +    uint16_t throughable_mask = 0;
> > +    uint16_t old_data = cfg_entry->data;
> > +    uint32_t flags = s->msi->flags;
> > +    uint32_t offset = reg->offset;
> > +
> > +    /* check the offset whether matches the type or not */
> > +    if (!((offset == PCI_MSI_DATA_64) &&  (flags & PCI_MSI_FLAGS_64BIT)) &&
> > +        !((offset == PCI_MSI_DATA_32) && !(flags & PCI_MSI_FLAGS_64BIT))) {
> > +        /* exit I/O emulator */
> > +        PT_LOG("Error: the offset is not match with the 32/64 bit 
> > type!!\n");
>
> I think it means: "The offset does not match the 32/64 bit type"
>
> > +        return -1;
> > +    }
> > +
> > +    /* modify emulate register */
> > +    writable_mask = reg->emu_mask & ~reg->ro_mask & valid_mask;
> > +    cfg_entry->data = PT_MERGE_VALUE(*value, cfg_entry->data, 
> > writable_mask);
> > +    /* update the msi_info too */
> > +    s->msi->data = cfg_entry->data;
> > +
> > +    /* create value for writing to I/O device register */
> > +    throughable_mask = ~reg->emu_mask & valid_mask;
> > +    *value = PT_MERGE_VALUE(*value, dev_value, throughable_mask);
> > +
> > +    /* update MSI */
> > +    if (cfg_entry->data != old_data) {
> > +        if (flags & PT_MSI_FLAG_MAPPED) {
> > +            pt_msi_update(s);
> > +        }
> > +    }
> > +
> > +    return 0;
> > +}

[...]

> >  /****************************
> >   * Capabilities
> > @@ -1664,6 +2080,48 @@ static uint8_t 
> > pt_pcie_size_init(XenPCIPassthroughState *s,
> >
> >      return pcie_size;
> >  }
> > +/* get MSI Capability Structure register group size */
> > +static uint8_t pt_msi_size_init(XenPCIPassthroughState *s,
> > +                                const XenPTRegGroupInfo *grp_reg,
> > +                                uint32_t base_offset)
> > +{
> > +    PCIDevice *d = &s->dev;
> > +    uint16_t msg_ctrl = 0;
> > +    uint8_t msi_size = 0xa;
> > +
> > +    msg_ctrl = pci_get_word(d->config + (base_offset + PCI_MSI_FLAGS));
> > +
> > +    /* check 64 bit address capable & Per-vector masking capable */
>
> ehh?

Precisely!

> > +    if (msg_ctrl & PCI_MSI_FLAGS_64BIT) {
> > +        msi_size += 4;
> > +    }
> > +    if (msg_ctrl & PCI_MSI_FLAGS_MASKBIT) {
> > +        msi_size += 10;
> > +    }
> > +
> > +    s->msi = g_malloc0(sizeof (XenPTMSI));
> > +    s->msi->pirq = -1;
>
> Is there a define for this -1?

Probably not.

What about PT_UNASSIGNED_MSI_PIRQ ?

> > +    PT_LOG("done\n");
> > +
> > +    return msi_size;
> > +}

[...]

> > @@ -1908,8 +2382,11 @@ static int pt_init_pci_config(XenPCIPassthroughState 
> > *s)
> >      /* reinitialize all emulate register */
> >      pt_config_reinit(s);
> >
> > +    /* setup MSI-INTx translation if support */
> > +    ret = pt_enable_msi_translate(s);
> > +
> >      /* rebind machine_irq to device */
> > -    if (s->machine_irq != 0) {
> > +    if (ret < 0 && s->machine_irq != 0) {
>
> So can machine_irq be -1? Or is it only pirq that can be -1?

I think only pirq can be -1. And the default value of machine_irq is 0.

At least, the comment on top of xen_pci_passthrough.c says the same
thing.

>
> >          uint8_t e_device = PCI_SLOT(s->dev.devfn);
> >          uint8_t e_intx = pci_intx(s);
> >
> > @@ -2043,6 +2520,14 @@ void pt_config_delete(XenPCIPassthroughState *s)
> >      struct XenPTRegGroup *reg_group, *next_grp;
> >      struct XenPTReg *reg, *next_reg;
> >
> > +    /* free MSI/MSI-X info table */
> > +    if (s->msix) {
> > +        pt_msix_delete(s);
> > +    }
> > +    if (s->msi) {
> > +        g_free(s->msi);
> > +    }
> > +
> >      /* free Power Management info table */
> >      if (s->pm_state) {
> >          if (s->pm_state->pm_timer) {

[...]

> > +/*********************************/
> > +/* MSI virtuailization functions */
>
>
> virtualization
> > +
> > +/*
> > + * setup physical msi, but didn't enable it
>
> but don't
>
> > + */
> > +int pt_msi_setup(XenPCIPassthroughState *s)
> > +{
> > +    int pirq = -1;
> > +    uint8_t gvec = 0;
> > +
> > +    if (!(s->msi->flags & PT_MSI_FLAG_UNINIT)) {
> > +        PT_LOG("Error: setup physical after initialized??\n");
>
> I am not sure what that says.

Someone eats some words :(.

I thinks the comment come from this function: pt_msgctrl_reg_write.
pt_msgctrl_reg_write do the setup on the emulation side, and call
pt_msi_setup, and unset PT_MSI_FLAG_UNINIT. (this flags is only internal
to emulator)

I supose this prevent the function to been called to many times
(probably by the guest).

So, maybe "setup physical MSI when it's already initialized" would be a
better log.

> > +        return -1;
> > +    }
> > +
> > +    gvec = s->msi->data & 0xFF;
> > +    if (!gvec) {
> > +        /* if gvec is 0, the guest is asking for a particular pirq that
> > +         * is passed as dest_id */
> > +        pirq = (s->msi->addr_hi & 0xffffff00) |
> > +               ((s->msi->addr_lo >> MSI_ADDR_DEST_ID_SHIFT) & 0xff);
> > +        if (!pirq) {
> > +            /* this probably identifies an misconfiguration of the guest,
> > +             * try the emulated path */
> > +            pirq = -1;
> > +        } else {
> > +            PT_LOG("pt_msi_setup requested pirq = %d\n", pirq);
> > +        }
> > +    }
> > +
> > +    if (xc_physdev_map_pirq_msi(xen_xc, xen_domid, AUTO_ASSIGN, &pirq,
> > +                                PCI_DEVFN(s->real_device->dev,
> > +                                          s->real_device->func),
> > +                                s->real_device->bus, 0, 0)) {
> > +        PT_LOG("Error: Mapping of MSI failed.\n");
>
> Give more details. As in what device failed. PErhaps even the return code?

Ok.

> > +        return -1;
> > +    }
> > +
> > +    if (pirq < 0) {
> > +        PT_LOG("Error: Invalid pirq number\n");
> > +        return -1;
> > +    }
> > +
> > +    s->msi->pirq = pirq;
> > +    PT_LOG("msi mapped with pirq %x\n", pirq);
> > +
> > +    return 0;
> > +}
> > +

[...]

> > +/*********************************/
> > +/* MSI-X virtulization functions */
>
>
> virtu...
>
> > +
> > +static void mask_physical_msix_entry(XenPCIPassthroughState *s,
> > +                                     int entry_nr, int mask)
> > +{
> > +    void *phys_off;
> > +
> > +    phys_off = s->msix->phys_iomem_base + 16 * entry_nr + 12;
> > +    *(uint32_t *)phys_off = mask;
> > +}
> > +
> > +static int pt_msix_update_one(XenPCIPassthroughState *s, int entry_nr)
> > +{
> > +    XenMSIXEntry *entry = &s->msix->msix_entry[entry_nr];
> > +    int pirq = entry->pirq;
> > +    int gvec = entry->io_mem[2] & 0xff;
> > +    uint64_t gaddr = *(uint64_t *)&entry->io_mem[0];
> > +    uint32_t gflags = __get_msi_gflags(entry->io_mem[2], gaddr);
> > +    int ret;
> > +
> > +    if (!entry->flags) {
> > +        return 0;
> > +    }
> > +
> > +    if (!gvec) {
> > +        /* if gvec is 0, the guest is asking for a particular pirq that
> > +         * is passed as dest_id */
> > +        pirq = ((gaddr >> 32) & 0xffffff00) |
> > +               (((gaddr & 0xffffffff) >> MSI_ADDR_DEST_ID_SHIFT) & 0xff);
> > +        if (!pirq) {
> > +            /* this probably identifies an misconfiguration of the guest,
> > +             * try the emulated path */
> > +            pirq = -1;
> > +        } else {
> > +            PT_LOG("pt_msix_update_one requested pirq = %d\n", pirq);
>
> This is the same code as in the MSI case. Could it be coalesced ?

I can try.


[...]

> > +void pt_msix_disable(XenPCIPassthroughState *s)
> > +{
> > +    PCIDevice *d = &s->dev;
> > +    uint8_t gvec = 0;
> > +    uint32_t gflags = 0;
> > +    uint64_t addr = 0;
> > +    int i = 0;
> > +    XenMSIXEntry *entry = NULL;
> > +
> > +    msix_set_enable(s, 0);
> > +
> > +    for (i = 0; i < s->msix->total_entries; i++) {
> > +        entry = &s->msix->msix_entry[i];
> > +
> > +        if (entry->pirq == -1) {
> > +            continue;
> > +        }
> > +
> > +        gvec = entry->io_mem[2] & 0xff;
> > +        addr = *(uint64_t *)&entry->io_mem[0];
> > +        gflags = __get_msi_gflags(entry->io_mem[2], addr);
> > +
> > +        PT_LOG("Unbind msix with pirq %x, gvec %x\n",
> > +                entry->pirq, gvec);
> > +
> > +        if (xc_domain_unbind_msi_irq(xen_xc, xen_domid, gvec,
> > +                                        entry->pirq, gflags)) {
> > +            PT_LOG("Error: Unbinding of MSI-X failed. [%02x:%02x.%x]\n",
> > +                   pci_bus_num(d->bus), PCI_SLOT(d->devfn),
> > +                   PCI_FUNC(d->devfn));
> > +        } else {
> > +            PT_LOG("Unmap msix with pirq %x\n", entry->pirq);
> > +
> > +            if (xc_physdev_unmap_pirq(xen_xc, xen_domid, entry->pirq)) {
> > +                PT_LOG("Error: Unmapping of MSI-X failed. 
> > [%02x:%02x.%x]\n",
> > +                       pci_bus_num(d->bus),
> > +                       PCI_SLOT(d->devfn), PCI_FUNC(d->devfn));
>
> There is a lot of those error reporting where the pci_bus_num, PCI_SLOT, etc
> are used. Perhaps this should be in a function?

Yes, that will help to have a better reporting.

> > +            }
> > +        }
> > +        /* clear msi-x info */
> > +        entry->pirq = -1;
> > +        entry->flags = 0;
> > +    }
> > +}
> > +
> > +int pt_msix_update_remap(XenPCIPassthroughState *s, int bar_index)
> > +{
> > +    XenMSIXEntry *entry;
> > +    int i, ret;
> > +
> > +    if (!(s->msix && s->msix->bar_index == bar_index)) {
> > +        return 0;
> > +    }
> > +
> > +    for (i = 0; i < s->msix->total_entries; i++) {
> > +        entry = &s->msix->msix_entry[i];
> > +        if (entry->pirq != -1) {
> > +            ret = xc_domain_unbind_pt_irq(xen_xc, xen_domid, entry->pirq,
> > +                                          PT_IRQ_TYPE_MSI, 0, 0, 0, 0);
> > +            if (ret) {
> > +                PT_LOG("Error: unbind MSI-X entry %d failed\n", 
> > entry->pirq);
> > +            }
> > +            entry->flags = 1;
> > +        }
> > +    }
> > +    pt_msix_update(s);
> > +
> > +    return 0;
> > +}
> > +
> > +static void pci_msix_invalid_write(void *opaque, target_phys_addr_t addr,
> > +                                   uint32_t val)
> > +{
> > +    PT_LOG("Error: Invalid write to MSI-X table,"
> > +           " only dword access is allowed.\n");
> > +}
> > +
> > +static void pci_msix_writel(void *opaque, target_phys_addr_t addr,
> > +                            uint32_t val)
> > +{
> > +    XenPCIPassthroughState *s = (XenPCIPassthroughState *)opaque;
> > +    XenPTMSIX *msix = s->msix;
> > +    XenMSIXEntry *entry;
> > +    int entry_nr, offset;
> > +    void *phys_off;
> > +    uint32_t vec_ctrl;
> > +
> > +    if (addr % 4) {
> > +        PT_LOG("Error: Unaligned dword access to MSI-X table, "
> > +                "addr %016"PRIx64"\n", addr);
> > +        return;
> > +    }
> > +
> > +    PT_LOG("addr: "TARGET_FMT_plx", val: %#x\n", addr, val);
>
> Huh?

I will remove this one.

> > +
> > +    entry_nr = addr / 16;
> > +    entry = &msix->msix_entry[entry_nr];
> > +    offset = (addr % 16) / 4;
> > +
> > +    /*
> > +     * If Xen intercepts the mask bit access, io_mem[3] may not be
> > +     * up-to-date. Read from hardware directly.
> > +     */
> > +    phys_off = s->msix->phys_iomem_base + 16 * entry_nr + 12;
> > +    vec_ctrl = *(uint32_t *)phys_off;
> > +
> > +    if (offset != 3 && msix->enabled && !(vec_ctrl & 0x1)) {
> > +        PT_LOG("Error: Can't update msix entry %d since MSI-X is already "
> > +                "function.\n", entry_nr);
>
> already function? already on? active?

Probably.

But I don't know what it is check here.

> > +        return;
> > +    }
> > +
> > +    if (offset != 3 && entry->io_mem[offset] != val) {
> > +        entry->flags = 1;
> > +    }
> > +    entry->io_mem[offset] = val;
> > +
> > +    if (offset == 3) {
> > +        if (msix->enabled && !(val & 0x1)) {
> > +            pt_msix_update_one(s, entry_nr);
> > +        }
> > +        mask_physical_msix_entry(s, entry_nr, entry->io_mem[3] & 0x1);
> > +    }
> > +}

Thanks,

-- 
Anthony PERARD

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel

<Prev in Thread] Current Thread [Next in Thread>