Xen project Mailing List

Re: [Xen-devel] [PATCH v5 11/11] vpci/msix: add MSI-X handlers

To: "Roger Pau Monne" <roger.pau@xxxxxxxxxx>

From: "Jan Beulich" <JBeulich@xxxxxxxx>

Date: Thu, 07 Sep 2017 10:12:59 -0600

Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, boris.ostrovsky@xxxxxxxxxx, xen-devel@xxxxxxxxxxxxxxxxxxxx

Delivery-date: Thu, 07 Sep 2017 16:13:06 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

>>> On 14.08.17 at 16:28, <roger.pau@xxxxxxxxxx> wrote: > +void vpci_msix_arch_mask(struct vpci_arch_msix_entry *arch, > + struct pci_dev *pdev, bool mask) > +{ > + if ( arch->pirq == INVALID_PIRQ ) > + return; How come no similar guard is needed in vpci_msi_arch_mask()? > +int vpci_msix_arch_enable(struct vpci_arch_msix_entry *arch, Considering the first parameter's type, is the name actually appropriate? From the parameter (and the code) I would conclude you enable a single entry here only, not the entire device. This may apply to functions further below, too. > + struct pci_dev *pdev, uint64_t address, > + uint32_t data, unsigned int entry_nr, > + paddr_t table_base) > +{ > + struct domain *d = pdev->domain; > + struct msi_info msi_info = { > + .seg = pdev->seg, > + .bus = pdev->bus, > + .devfn = pdev->devfn, > + .table_base = table_base, > + .entry_nr = entry_nr, > + }; > + xen_domctl_bind_pt_irq_t bind = { > + .irq_type = PT_IRQ_TYPE_MSI, > + .u.msi.gvec = msi_vector(data), > + .u.msi.gflags = msi_flags(data, address), > + }; > + int rc; > + > + ASSERT(arch->pirq == INVALID_PIRQ); > + > + /* > + * Simple sanity check before trying to setup the interrupt. > + * According to the Intel SDM, bits [31, 20] must contain the > + * value 0xfee. This avoids needlessly setting up pirqs for entries > + * the guest has not actually configured. > + */ > + if ( (address & 0xfff00000) != MSI_ADDR_HEADER ) > + return -EINVAL; What about the upper half of the address? That needs to be zero, I think. > + rc = allocate_and_map_msi_pirq(d, -1, &arch->pirq, > + MAP_PIRQ_TYPE_MSI, &msi_info); > + if ( rc ) > + { > + gdprintk(XENLOG_ERR, > + "%04x:%02x:%02x.%u: unable to map MSI-X PIRQ entry %u: > %d\n", > + pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn), > + PCI_FUNC(pdev->devfn), entry_nr, rc); > + return rc; > + } > + > + bind.machine_irq = arch->pirq; > + pcidevs_lock(); > + rc = pt_irq_create_bind(d, &bind); > + if ( rc ) > + { > + gdprintk(XENLOG_ERR, > + "%04x:%02x:%02x.%u: unable to create MSI-X bind %u: %d\n", > + pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn), > + PCI_FUNC(pdev->devfn), entry_nr, rc); > + spin_lock(&d->event_lock); > + unmap_domain_pirq(d, arch->pirq); > + spin_unlock(&d->event_lock); > + pcidevs_unlock(); > + arch->pirq = INVALID_PIRQ; > + return rc; > + } > + pcidevs_unlock(); > + > + return 0; > +} Much of this looks pretty similar to the MSI counterpart. Can any reasonable portion of this be factored out? > --- a/xen/drivers/vpci/header.c > +++ b/xen/drivers/vpci/header.c > @@ -20,6 +20,7 @@ > #include <xen/sched.h> > #include <xen/vpci.h> > #include <xen/p2m-common.h> > +#include <asm/p2m.h> No need to include xen/p2m-common.h then anymore, although for a common source file it's probably not entirely right to include this header at all. > @@ -89,11 +90,45 @@ static int vpci_map_range(unsigned long s, unsigned long > e, void *data) > return modify_mmio(map->d, _gfn(s), _mfn(s), e - s + 1, map->map); > } > > +static int vpci_unmap_msix(struct domain *d, struct vpci_msix_mem *msix) > +{ > + unsigned long gfn; > + > + for ( gfn = PFN_DOWN(msix->addr); gfn <= PFN_UP(msix->addr + msix->size); Missing a subtraction of 1 in the PFN_UP() invocation? And why <= ? > @@ -102,6 +137,42 @@ static int vpci_modify_bar(struct domain *d, const > struct vpci_bar *bar, > if ( IS_ERR(mem) ) > return PTR_ERR(mem); > > + for ( i = 0; i < ARRAY_SIZE(bar->msix); i++ ) > + { > + struct vpci_msix_mem *msix = bar->msix[i]; > + > + if ( !msix || msix->addr == INVALID_PADDR ) > + continue; > + > + if ( map ) > + { > + /* > + * Make sure the MSI-X regions of the BAR are not mapped into the > + * domain p2m, or else the MSI-X handlers are useless. Only do > this > + * when mapping, since that's when the memory decoding on the > + * device is enabled. > + * > + * This is required because iommu_inclusive_mapping might have > + * mapped MSI-X regions into the guest p2m. > + */ > + rc = vpci_unmap_msix(d, msix); > + if ( rc ) > + { > + rangeset_destroy(mem); > + return rc; > + } > + } > + > + rc = rangeset_remove_range(mem, PFN_DOWN(msix->addr), > + PFN_DOWN(msix->addr + msix->size)); > + if ( rc ) > + { > + rangeset_destroy(mem); > + return rc; > + } > + > + } Why do you do this for the PBA regardless of whether it's shared with a table page? > @@ -255,6 +327,11 @@ static void vpci_bar_write(struct pci_dev *pdev, > unsigned int reg, > bar->addr &= ~(0xffffffffull << (hi ? 32 : 0)); > bar->addr |= (uint64_t)val << (hi ? 32 : 0); > > + /* Update any MSI-X areas contained in this BAR. */ > + for ( i = 0; i < ARRAY_SIZE(bar->msix); i++ ) > + if ( bar->msix[i] ) > + bar->msix[i]->addr = bar->addr + bar->msix[i]->offset; Is it really necessary to store this information separately? It can be re-calculated easily from BIR. > +#define MSIX_SIZE(num) offsetof(struct vpci_msix, entries[num]) Wouldn't this better be VMSIX_SIZE()? > +#define MSIX_ADDR_IN_RANGE(a, table) \ > + ((table)->addr != INVALID_PADDR && (a) >= (table)->addr && \ > + (a) < (table)->addr + (table)->size) > + > +static uint32_t vpci_msix_control_read(struct pci_dev *pdev, unsigned int > reg, > + const void *data) > +{ > + const struct vpci_msix *msix = data; > + uint16_t val; > + > + val = (msix->max_entries - 1) & PCI_MSIX_FLAGS_QSIZE; Is the explicit masking really necessary? And if so, it would look to be more correct to use MASK_INSR(). > +static struct vpci_msix *vpci_msix_find(struct domain *d, unsigned long addr) d can be const as it looks. > +static int vpci_msix_access_check(struct pci_dev *pdev, unsigned long addr, > + unsigned int len) > +{ > + uint8_t seg = pdev->seg, bus = pdev->bus; > + uint8_t slot = PCI_SLOT(pdev->devfn), func = PCI_FUNC(pdev->devfn); > + > + /* Only allow 32/64b accesses. */ > + if ( len != 4 && len != 8 ) > + { > + gdprintk(XENLOG_ERR, > + "%04x:%02x:%02x.%u: invalid MSI-X table access size: %u\n", > + seg, bus, slot, func, len); > + return -EINVAL; > + } > + > + /* Only allow aligned accesses. */ > + if ( (addr & (len - 1)) != 0 ) > + { > + gdprintk(XENLOG_ERR, > + "%04x:%02x:%02x.%u: MSI-X only allows aligned accesses\n", > + seg, bus, slot, func); > + return -EINVAL; > + } > + > + return 0; > +} Wouldn't this function better return bool? > +static int vpci_msix_read(struct vcpu *v, unsigned long addr, > + unsigned int len, unsigned long *data) > +{ > + struct domain *d = v->domain; > + struct vpci_msix *msix; > + const struct vpci_msix_entry *entry; > + unsigned int offset; > + > + vpci_rlock(d); > + msix = vpci_msix_find(d, addr); > + if ( !msix ) > + { > + vpci_runlock(d); > + *data = ~0ul; > + return X86EMUL_OKAY; > + } > + > + if ( vpci_msix_access_check(msix->pdev, addr, len) ) > + { > + vpci_runlock(d); > + *data = ~0ul; > + return X86EMUL_OKAY; > + } Please fold the two if()s. > + if ( MSIX_ADDR_IN_RANGE(addr, &msix->pba) ) > + { > + /* Access to PBA. */ > + switch ( len ) > + { > + case 4: > + *data = readl(addr); > + break; > + case 8: > + *data = readq(addr); > + break; This is strictly only valid for Dom0, so perhaps worth a comment. > + default: > + ASSERT_UNREACHABLE(); > + *data = ~0ul; > + break; > + } > + > + vpci_runlock(d); > + return X86EMUL_OKAY; > + } > + > + entry = vpci_msix_get_entry(msix, addr); > + offset = addr & (PCI_MSIX_ENTRY_SIZE - 1); > + > + switch ( offset ) > + { > + case PCI_MSIX_ENTRY_LOWER_ADDR_OFFSET: > + /* > + * NB: do explicit truncation to the size of the access. This > shouldn't > + * be required here, since the caller of the handler should already > + * take the appropriate measures to truncate the value before > returning > + * to the guest, but better be safe than sorry. > + */ > + *data = len == 8 ? entry->addr : (uint32_t)entry->addr; I don't see the need for it - if there's really a bug up the call stack, it'll be better to fix it there. There's no security implication here afaics. > --- a/xen/include/asm-x86/hvm/io.h > +++ b/xen/include/asm-x86/hvm/io.h > @@ -144,6 +144,25 @@ void vpci_msi_arch_init(struct vpci_arch_msi *arch); > void vpci_msi_arch_print(const struct vpci_arch_msi *arch, uint16_t data, > uint64_t addr); > > +/* Arch-specific MSI-X entry data for vPCI. */ > +struct vpci_arch_msix_entry { > + int pirq; > +}; > + > +/* Arch-specific vPCI MSI-X helpers. */ > +void vpci_msix_arch_mask(struct vpci_arch_msix_entry *arch, > + struct pci_dev *pdev, bool mask); > +int vpci_msix_arch_enable(struct vpci_arch_msix_entry *arch, > + struct pci_dev *pdev, uint64_t address, > + uint32_t data, unsigned int entry_nr, > + paddr_t table_base); > +int vpci_msix_arch_disable(struct vpci_arch_msix_entry *arch, > + struct pci_dev *pdev); > +int vpci_msix_arch_init(struct vpci_arch_msix_entry *arch); > +void vpci_msix_arch_print(const struct vpci_arch_msix_entry *entry, > + uint32_t data, uint64_t addr, bool masked, > + unsigned int pos); Actually such helpers, when they are supposed to be called from common code, and when it is not expected for them to be inlined, would better be declared in a common header, so they won't need repeating (and later updating) in multiple places. Obviously this applies to the earlier MSI ones too. Jan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx https://lists.xen.org/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.