
Re: [Xen-devel] [PATCH v5 1/3] xen/pt: fix some pass-thru devices don't work across reboot



On Tue, Jan 22, 2019 at 01:50:20PM +0800, Chao Gao wrote:
> On Wed, Jan 16, 2019 at 11:38:23AM +0100, Roger Pau Monné wrote:
> >On Wed, Jan 16, 2019 at 04:17:30PM +0800, Chao Gao wrote:
> >> +        }
> >> +    }
> >> +    /*
> >> +     * All pirq-s should have been unmapped and corresponding msi_desc
> >> +     * entries should have been removed in the above loop.
> >> +     */
> >> +    ASSERT(list_empty(&pdev->msi_list));
> >> +
> >> +    spin_unlock(&d->event_lock);
> >> +}
> >> +
> >>  /* caller should hold the pcidevs_lock */
> >>  int deassign_device(struct domain *d, u16 seg, u8 bus, u8 devfn)
> >>  {
> >> @@ -1529,6 +1591,8 @@ int deassign_device(struct domain *d, u16 seg, u8 bus, u8 devfn)
> >>      if ( !pdev )
> >>          return -ENODEV;
> >>  
> >> +    pci_unmap_msi(pdev);
> >
> >Just want to make sure, since deassign_device will be called for both
> >PV and HVM domains. AFAICT pci_unmap_msi is safe to call when the
> >device is assigned to a PV guest, but would like your confirmation.
> 
> I tested with a PV guest loaded by Pygrub. The PV guest doesn't
> suffer from the MSI-X issue I want to fix.
> 
> With these three patches applied, I got some error messages from Xen
> and Dom0, as follows:
> 
> (XEN) irq.c:2176: dom3: forcing unbind of pirq 332
> (XEN) irq.c:2176: dom3: forcing unbind of pirq 331
> (XEN) irq.c:2176: dom3: forcing unbind of pirq 328
> (XEN) irq.c:2148: dom3: pirq 359 not mapped
> [ 2887.067685] xen:events: unmap irq failed -22
> (XEN) irq.c:2148: dom3: pirq 358 not mapped
> [ 2887.075917] xen:events: unmap irq failed -22
> (XEN) irq.c:2148: dom3: pirq 357 not mapped
> 
> It seems the cause of these errors is that pirq-s are unmapped and
> forcibly unbound on deassignment, so the subsequent pirq unmaps issued
> by dom0 fail. In some respects, these errors are expected: with this
> patch, pirq-s are expected to be unmapped by qemu or the dom0 kernel
> (in the PV case) before deassignment, and mapping/binding a pirq after
> deassignment should fail.
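
To illustrate the sequence described above, here is a sketch from the
dom0 side (hypothetical domid/pirq values; the hypercall and structure
come from Xen's public physdev interface, and the -EINVAL (-22) matches
the log lines quoted above):

    /*
     * Sketch only: pci_unmap_msi() has already torn down the mapping
     * during deassignment, so a later PHYSDEVOP_unmap_pirq issued by
     * dom0 reaches unmap_domain_pirq(), finds no pirq_info, and
     * currently fails with -EINVAL (-22), hence "unmap irq failed -22".
     */
    struct physdev_unmap_pirq unmap = {
        .domid = domid,   /* the guest the device was deassigned from */
        .pirq  = pirq,    /* already unmapped by pci_unmap_msi() */
    };
    rc = HYPERVISOR_physdev_op(PHYSDEVOP_unmap_pirq, &unmap);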

This is quite entangled because it involves Xen, libxl and pciback.

AFAICT libxl will already try to unmap the pirqs before deassigning
the device if the domain is PV, see do_pci_remove in libxl_pci.c and
the calls it makes to xc_physdev_unmap_pirq.
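
Roughly, that PV path in do_pci_remove looks like this (a paraphrased
sketch, not the verbatim source; error handling and variable
declarations are elided):

    /*
     * Paraphrased from the PV branch of do_pci_remove() in libxl_pci.c:
     * read the device's IRQ from sysfs, unmap the pirq in Xen and
     * revoke the guest's access to it, before the device is deassigned.
     */
    sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/irq", pcidev->domain,
                           pcidev->bus, pcidev->dev, pcidev->func);
    f = fopen(sysfs_path, "r");
    if (f && fscanf(f, "%u", &irq) == 1 && irq) {
        xc_physdev_unmap_pirq(ctx->xch, domid, irq);        /* unmap in Xen */
        xc_domain_irq_permission(ctx->xch, domid, irq, 0);  /* revoke access */
    }
    if (f)
        fclose(f);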

Which makes me wonder: do you see those messages about pirq unmap
failures even without this patch applied?

> So what's your opinion on how to handle these errors? Should we find
> another way to fix the MSI-X issue that avoids them, or suppress the
> errors in qemu and the Linux kernel?

Regardless of the reply to the question above, I think
unmap_domain_pirq should return -ESRCH if the pirq cannot be found,
as in the patch below. That would turn the Linux kernel messages into
less scary info messages, like:

"domain %d does not have %d anymore"

which seems more accurate.
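
For reference, that message comes from the PHYSDEVOP_unmap_pirq error
handling in xen_destroy_irq() in Linux's
drivers/xen/events/events_base.c, roughly (paraphrased, not verbatim):

    rc = HYPERVISOR_physdev_op(PHYSDEVOP_unmap_pirq, &unmap_irq);
    /*
     * -ESRCH from Xen means the pirq mapping is already gone; Linux
     * downgrades that to an info message instead of a warning.
     */
    if (rc == -ESRCH && info->u.pirq.domid != DOMID_SELF)
        pr_info("domain %d does not have %d anymore\n",
                info->u.pirq.domid, info->u.pirq.pirq);
    else if (rc)
        pr_warn("unmap irq failed %d\n", rc);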

Thanks, Roger.

---8<---
diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
index 23b4f423e6..7e9c974ba1 100644
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -2144,9 +2144,9 @@ int unmap_domain_pirq(struct domain *d, int pirq)
     info = pirq_info(d, pirq);
     if ( !info || (irq = info->arch.irq) <= 0 )
     {
-        dprintk(XENLOG_G_ERR, "dom%d: pirq %d not mapped\n",
+        dprintk(XENLOG_G_INFO, "dom%d: pirq %d not mapped\n",
                 d->domain_id, pirq);
-        ret = -EINVAL;
+        ret = -ESRCH;
         goto done;
     }
 

