Re: [Xen-devel] [PATCH v2] xen/pt: fix some pass-thru devices don't work across reboot



On Fri, Nov 16, 2018 at 02:59:41AM -0700, Jan Beulich wrote:
> >>> On 16.11.18 at 10:35, <roger.pau@xxxxxxxxxx> wrote:
> > On Fri, Nov 16, 2018 at 03:53:50PM +0800, Chao Gao wrote:
> >> On Thu, Nov 15, 2018 at 11:40:39AM +0100, Roger Pau Monné wrote:
> >> >On Thu, Nov 15, 2018 at 09:10:26AM +0800, Chao Gao wrote:
> >> >> +    if ( pdev && list_empty(&pdev->msi_list) && pdev->msix )
> >> >> +    {
> >> >> +        if ( pdev->msix->host_maskall )
> >> >> +            printk(XENLOG_G_WARNING
> >> >> +                   "Resetting msix status of %04x:%02x:%02x.%u\n",
> >> >> +                   pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
> >> >> +                   PCI_FUNC(pdev->devfn));
> >> >> +        pdev->msix->host_maskall = false;
> >> >> +        pdev->msix->warned = DOMID_INVALID;
> > 
> > AFAICT a guest could trigger this message multiple times by forcing a
> > PIRQ map/unmap of all the vectors in MSIX, thus likely flooding the
> > console since this is not rate limited. Since I think a guest can
> > manage to reach this code path while running, clearing warned is not
> > correct.
> 
> Did you overlook the _G_ infix? That guarantees rate limiting, unless
> the admin specified a non-default "guest_loglvl=".

Right, I tend to use the gprintk variant and I've indeed overlooked
the _G_.

> > Also, if a guest can manage to trigger this path during its runtime,
> > could it also hit the issue of getting host_maskall set and not being
> > able to clear it?
> 
> But _can_ a guest trigger this path? So far I didn't think it can.

AFAICT (and I might have missed something) a guest can trigger the
execution of unmap_domain_pirq, which ends up calling msi_free_irq, by
enabling and then disabling MSIX after having set up some vectors. This
is the trace from QEMU and Xen:

xen_pt_msixctrl_reg_write
    xen_pt_msix_disable
        msi_msix_disable
            xc_physdev_unmap_pirq
                -> PHYSDEVOP_unmap_pirq hypercall
                    physdev_unmap_pirq
                        unmap_domain_pirq
                            msi_free_irq

Given this I would only clear host_maskall in msi_free_irq if the
domain is being destroyed (d->is_shutting_down), or even better I
would consider using something like PHYSDEVOP_prepare_msix in order to
reset Xen's internal MSI state after device reset.
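
To make the first option concrete, here is a minimal sketch of such a
guard. It is not a patch against Xen: the structures are cut-down
stand-ins, msix_reset_on_teardown() is a hypothetical helper name, and
the msi_list_empty flag stands in for list_empty(&pdev->msi_list).

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins for illustration; not Xen's real definitions. */
typedef uint16_t domid_t;
/* Assumed to mirror the value in Xen's public headers. */
#define DOMID_INVALID ((domid_t)0x7FF4U)

struct domain {
    bool is_shutting_down;   /* field named in the mail above */
};

struct arch_msix {
    bool host_maskall;       /* Xen's cached view of the MASKALL bit */
    domid_t warned;          /* domain already warned about masking */
};

/*
 * Hypothetical helper: reset the cached MSI-X state only when the
 * owning domain is being torn down. Returns true if state was reset.
 */
bool msix_reset_on_teardown(const struct domain *d,
                            struct arch_msix *msix,
                            bool msi_list_empty)
{
    /* Nothing to do without MSI-X state or while entries remain. */
    if ( !msix || !msi_list_empty )
        return false;

    /* Runtime unmap triggered by the guest: leave the state alone. */
    if ( !d->is_shutting_down )
        return false;

    msix->host_maskall = false;
    msix->warned = DOMID_INVALID;
    return true;
}
```

With this shape a guest toggling MSIX at runtime cannot clear warned or
host_maskall; only the teardown path does.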

> >> >In any case there should be at least a note here pointing out that Xen
> >> >expects the hardware domain to perform a device reset, so the Xen
> >> >internal state actually matches the device state before trying to
> >> >assign the device to another guest.
> >> 
> >> Sounds good. The issue is that Xen tries to mask MSI (when unmapping pirq)
> >> after memory decoding is disabled by pciback. If pciback can unmap all the
> >> pirq-s related to a given device before disabling memory decoding, Xen won't
> >> meet this issue. Currently, pciback doesn't maintain the pirq information; it
> >> isn't able to do this.
> > 
> > I would like to hear Jan's opinion on this, but I think it might be
> > helpful to introduce a new hypercall Dom0 (ie: toolstack) can use to
> > signal Xen a PCI device has been reset, so that Xen can safely reset
> > the device state to the initial one. This would be simpler if Xen was
> > the one performing the device reset.
> 
> Such a notification might be helpful, if it can't be expressed via the
> existing PHYSDEVOP_{prepare,release}_msix. For the moment I can't
> see though why these two would be insufficient.

I think using PHYSDEVOP_{prepare,release}_msix should be enough, since
it will reset host_maskall by calling msix_capability_init.
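
As a rough illustration of that flow, the toy model below shows stale
state left behind by a guest being cleared when Dom0 re-prepares the
device after a reset. Everything prefixed toy_ is hypothetical and only
models the behaviour described above; it is not Xen's implementation.

```c
#include <stdbool.h>

/* Toy model for illustration; none of this is Xen's real code. */
struct toy_msix {
    bool host_maskall;      /* Xen-side view of the MASKALL bit */
    unsigned int resets;    /* counts re-initialisations, for the example */
};

/*
 * Stand-in for the re-initialisation this mail attributes to
 * msix_capability_init(): software state returns to its initial value.
 */
void toy_msix_capability_init(struct toy_msix *msix)
{
    msix->host_maskall = false;
    msix->resets++;
}

/* Stand-in for Dom0 issuing PHYSDEVOP_prepare_msix after a device reset. */
void toy_prepare_msix(struct toy_msix *msix)
{
    toy_msix_capability_init(msix);
}
```

The point is only the ordering: device reset first, then the prepare
call, so that Xen's view matches the hardware again before the device
is assigned to another guest.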

Roger.
