
Re: [Xen-devel] [PATCH v6 1/3] xen/pt: fix some pass-thru devices don't work across reboot



On Fri, Jan 25, 2019 at 07:36:36PM +0800, Chao Gao wrote:
> On Fri, Jan 25, 2019 at 10:36:13AM +0100, Roger Pau Monné wrote:
> >Thanks for the patch!
> >
> >On Fri, Jan 25, 2019 at 04:26:59PM +0800, Chao Gao wrote:
> >> I find that some pass-thru devices don't work any more across guest
> >> reboot. Assigning such a device to another domain hits the same issue,
> >> and the only way to make it work again is unbinding and re-binding it
> >> to pciback. Someone reported this issue one year ago [1].
> >> 
> >> If the device's driver doesn't disable MSI-X during shutdown, or qemu is
> >> killed/crashed before the domain shuts down, this domain's pirqs won't be
> >> unmapped. Then Xen takes over this work, unmapping all pirq-s, when
> >> destroying the guest. But as pciback has already disabled memory decoding before
> >> xen unmapping pirq, Xen has to sets the host_maskall flag and maskall bit
> >                                 ^ set
> >> to mask a MSI rather than sets maskbit in MSI-x table. The call trace of
> >                            ^ setting
> >> this process is:
> >> 
> >> ->arch_domain_destroy
> >>     ->free_domain_pirqs
> >>         ->unmap_domain_pirq (if pirq isn't unmapped by qemu)
> >>             ->pirq_guest_force_unbind
> >>                 ->__pirq_guest_unbind
> >>                     ->mask_msi_irq(=desc->handler->disable())
> >>                         ->the warning in msi_set_mask_bit()
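
(For reference, roughly the shape of the fallback being described; this is
a sketch only, loosely based on the MSI-X handling in xen/arch/x86/msi.c.
Names like memory_decoded() and msix->host_maskall do exist there, but the
real msi_set_mask_bit() differs in detail.)

    if ( likely(memory_decoded(pdev)) )
    {
        /* Normal path: mask the single vector in the MSI-X table. */
        writel(flag, entry->mask_base + PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET);
    }
    else if ( flag )
    {
        /*
         * pciback has already turned off memory decoding, so the MSI-X
         * table can't be written any more.  The only remaining way to
         * mask the interrupt is the function-wide MASKALL bit, recorded
         * in host_maskall so the guest can't clear it again; this is
         * where the warning in the call trace above is emitted.
         */
        pdev->msix->host_maskall = 1;
        pci_conf_write16(seg, bus, slot, func,
                         msix_control_reg(entry->msi_attrib.pos),
                         control | PCI_MSIX_FLAGS_MASKALL);
    }
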
> >> 
> >> The host_maskall bit will prevent guests from clearing the maskall bit
> >> even the device is assigned to another guest later. Then guests cannot
> >> receive MSIs from this device.
> >> 
> >> To fix this issue, a pirq is unmapped before memory decoding is disabled by
> >> pciback. Specifically, when a device is detached from a guest, all established
> >> mappings between pirq and msi are destroying before changing the ownership.
> >                                    ^ destroyed
> >> 
> >> With this behavior, qemu and pciback are not aware of the forcibly unbindng
> >                                                                     ^ unbinding
> >> and unmapping done by Xen. As a result, the state of pirq maintained by Xen and
> >> pciback/qemu becomes inconsistent. Particularly for hot-plug/hot-unplug case,
> >> guests stay alive; such inconsistency may cause other issues. To resolve
> >> this inconsistency and keep compatibility with current qemu and pciback,
> >> two flags, force_unmapped and force_unbound are used to denote that a pirq is
> >> forcibly unmapped or unbound. The flags are set when Xen unbinds or unmaps the
> >> pirq behind qemu and pciback. And subsequent unbinding or unmapping requests
> >> from qemu/pciback can clear these flags and free the pirq.
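
(Purely as a hypothetical illustration of the bookkeeping described above,
not the actual patch: the flag names come from the description, while the
field placement and the surrounding code are invented. The idea is that a
later unmap request from qemu/pciback simply completes the cleanup.)

    /* Hypothetical: early in unmap_domain_pirq(), before the usual checks. */
    if ( info->arch.force_unmapped )
    {
        /*
         * Xen already unmapped this pirq when it destroyed or detached
         * the device; qemu/pciback is only now catching up.  Clear the
         * flag, free the pirq and report success, so their view of the
         * pirq state stays consistent.
         */
        info->arch.force_unmapped = false;
        pirq_cleanup_check(info, d);
        return 0;
    }
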
> >
> >What happens then if qemu/pciback doesn't unbind and/or unmap the
> >pirqs? Would they be left in a weird state that prevents further
> >mapping or binding?
> 
> Yes.

Then I think I would prefer the previous version, with the return value of
the unmap_domain_pirq check "if ( !info || (irq = info->arch.irq) <= 0 )"
adjusted to ESRCH. It's less convoluted, and the Linux message in that
case is just informative.
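
(A minimal sketch of that adjustment, assuming the check in question
currently fails with -EINVAL in xen/arch/x86/irq.c:unmap_domain_pirq() and
abbreviating the surrounding context.)

    info = pirq_info(d, pirq);
    if ( !info || (irq = info->arch.irq) <= 0 )
    {
        dprintk(XENLOG_G_ERR, "dom%d: pirq %d not mapped\n",
                d->domain_id, pirq);
        /*
         * Previously -EINVAL.  Returning -ESRCH lets callers such as
         * Linux treat an already-unmapped pirq as informational rather
         * than as an error.
         */
        ret = -ESRCH;
        goto done;
    }
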

> >
> >I think this is getting quite convoluted, and would like to make sure
> >it is necessary. The last version triggered some error messages in
> >Linux due to the unbind/unmap being performed by the hypervisor, but
> >those were harmless?
> 
> Yes. I didn't see anything harmful.
> 
> >
> >I've also suggested returning ESRCH from unmap_domain_pirq when the
> >pirq is no longer mapped, which would make Linux print a less scary
> >message.
> 
> But with this version, qemu/pciback won't complain about anything.
> 
> The idea is inspired by the way we handle "force_unbind" in the current
> unmap_domain_pirq. Personally, I think this version is slightly better.
> But because it is more complicated and involves more changes, I am less
> confident in it.

Let's wait for Jan's opinion, but as said above I prefer the previous
version.

Thanks, Roger.
