
Re: [Xen-devel] [RFC Patch] xen/pt: Emulate FLR capability



On Fri, Sep 06, 2019 at 05:01:09PM +0800, Chao Gao wrote:
> On Thu, Aug 29, 2019 at 12:21:11PM +0200, Roger Pau Monné wrote:
> >On Thu, Aug 29, 2019 at 05:02:27PM +0800, Chao Gao wrote:
> >> Currently, for a HVM on Xen, no reset method is virtualized. So in a VM's
> >> perspective, assigned devices cannot be reset. But some devices rely on PCI
> >> reset to recover from hardware hangs. When being assigned to a VM, those
> >> devices cannot be reset and won't work any longer if a hardware hang
> >> occurs. We have to reboot VM to trigger PCI reset on host to recover the
> >> device.
> >>
> >> This patch exposes FLR capability to VMs if the assigned device can be
> >> reset on host. When VM initiates an FLR to a device, qemu cleans up the
> >> device state (including disabling of intx and/or MSI and unmapping BARs
> >> from guest, deleting emulated registers), then initiate PCI reset through
> >> 'reset' knob under the device's sysfs, finally initialize the device again.
> >
> >I think you likely need to deassign the device from the VM, perform
> >the reset, and then assign the device again, so that there's no Xen
> >internal state carried over prior to the reset?
> 
> Yes. It is the safest way. But here I want to present the feature as FLR
> (such that the device driver in guest can issue PCI reset whenever
> needed and no change is needed to device driver).  Current device
> deassignment notifies guest that the device is going to be removed

In which way does a guest get notified?

AFAICT XEN_DOMCTL_deassign_device doesn't do any kind of guest
notification, it just tears down the device.

> It is not the standard PCI reset. Is it possible to make guest unaware
> of the device deassignment to emulate a standard PCI reset?

That would be my expectation. Such deassignment/assignment should be
completely transparent from the guest's PoV. The reason I suggest doing
the reassignment is to ensure there's no device state carried over.

> In my mind,
> we can expose do_pci_remove/add to qemu or rewrite them in qemu (but
> don't remove the device from guest's PCI hierarchy). Do you think it is
> the right direction?

Doing all this cleanup without reassigning the device seems more
complicated and more likely to miss something that needs cleaning up,
IMO, but as long as you can guarantee there's no state carried over
from before the reset it should be fine.
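
A minimal sketch of the deassign / reset / reassign ordering, for
illustration only. The `deassign` and `assign` callbacks are hypothetical
placeholders for whatever the toolstack or qemu would actually do, and the
`sysfs_root` parameter exists only so the sketch can be run against a
scratch directory. The only real interface assumed is the Linux `reset`
attribute under /sys/bus/pci/devices/<BDF>/.

```python
import os
from typing import Callable


def sysfs_reset_path(bdf: str, sysfs_root: str = "/sys/bus/pci/devices") -> str:
    """Return the path of the sysfs 'reset' attribute for a PCI device."""
    return os.path.join(sysfs_root, bdf, "reset")


def reset_with_reassign(
    bdf: str,
    deassign: Callable[[str], None],  # hypothetical: detach device from the guest
    assign: Callable[[str], None],    # hypothetical: attach device to the guest again
    sysfs_root: str = "/sys/bus/pci/devices",
) -> None:
    """Deassign the device, reset it on the host, then assign it again."""
    path = sysfs_reset_path(bdf, sysfs_root)
    # The kernel only exposes 'reset' when it knows a reset method for the
    # device, so check before touching any guest-visible state.
    if not os.path.exists(path):
        raise FileNotFoundError("no reset method exposed for " + bdf)

    deassign(bdf)
    # Writing "1" asks the host kernel to reset the function. If this
    # raises, the device is deliberately left deassigned.
    with open(path, "w") as f:
        f.write("1")
    assign(bdf)
```

If the reset write fails the exception propagates and the device stays
deassigned, which seems the safer default for a device in an unknown state.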

I think you also need some dom0 cooperation for this, so that for
example the BARs are correctly re-positioned after the reset?
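
One way to check the BAR question is to compare the BAR registers before
and after the reset. The sketch below is an illustration under stated
assumptions, not something taken from the patch: it assumes a type 0
configuration header (six 32-bit BARs at offsets 0x10 to 0x27) and takes
raw config-space bytes, such as a read of the device's sysfs `config`
file. The helper names are made up for the example.

```python
import struct
from typing import List

BAR_OFFSET = 0x10  # first BAR in a type 0 configuration header
BAR_COUNT = 6      # type 0 headers carry six 32-bit BAR registers


def read_bars(config: bytes) -> List[int]:
    """Extract the six raw BAR register values from config-space bytes."""
    if len(config) < BAR_OFFSET + 4 * BAR_COUNT:
        raise ValueError("config space snapshot too short to contain the BARs")
    # Config space is little-endian; unpack six consecutive 32-bit words.
    return list(struct.unpack_from("<6I", config, BAR_OFFSET))


def bars_changed(before: bytes, after: bytes) -> List[int]:
    """Return the indices of BARs whose raw value differs between snapshots."""
    pairs = zip(read_bars(before), read_bars(after))
    return [i for i, (old, new) in enumerate(pairs) if old != new]
```

An empty list from `bars_changed` would mean the BARs came back where they
were; any index in the list is a BAR that still needs re-positioning.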

Thanks, Roger.
