
Re: [Xen-devel] [PATCH] libxl: Don't insert PCI device into xenstore for HVM guests



On Tue, Jun 02, 2015 at 11:06:26AM +0100, Malcolm Crossley wrote:
> On 01/06/15 18:55, Konrad Rzeszutek Wilk wrote:
> > On Mon, Jun 01, 2015 at 05:03:14PM +0100, Malcolm Crossley wrote:
> >> On 01/06/15 16:43, Ross Lagerwall wrote:
> >>> On 06/01/2015 04:26 PM, Konrad Rzeszutek Wilk wrote:
> >>>> On Fri, May 29, 2015 at 08:59:45AM +0100, Ross Lagerwall wrote:
> >>>>> When doing passthrough of a PCI device for an HVM guest, don't insert
> >>>>> the device into xenstore, otherwise pciback attempts to use it which
> >>>>> conflicts with QEMU.
> >>>>
> >>>> How does it conflict?
> >>>
> >>> It doesn't work with repeated use. See below.
> >>>
> >>>>>
> >>>>> This manifests itself such that the first time a device is passed to a
> >>>>> domain, it succeeds. Subsequent attempts fail unless the device is
> >>>>> unbound from pciback or the machine rebooted.
> >>>>
> >>>> Can you be more specific please? What are the issues? Why does it
> >>>> fail?
> >>>
> >>> Without this patch, if a device (e.g. a GPU) is bound to pciback and
> >>> then passed through to a guest using xl pci-attach, it appears in the
> >>> guest and works fine. If the guest is rebooted, and the device is again
> >>> passed through with xl pci-attach, it appears in the guest as before but
> >>> does not work. In Windows, it gets something like Error Code 43 and on
> >>> Linux, the Nouveau driver fails to initialize the device (with error -22
> >>> or something). The only way to get the device to work again is to reboot
> >>> the host or unbind and rebind it to pciback.
> >>>
> >>> With this patch, it works as expected. The device is bound to pciback
> >>> and works after being passed through, even after the VM is rebooted.
> >>>
> >>>>
> >>>> There are certain things that pciback does to "prepare" a PCI device
> >>>> which QEMU also does. Others - such as saving the configuration
> >>>> registers (and then restoring them after the device has been detached) -
> >>>> are things that QEMU does not do.
> >>>>
> >>>
> >>> I really have no idea what the correct thing to do is, but the current
> >>> code with qemu-trad doesn't seem to work (for me).

I think I know what the problem is. Do you by any chance have the XSA133-addendum
patch in? If not, could you apply it and tell me if it works?

> >>
> >> The pciback pci_stub.c implements the pciback.hide and the device reset
> >> logic.
> >>
> >> The rest of pciback implements the pciback xenbus device which PV guests
> >> need in order to map/unmap MSI interrupts and access PCI config space.
> >>
> >> QEMU emulates and handles the MSI interrupt capabilities and PCI config
> >> space directly.
> > 
> > Right..
> >>
> >> This is why a pciback xenbus device should not be created for
> >> passthrough PCI device being handled by QEMU.
> > 
> > To me that sounds like saying we should not have PV drivers because QEMU
> > emulates IDE or network devices.
> 
> That is different. We first boot with QEMU handling the devices and then
> we explicitly unplug QEMU's handling of IDE and network devices.
> 
> That handover protocol does not currently exist for PCI passthrough
> devices so we have to choose one mechanism or the other to manage the
> passed through PCI device at boot time. Otherwise an HVM guest could load
> pcifront and cause all kinds of chaos with interrupt management or
> outbound MMIO window management.

Which would be fun! :-)
> 
> > 
> > The crux here is that none of the operations that pciback performs
> > should affect QEMU or guests. But it does - so there is a bug.
> 
> I agree there is a bug but should we try to fix it based upon my
> comments above?

I am still thinking about it. I do like certain things that pciback
does as part of it being notified that a device is to be used by
a guest and performing the configuration save/reset (see
pcistub_put_pci_dev in the pciback).

If somehow that can still be done by libxl (or QEMU) via SysFS
that would be good.

Just to clarify:
 - I concur with you that having xen-pcifront loaded in an HVM
   guest and doing odd things behind QEMU is not good.
 - I like the fact that xen-pciback does a bunch of safety
   things with the PCI device to prepare it for a guest.
 - Currently these 'safety things' are done when you
   'unbind' or 'bind' the device to pciback.
 - Or when the guest is shut down and via XenBus we are told
   and can do the 'safety things'. This is the crux - if there
   is a way to do this via SysFS this would be super.

   Or perhaps xen-pciback can figure out that the guest is HVM
   and ignore any XenBus actions?
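
For reference, the unbind/rebind workaround mentioned earlier in the
thread can be sketched as below. This is only a rough sketch, assuming
the usual Linux SysFS layout where each PCI driver directory exposes
'bind' and 'unbind' files that accept a device address. The helper
names, the 'root' parameter and the example address 0000:01:00.0 are
made up for illustration and are not part of libxl or pciback.

```python
import os

# Assumed default location of the PCI bus in SysFS.
SYSFS_PCI = "/sys/bus/pci"


def driver_ctl_path(action, driver="pciback", root=SYSFS_PCI):
    """Return the path of a driver's 'bind' or 'unbind' control file."""
    if action not in ("bind", "unbind"):
        raise ValueError("action must be 'bind' or 'unbind'")
    return os.path.join(root, "drivers", driver, action)


def rebind(bdf, driver="pciback", root=SYSFS_PCI):
    """Unbind a device from the driver, then bind it again.

    bdf is the device address in domain:bus:device.function form.
    Writing it to 'unbind' and then to 'bind' is the point at which,
    per the discussion above, pciback gets to do its 'safety things'.
    """
    for action in ("unbind", "bind"):
        with open(driver_ctl_path(action, driver, root), "w") as f:
            f.write(bdf)
```

Calling rebind("0000:01:00.0") as root in dom0 would write the address
to the two control files in turn. The 'root' parameter only exists so
the helper can be pointed at a scratch directory for testing.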

> > 
> > I would like to understand which ones do it so I can fix in
> > pciback - as it might also be a problem with PV.
> > 
> > Unless... are you by any chance using extra patches on top of the
> > native pciback?
> 
> We do have extra patches but they only allow us to do an SBR on PCI
> devices which require it. The failure listed above occurs on devices
> with device specific resets (e.g. FLR, D3) as well so those extra patches
> aren't being used.
> 
> > 
> >>
> >> Malcolm
> >>
> >>>
> >>> Regards
> >>
> 



 

