WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/

To: "Laszlo Ersek" <lersek@xxxxxxxxxx>
Subject: Re: [Xen-devel] [RFC PATCH linux-2.6.18-xen] pciback: clean up (MSI-X vec, entrynr) list when resetting PCI device
From: "Jan Beulich" <JBeulich@xxxxxxxxxx>
Date: Wed, 01 Jun 2011 15:59:11 +0100
Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>, Drew Jones <drjones@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Wed, 01 Jun 2011 08:03:23 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <4DE648C3.40602@xxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <4DE60EF8.5060902@xxxxxxxxxx> <4DE653290200007800044DE4@xxxxxxxxxxxxxxxxxx> <4DE648C3.40602@xxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
>>> On 01.06.11 at 16:12, Laszlo Ersek <lersek@xxxxxxxxxx> wrote:
> On 06/01/11 14:56, Jan Beulich wrote:
>>>>> On 01.06.11 at 12:05, Laszlo Ersek <lersek@xxxxxxxxxx> wrote:
> 
>>> domU: igbvf_shutdown()
>>>          igbvf_suspend()
>>>            pci_disable_device()
>>>              pcibios_disable_device()
>>>                pcibios_disable_resources()
>>>                  pci_write_config_word(
>>>                      dev, PCI_COMMAND,
>>>                      orig & ~(PCI_COMMAND_IO | PCI_COMMAND_MEMORY)
>>>                  )
>>>                    ...
>>>                      pcifront_bus_write()
>>>                        do_pci_op(XEN_PCI_OP_conf_write)
>>> dom0:                   pciback_do_op()
>>>                            pciback_config_write()
>>>                              conf_space_write()
>>>                                command_write() [via PCI_COMMAND funcptr]
>>>                                  pci_disable_device()
>>>                                    disable_msi_mode()
>>>                                      dev->msix_enabled = 0;
>>>
>>> The final assignment above precludes c/s 1070 from doing the job.
>>
>> That seems to be a problem in the PCI subsystem, in that
>> disable_msi_mode() is being used from pci_disable_device() (instead
>> of pci_disable_msi{,x}()), and does not seem to be done in a similarly
>> bad way in newer kernels.
> 
> But does linux-2.6.18-xen avoid the problem? I think its 
> pci_disable_device() still calls disable_msi_mode(), and the latter also 
> only reads/writes config words.

Yes, and that's certainly broken (in the native base version, not
particularly in the Xen pieces).

>> So I wonder whether you shouldn't fix
>> pci_disable_device() instead, or alternatively move the vector
>> de-allocation (and then for MSI and MSI-X) into disable_msi_mode().
> 
> Yes, I did think of that, however IMO that would introduce exactly the 
> problem that you describe below. If either pci_disable_device() or 
> disable_msi_mode() frees the vector, then re-enabling the device can 
> face yet another stumbling block.

Why? pci_enable_msi{,x}() calls msi{,x}_capability_init(), which
obtains the vectors.

The problem is that calling disable_msi_mode() alone is always a
mistake; it should have been a static function, just like its
counterpart, enable_msi_mode().

>> While the approach you take covers the guest shutdown/restart
>> case, what if the guest called its pci_disable_device() at runtime,
>> expecting to be able to call pci_enable_{device,msi,msix}() on it
>> again later on? Your newly added function would also be called here,
> 
> May I ask how? The function is only called from pciback_reset_device(), 

I was just looking at the call tree you provided (which is still
quoted at the top). Oh - you're referring to Dom0 calling the
function, whereas I wrote about what happens when the guest calls
its version of it.

>> but in this situation you would have to call into the hypervisor (as
>> the domain is still alive), i.e. you could as well call
>> msi_remove_pci_irq_vectors().
> 
> I considered taking c/s 1070 and simply removing the ifs around the 
> pci_disable_msi[x]() calls. However, msi_get_dev_owner(), called by 
> msi_unmap_pirq(), could find no owner domain for the device and return 
> DOMID_SELF. (This is different in upstream I think, see c/s 680.) Then 
> msi_unmap_pirq() would try to unmap a PIRQ for dom0.

Yeah, when the domain is gone you have this problem. But if you
properly removed the vectors while the domain is still there, you
shouldn't.

>> Additionally, while covering the MSI-X case, I don't see how the
>> similar already-mapped case would be cleanly handled for MSI.
> 
> The target is "cleanup after shutdown in such a way that it doesn't hurt 
> in other cases either", so:
> 
> - it is assumed the hypervisor takes care of the mappings when the 
> domain goes away,
> 
> - dom0 has no filtering list for MSI. msi_capability_init() "simply" 
> asks the hypervisor for a vector, while msix_capability_init() "gets in 
> the way".

That's all true, but by considering only your one case you may end
up making already crappy code worse.

> I'm looking for the best (... least wrong) location to place the cleanup 
> at; c/s 1070 suggested pciback_reset_device().

For that situation this was the natural place, but what you're dealing
with is a live domain that isn't properly cleaned up after during
certain operations, and deferring the cleanup until the domain is
gone is only going to invite further problems (as outlined earlier,
and who knows what other cases we didn't consider).

Hence, while I realize that from a pragmatic point of view your
approach may seem acceptable, I don't really like that it fixes only
one particular aspect of a more general problem.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
