
To: Jan Beulich <JBeulich@xxxxxxxxxx>
Subject: Re: [Xen-devel] [RFC PATCH linux-2.6.18-xen] pciback: clean up (MSI-X vec, entrynr) list when resetting PCI device
From: Laszlo Ersek <lersek@xxxxxxxxxx>
Date: Wed, 01 Jun 2011 17:27:15 +0200
Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>, Drew Jones <drjones@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Wed, 01 Jun 2011 08:26:43 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <4DE66FDF0200007800044EC7@xxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <4DE60EF8.5060902@xxxxxxxxxx> <4DE653290200007800044DE4@xxxxxxxxxxxxxxxxxx> <4DE648C3.40602@xxxxxxxxxx> <4DE66FDF0200007800044EC7@xxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.17) Gecko/20110428 Fedora/3.1.10-1.fc14 Lightning/1.0b3pre Mnenhy/0.8.3 Thunderbird/3.1.10
On 06/01/11 16:59, Jan Beulich wrote:
> On 01.06.11 at 16:12, Laszlo Ersek <lersek@xxxxxxxxxx> wrote:
>> On 06/01/11 14:56, Jan Beulich wrote:
>>> On 01.06.11 at 12:05, Laszlo Ersek <lersek@xxxxxxxxxx> wrote:

>>>> domU: igbvf_shutdown()
>>>>           igbvf_suspend()
>>>>             pci_disable_device()
>>>>               pcibios_disable_device()
>>>>                 pcibios_disable_resources()
>>>>                   pci_write_config_word(
>>>>                       dev, PCI_COMMAND,
>>>>                       orig & ~(PCI_COMMAND_IO | PCI_COMMAND_MEMORY)
>>>>                   )
>>>>                     ...
>>>>                       pcifront_bus_write()
>>>>                         do_pci_op(XEN_PCI_OP_conf_write)
>>>> dom0:                   pciback_do_op()
>>>>                             pciback_config_write()
>>>>                               conf_space_write()
>>>>                                 command_write() [via PCI_COMMAND funcptr]
>>>>                                   pci_disable_device()
>>>>                                     disable_msi_mode()
>>>>                                       dev->msix_enabled = 0;
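
For illustration only (this is not the pcibios_disable_resources() source,
just the effective operation at the top of the trace): the guest clears the
I/O and memory decode bits in PCI_COMMAND, and under pcifront that config
write is what reaches dom0 as XEN_PCI_OP_conf_write.

#include <linux/pci.h>

/* Sketch of the guest-side step that triggers the path above. */
static void disable_decode_sketch(struct pci_dev *dev)
{
        u16 cmd;

        pci_read_config_word(dev, PCI_COMMAND, &cmd);
        cmd &= ~(PCI_COMMAND_IO | PCI_COMMAND_MEMORY);
        pci_write_config_word(dev, PCI_COMMAND, cmd);
}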


>>> So I wonder whether you shouldn't fix
>>> pci_disable_device() instead, or alternatively move the vector
>>> de-allocation (and then for MSI and MSI-X) into disable_msi_mode().

>> Yes, I did think of that, however IMO that would introduce exactly the
>> problem that you describe below. If either pci_disable_device() or
>> disable_msi_mode() frees the vector, then re-enabling the device can
>> face yet another stumbling block.

> Why - pci_enable_msi{,x}() call msi{,x}_capability_init(), which
> obtains the vectors.

Which can fail. I imagined there was a conscious deliberation in the background: hold on to the vectors we already have, just enable/disable generation of the MSIs in the device.
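
A hedged sketch of that reading (the helper name is invented; this is not the
2.6.18 msi.c source): "keep the vectors, only stop the device from generating
MSIs" amounts to toggling the MSI-X Enable bit in the capability's control
word and updating the bookkeeping flag, without releasing anything.

#include <linux/pci.h>

static void msix_set_generation_sketch(struct pci_dev *dev, int enable)
{
        int pos = pci_find_capability(dev, PCI_CAP_ID_MSIX);
        u16 control;

        if (!pos)
                return;

        pci_read_config_word(dev, pos + PCI_MSIX_FLAGS, &control);
        if (enable)
                control |= PCI_MSIX_FLAGS_ENABLE;
        else
                control &= ~PCI_MSIX_FLAGS_ENABLE;
        pci_write_config_word(dev, pos + PCI_MSIX_FLAGS, control);

        /* Vectors stay allocated either way; only generation is toggled. */
        dev->msix_enabled = !!enable;
}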


> The problem is that calling disable_msi_mode() alone is always a
> mistake; this should have been a static function, just like its
> counterpart, enable_msi_mode().

Well this would unify the "disposition" and the "generation", and dev->msix_enabled would have a clear meaning.


>>> While the approach you take covers the guest shutdown/restart
>>> case, what if the guest called its pci_disable_device() at runtime,
>>> expecting to be able to call pci_enable_{device,msi,msix}() on it
>>> again later on? Your newly added function would also be called here,

>> May I ask how? The function is only called from pciback_reset_device(),

> I was just looking at the call tree you provided (and which is still
> quoted at the top). Oh - you're referring to Dom0 calling the
> function, whereas I wrote about what happens when the guest calls
> its version of it.

> Yeah, when the domain is gone you have this problem. But if you
> properly removed the vectors while the domain is still there, you
> shouldn't.

> For that situation this was the natural place, but you're dealing
> with an alive domain not being cleaned up after properly during
> certain operations, and deferring the cleanup until the domain is
> gone is only going to call for further problems (as outlined earlier,
> and who knows what other cases we didn't consider).
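
A hypothetical sketch of "clean up while the domain is still there": the dom0
side could drop its MSI-X (vector, entrynr) bookkeeping at the moment the
guest turns the device off via PCI_COMMAND, i.e. in pciback's command_write()
path from the trace above, rather than waiting for pciback_reset_device().
pciback_msix_cleanup() stands in for the cleanup function added by the RFC
patch (its real name is not shown here), and the real pciback handler is more
involved than this simplified shape.

#include <linux/pci.h>

/* Hypothetical cleanup helper from the RFC patch (name invented). */
extern void pciback_msix_cleanup(struct pci_dev *dev);

static int command_write(struct pci_dev *dev, int offset, u16 value, void *data)
{
        if (!(value & (PCI_COMMAND_IO | PCI_COMMAND_MEMORY))) {
                /*
                 * Guest is disabling the device: release the vectors now,
                 * while the domain still exists.
                 */
                pciback_msix_cleanup(dev);
                pci_disable_device(dev);
                return 0;
        }

        return pci_enable_device(dev);
}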

The original idea (probably the most straightforward one) was to modify the igbvf driver (i.e. the domU) so that it would call igbvf_remove() from igbvf_shutdown(), instead of the current igbvf_suspend(), restricted to the case where the code runs in a PV domU. igbvf_remove() calls igbvf_reset_interrupt_capability(), which in turn calls pci_disable_msix(). The latter percolates through the entire stack and does the right thing.
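
A hedged sketch of that idea, as it might look in the igbvf driver -- not a
proposed patch. xen_pv_domain() is an assumption standing in for whatever
"am I a PV guest" predicate the domU kernel actually provides; igbvf_remove()
and igbvf_suspend() are the driver's existing routines.

static void igbvf_shutdown(struct pci_dev *pdev)
{
        if (xen_pv_domain()) {
                /*
                 * Full teardown: igbvf_remove() ends up calling
                 * pci_disable_msix() via igbvf_reset_interrupt_capability(),
                 * which percolates through pcifront/pciback and releases
                 * the (vector, entrynr) entries in dom0.
                 */
                igbvf_remove(pdev);
                return;
        }

        /* Otherwise keep the existing suspend-style shutdown. */
        igbvf_suspend(pdev, PMSG_SUSPEND);
}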

Or, as you say above, the guest's generic MSI-X-massaging PCI code could be modified.

We didn't want to leave this to the guest, however. It's not that I intentionally try to postpone the cleanup until after the domain has vanished; it's rather that the domU can't be expected or trusted to do the cleanup itself in all cases.

(I realize we can't talk about "security" in this case at all, since passthrough-to-PV doesn't cross the IOMMU and "has the potential to access the DMA address space of dom0, which is a potential security concern" anyway. Or some such.)


> Hence, while I realize that from a pragmatic point of view your
> approach may seem acceptable, the fact that it fixes only a particular
> aspect of a more general problem is not really something I like.

Thank you for the review :)
lacos

