
Re: [Xen-devel] linux-3.9-rc0 regression from 3.8 SATA controller not detected under xen




On 2/27/2013 3:41 PM, Sander Eikelenboom wrote:
Wednesday, February 27, 2013, 8:28:10 PM, you wrote:

On Wed, Feb 27, 2013 at 06:50:59PM +0100, Sander Eikelenboom wrote:
Wednesday, February 27, 2013, 1:54:31 PM, you wrote:

On 27.02.13 at 12:46, Sander Eikelenboom <linux@xxxxxxxxxxxxxx> wrote:
   [   89.338827] ahci: probe of 0000:00:11.0 failed with error -22
Which is -EINVAL. With nothing else printed, I'm afraid you need to
find the origin of this return value by instrumenting the involved
call tree.
Just wondering, are multiple MSIs per device actually supported by Xen?
That is a very good question. I know we support MSI-X, because 1Gb and 10Gb NICs
use it and they work great with Xen.
BTW, this is the merge:
commit 5800700f66678ea5c85e7d62b138416070bf7f60
Merge: 266d7ad af8d102
Author: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Date:   Tue Feb 19 19:07:27 2013 -0800
     Merge branch 'x86-apic-for-linus' of 
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/apic changes from Ingo Molnar:
  "Main changes:
   - Multiple MSI support added to the APIC, PCI and AHCI code - acked
     by all relevant maintainers, by Alexander Gordeev.

     The advantage is that multiple AHCI ports can have multiple MSI
     irqs assigned, and can thus spread to multiple CPUs.

     [ Drivers can make use of this new facility via the
       pci_enable_msi_block_auto() method ]
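
As an aside, here is a minimal sketch of how a driver would use that new
facility; pci_enable_msi_block_auto() and its maxvec semantics come from the
merged series, while the wrapper function and error handling are illustrative
assumptions of mine:

	#include <linux/pci.h>

	/* Illustrative wrapper, not from the tree: request as many
	 * MSIs as the device supports. */
	static int example_enable_multi_msi(struct pci_dev *pdev)
	{
		unsigned int maxvec;	/* set to the device's advertised maximum */
		int nvec;

		/* Returns the number of MSIs actually allocated (per the
		 * MSI spec a power of two), or a negative errno. */
		nvec = pci_enable_msi_block_auto(pdev, &maxvec);
		if (nvec < 0)
			return nvec;

		/* The allocated vectors are consecutive, starting at pdev->irq. */
		return nvec;
	}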


With MSI per device, the hypercall that ends up happening is:
PHYSDEVOP_map_pirq with:
    map_irq.domid = domid;
    map_irq.type = MAP_PIRQ_TYPE_MSI_SEG;
    map_irq.index = -1;
    map_irq.pirq = -1;
    map_irq.bus = dev->bus->number |
                  (pci_domain_nr(dev->bus) << 16);
    map_irq.devfn = dev->devfn;
Which would imply that we are doing this call multiple times?
(This is xen_initdom_setup_msi_irqs).
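
For illustration, a condensed sketch of how that hypercall gets issued (the
field values are as listed above and the names are as in arch/x86/pci/xen.c,
but the memset and error handling here are trimmed-down assumptions, not
verbatim kernel code):

	struct physdev_map_pirq map_irq;
	int ret;

	memset(&map_irq, 0, sizeof(map_irq));	/* remaining fields stay zero */
	map_irq.domid = domid;                  /* domain owning the IRQ */
	map_irq.type = MAP_PIRQ_TYPE_MSI_SEG;   /* MSI, with PCI segment info */
	map_irq.index = -1;                     /* let Xen choose */
	map_irq.pirq = -1;                      /* let Xen choose */
	map_irq.bus = dev->bus->number | (pci_domain_nr(dev->bus) << 16);
	map_irq.devfn = dev->devfn;

	/* Issued once per msi_desc on dev->msi_list, hence the
	 * "multiple times" question above. */
	ret = HYPERVISOR_physdev_op(PHYSDEVOP_map_pirq, &map_irq);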
It looks like pci_enable_msi_block_auto is the multiple-MSI entry point,
and it should percolate down to xen_initdom_setup_msi_irqs.
Granted, xen_initdom_setup_msi_irqs does not do anything with the 'nvec' argument.
So could I ask you to try out your hunch by doing three things:
  1). Instrument xen_initdom_setup_msi_irqs to see if nvec is ever
      anything but 1, and in its loop instrument to see if a device
      has more than one MSI descriptor (see the sketch after this list).
  2). The ahci driver has ahci_init_interrupts, which only does
      the multiple-MSI thing if AHCI_HFLAG_NO_MSI is not set.
      Edit the ahci_port_info entry in drivers/ata/ahci.c for the
      SB600 (or 700?) to have the AHCI_HFLAG_NO_MSI flag (you probably
      want to do this separately from 1).
  3). Check out the tree from before merge 5800700f66678ea5c85e7d62b138416070bf7f60
      and try 266d7ad7f4fe2f44b91561f5b812115c1b3018ab?
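
For 1), something along these lines would do; the exact placement and the
message wording are my assumptions:

	/* At the top of xen_initdom_setup_msi_irqs() in arch/x86/pci/xen.c: */
	struct msi_desc *md;
	int count = 0;

	list_for_each_entry(md, &dev->msi_list, list)
		count++;

	/* Does the multiple-MSI path actually reach us, and with what? */
	dev_info(&dev->dev, "setup_msi_irqs: nvec=%d type=%d msi_descs=%d\n",
		 nvec, type, count);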

So of interest are commits:
- 5ca72c4f7c412c2002363218901eba5516c476b1
- 08261d87f7d1b6253ab3223756625a5c74532293
- 51906e779f2b13b38f8153774c4c7163d412ffd9

Hmmm, reading the commit message of 51906e779f2b13b38f8153774c4c7163d412ffd9:

x86/MSI: Support multiple MSIs in presence of IRQ remapping

The MSI specification has several constraints in comparison with
MSI-X, most notable of them is the inability to configure MSIs
independently. As a result, it is impossible to dispatch
interrupts from different queues to different CPUs. This
largely devalues the support of multiple MSIs in SMP systems.

Also, the necessity to allocate a contiguous block of vector
numbers for devices capable of multiple MSIs might put
considerable pressure on the x86 interrupt vector allocator and
could lead to fragmentation of the interrupt vector space.

This patch overcomes both drawbacks in the presence of IRQ remapping
and lets devices take advantage of multiple queues and per-IRQ
affinity assignments.

At least that makes clear why baremetal does boot and Xen doesn't:

Baremetal behaves differently, and thus boots, because the kernel IOMMU code
disables interrupt remapping at boot due to the buggy-BIOS IOMMU erratum; so,
according to the commit message above, it never even tries the
multiple-MSIs-per-device scenario.

So the question is whether it can be enabled in Xen (and whether it would
actually be beneficial, since the commit message above seems to indicate
that is questionable).
If not, the check in arch/x86/kernel/apic/io_apic.c:setup_msi_irqs should fail
Except that function is not run under Xen. That is because x86_msi_ops.setup_msi_irqs ends up pointing to xen_initdom_setup_msi_irqs, while if the IOMMU (interrupt remapping) is enabled it gets set to irq_remapping_setup_msi_irqs instead.
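
Roughly, the indirection looks like this in the 3.9-rc tree (condensed and
simplified from arch/x86/kernel/x86_init.c, arch/x86/pci/xen.c and the IRQ
remapping code; treat it as a sketch, not verbatim source):

	/* Default table; native_setup_msi_irqs is the io_apic.c path
	 * referenced above. */
	struct x86_msi_ops x86_msi = {
		.setup_msi_irqs = native_setup_msi_irqs,
	};

	int arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
	{
		return x86_msi.setup_msi_irqs(dev, nvec, type);
	}

	/* Xen dom0 setup rewires the pointer... */
	x86_msi.setup_msi_irqs = xen_initdom_setup_msi_irqs;

	/* ...whereas enabling IRQ remapping would instead set: */
	x86_msi.setup_msi_irqs = irq_remapping_setup_msi_irqs;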

So a fix like this:
diff --git a/arch/x86/pci/xen.c b/arch/x86/pci/xen.c
index 56ab749..47f8cca 100644
--- a/arch/x86/pci/xen.c
+++ b/arch/x86/pci/xen.c
@@ -263,6 +263,9 @@ static int xen_initdom_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
        int ret = 0;
        struct msi_desc *msidesc;

+       if (type == PCI_CAP_ID_MSI && nvec > 1)
+               return 1;
+
        list_for_each_entry(msidesc, &dev->msi_list, list) {
                struct physdev_map_pirq map_irq;
                domid_t domid;


(sorry about the paste getting mangled here) - that ought to do it? For example, one of my AMD machines has no IOMMU, and that is exactly where AHCI works under baremetal but not under Xen.

We can later implement a better version of this to deal with multiple MSIs properly, but let's first make sure it boots.
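
Why returning 1 works: a positive return from the arch hook propagates up
through pci_enable_msi_block(), which by convention tells the caller that
fewer MSIs than requested are available, and ahci_init_interrupts then falls
back to a single MSI. A condensed sketch of that caller-side logic
(paraphrased from the 3.9-rc drivers/ata/ahci.c, not a verbatim copy; the
'nvec' parameter stands in for the per-port vector count the real driver
computes from hpriv->cap):

	static int ahci_init_interrupts_sketch(struct pci_dev *pdev,
					       struct ahci_host_priv *hpriv,
					       int nvec)
	{
		int rc;

		if (hpriv->flags & AHCI_HFLAG_NO_MSI)
			goto intx;

		rc = pci_enable_msi_block(pdev, nvec);
		if (rc < 0)
			goto intx;		/* no MSI at all */
		if (rc > 0)
			goto single_msi;	/* the patched Xen hook returns 1,
						 * so we land here */
		return nvec;			/* multiple MSIs granted */

	single_msi:
		if (pci_enable_msi(pdev))
			goto intx;
		return 1;

	intx:
		pci_intx(pdev, 1);		/* fall back to legacy INTx */
		return 0;
	}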
--
Sander





_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel







 

