
[Xen-devel] Re: [PATCH] xen: fix interrupt routing



On 14.06.2011, at 20:18, Stefano Stabellini wrote:

> On Tue, 14 Jun 2011, Jan Kiszka wrote:
>> On 2011-06-14 15:27, Stefano Stabellini wrote:
>>> On Tue, 14 Jun 2011, Alexander Graf wrote:
>>>>>>>>> static int i440fx_load_old(QEMUFile* f, void *opaque, int version_id)
>>>>>>>>> {
>>>>>>>>>  PCII440FXState *d = opaque;
>>>>>>>>> @@ -267,8 +263,17 @@ static PCIBus *i440fx_common_init(const char *device_name,
>>>>>>>>>  d = pci_create_simple(b, 0, device_name);
>>>>>>>>>  *pi440fx_state = DO_UPCAST(PCII440FXState, dev, d);
>>>>>>>>> 
>>>>>>>>> -    piix3 = DO_UPCAST(PIIX3State, dev,
>>>>>>>>> -                      pci_create_simple_multifunction(b, -1, true, "PIIX3"));
>>>>>>>>> +    if (xen_enabled()) {
>>>>>>>>> +        piix3 = DO_UPCAST(PIIX3State, dev,
>>>>>>>>> +                pci_create_simple_multifunction(b, -1, true, "PIIX3-xen"));
>>>>>>>>> +        pci_bus_irqs(b, xen_piix3_set_irq, xen_pci_slot_get_pirq,
>>>>>>>>> +                piix3, XEN_PIIX_NUM_PIRQS);
>>>>>>>> 
>>>>>>>> But with XEN_PIIX_NUM_PIRQS it's not a piix3 anymore, no? What's the 
>>>>>>>> reason behind this change?
>>>>>>> 
>>>>>>> It is still a piix3, but also provides non-legacy interrupt links to the
>>>>>>> IO-APIC.
>>>>>>> The four pins of each PCI device on the bus are not only routed to the
>>>>>>> normal four pirqs (programmed by writing to 0x60-0x63, see above) but
>>>>>>> are also connected to the IO-APIC directly.
>>>>>>> These additional routes can only be discovered through ACPI, so you need
>>>>>>> matching ACPI tables. We used to build the old ACPI tables like this:
>>>>>>> 
>>>>>>> /* PRTA: APIC routing table (via non-legacy IOAPIC GSIs). */
>>>>>>> printf("Name(PRTA, Package() {\n");
>>>>>>> for ( dev = 1; dev < 32; dev++ )
>>>>>>>  for ( intx = 0; intx < 4; intx++ ) /* INTA-D */
>>>>>>>      printf("Package(){0x%04xffff, %u, 0, %u},\n",
>>>>>>>             dev, intx, ((dev*4+dev/8+intx)&31)+16);
>>>>>>> printf("})\n");
>>>>>>> 
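To make the quoted PRTA loop easier to reason about, here is a minimal sketch that evaluates the same expression outside of the table generator. The helper names prta_gsi() and prta_max_gsi() are made up for illustration and are not part of the patch or of the Xen tools; only the arithmetic is taken from the loop above.

```c
/* GSI that the quoted PRTA loop assigns to one PCI device pin.
 * dev  : PCI device (slot) number, 1..31 in the quoted loop
 * intx : 0 = INTA, 1 = INTB, 2 = INTC, 3 = INTD
 * The expression is copied from the quoted printf(); the function
 * name is hypothetical. */
static unsigned prta_gsi(unsigned dev, unsigned intx)
{
    return ((dev * 4 + dev / 8 + intx) & 31) + 16;
}

/* Highest GSI the loop can emit, found by walking the same
 * dev/intx ranges as the quoted code. */
static unsigned prta_max_gsi(void)
{
    unsigned dev, intx, max = 0;

    for (dev = 1; dev < 32; dev++) {
        for (intx = 0; intx < 4; intx++) {
            unsigned gsi = prta_gsi(dev, intx);
            if (gsi > max)
                max = gsi;
        }
    }
    return max;
}
```

With this expression, device 1 INTA lands on GSI 20 and the highest GSI the loop can produce is 47, so every entry sits above the 16 legacy ISA IRQs and the table as written presumes an IOAPIC with at least 48 input pins.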
>>>>>> 
>>>>>> Interesting concept, but completely non-standard and very much
>>>>>> different from real hardware. Please at least add a comment there to
>>>>>> show readers that Xen is doing a hack which is not at all related to
>>>>>> how the PIIX really works.
>>>>> 
>>>>> Isn't this more a function of the "wires" on the motherboard than the
>>>>> PIIX specifically? i.e. this just encodes the permutation of the wires
>>>>> from the PCI slots into the IO-APIC input pins (bypassing the PIIX,
>>>>> which is only used for legacy ISA IRQs i.e. by non-APIC aware OSes)?
>>>> 
>>>> Interrupts with PCI work slightly differently. PCI devices can map 
>>>> (themselves or by software) to one of 4 interrupt lines: INTA, INTB, INTC, 
>>>> INTD. These get converted using PCI host controller specific logic to 4 
>>>> interrupt lines which then go into the IO-APIC.
>>>> 
>>>> The IO-APIC is a chip with a limited number of pins. IIRC it was 24, could 
>>>> be 26 though.
>>> 
>>> The number of redirection entries in the IOAPIC can be discovered by
>>> reading the IOAPICVER register and it is a property of a specific
>>> model of IOAPIC. As a matter of fact Xen's emulated IOAPIC supports more
>>> pins than the most popular IOAPIC used with PIIX3.
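As a rough illustration of the IOAPICVER point, the sketch below decodes a raw register value. It assumes the 82093AA-style layout, where bits 0-7 hold the version and bits 16-23 hold the index of the highest redirection entry; the function names are hypothetical, and reading the register itself (through the IOREGSEL/IOWIN window) is left out.

```c
#include <stdint.h>

/* Number of redirection entries (input pins) encoded in a raw
 * IOAPICVER value. Bits 16-23 hold the index of the highest entry,
 * so the count is that index plus one. Assumes the 82093AA layout. */
static unsigned ioapic_num_pins(uint32_t ioapicver)
{
    return ((ioapicver >> 16) & 0xff) + 1;
}

/* IOAPIC version field, bits 0-7 of the same register. */
static unsigned ioapic_version(uint32_t ioapicver)
{
    return ioapicver & 0xff;
}
```

A value of 0x00170011 decodes to version 0x11 with 24 pins, while 0x002f0011 would decode to 48 pins, so an OS that sizes its tables from this register rather than assuming 24 should cope with the larger count.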
>> 
>> Do real IOAPICs exist with more than 24 pins? Otherwise there is the
>> risk that OSes aren't well prepared for this oddity - specifically not
>> when the chipset is specified to include a 24-pin IOAPIC.
> 
> Linux supports up to 128 pins and as I wrote before all the other OSes
> we tested so far seem to react well.
> 
> 
>>>> I haven't seen a single case where PCI devices have a direct link to the 
>>>> IO-APIC. I also have not seen any PCI host controller that exports more 
>>>> than 4 interrupts. Giving each PCI device its own line, on top of that 
>>>> more than ever could be in real hardware, is a plain hack IMHO.
>>> 
>>> Actually this happens quite often: if I am not mistaken all the GSIs
>>> higher than 15 are actually the result of a direct connection between
>>> an interrupt source and the IOAPIC. I have several on my testboxes.
>> 
>> Except that the interrupt source is the chipset with its PCI bridge, not
>> individual PCI devices.
> 
> That is the most common configuration but it is not the only one: I have
> an ACPI table that has individual PCI devices as source in some test
> boxes.
> In fact there is even an example of it in this good article about
> interrupt routing from the FreeBSD guys (it is the last figure):
> 
> http://people.freebsd.org/~jhb/papers/bsdcan/2007/article/node5.html
> 
> "Figure 6 contains a portion of an example _PRT. Specifically, it
> includes the first entry in the table. This corresponds to the PCI
> interrupt for PCI bus 3, slot 7"
> 
> [...]
> 
> "For APIC mode, the interrupt is routed to GSI 66.  For this machine,
> ACPI assigns a base GSI of 64 to the I/O APIC with an APIC ID of 10.
> Thus, GSI 66 corresponds to pin 2 on that I/O APIC"
> 
> Unless I am missing something I don't think that interrupt is going
> through any PCI bridges...
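The GSI arithmetic in the quoted FreeBSD text can be sketched as follows. This is only an illustration: struct ioapic_desc and gsi_to_pin() are hypothetical names, and the pin count of 24 used in the example below is an assumption, since the article excerpt gives only the base GSI (64) and the APIC ID (10).

```c
/* Hypothetical description of one I/O APIC as ACPI reports it:
 * its APIC ID, the first GSI it serves, and how many pins it has. */
struct ioapic_desc {
    unsigned apic_id;
    unsigned gsi_base;
    unsigned num_pins;
};

/* Translate a global system interrupt number into a pin index on
 * the given I/O APIC. Returns -1 if the GSI is outside the range
 * served by that I/O APIC. */
static int gsi_to_pin(const struct ioapic_desc *io, unsigned gsi)
{
    if (gsi < io->gsi_base || gsi - io->gsi_base >= io->num_pins)
        return -1;
    return (int)(gsi - io->gsi_base);
}
```

For an I/O APIC described as { 10, 64, 24 }, gsi_to_pin() maps GSI 66 to pin 2, which matches the quoted figure. Note that this only says which IOAPIC pin the _PRT entry names; it does not say what wiring sits between the PCI slot and that pin.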

I'm actually not quite sure what exactly he's describing here. But if it's 
bypassing the bus logic, it's not a normal PCI device :). Sure, there are 
special-case devices that also expose a PCI interface. But real PCI cards that 
you plug into the PCI bus can't bypass the interrupt logic of the bus, as 
the only interrupt wires they have go to the bus. And since the PCI adapters we 
use in PC machines in Qemu are all non-special, guests could choke on 
this.

But either way, I won't block the patch as I mentioned before.


Alex




 

