
RE: [Xen-devel] another regression from IRQ handling changes



Jan Beulich wrote:
>>>> Keir Fraser <keir.fraser@xxxxxxxxxxxxx> 22.09.09 11:14 >>>
>> On 22/09/2009 09:53, "Jan Beulich" <JBeulich@xxxxxxxxxx> wrote:
>> 
>>>> If it wasn't broken before, it was probably broken by Xiantao's
>>>> follow-up patch to fix NetBSD dom0 (at least as much as possible,
>>>> to avoid a nasty regression with NetBSD). What we probably need to
>>>> do is have a 256-entry dom0_vector_to_dom0_irq[] array, and
>>>> allocate an entry from that for every fresh irq we see at
>>>> PHYSDEVOP_alloc_irq_vector. Then when the vector gets passed back
>>>> in on ioapic writes, we index into that array rather than using
>>>> naked rte.vector. 
>>> 
>>> Yeah, that's what I would view as the solution to get the old
>>> functionality back. But my question also extended to possible
>>> solutions for getting beyond 256 here, especially ones that would
>>> also be acceptable to the pv-ops Dom0, which I'm much less certain
>>> about.
>> 
>> Well, we could assume that if we see an irq greater than 256 at
>> PHYSDEVOP_alloc_irq_vector then the dom0 is dealing in GSIs, and in
>> that case the 'vector' we return and then gets passed to
>> ioapic_write is rather redundant. We can work out the GSI from the
>> ioapic rte that is being modified, after all. So perhaps we could
>> just flip into a non-legacy mode of operation in that case (perhaps
>> reserve a single magic 'vector' value to return from
>> PHYSDEVOP_alloc_irq_vector in this case, and if we see that value in
>> the ioapic write handler then we know to assume that guest pirq ==
>> gsi). 
> 
> I wouldn't base this on the passed-in IRQ number, but instead on the
> number of IRQs mapped - if the proposed array doesn't have a spare
> slot anymore, switch to passing back the magic vector (and assume
> pirq == irq in ioapic_guest_write() - we could even check that this
> relation holds for all previously enabled IRQs, and fail
> PHYSDEVOP_alloc_irq_vector if the array got exhausted while Dom0
> hadn't been using only GSIs before). But besides that detail your
> idea sounds fine, at least from a Linux perspective.
> 
> Are you planning on getting this done, or should I?
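
To make the scheme described above concrete, here is a rough standalone
sketch. The dom0_vector_to_dom0_irq[] name comes from Keir's mail; the
helper names, the reserved magic value and the exhaustion fallback are
assumptions for illustration only, not code from the tree:

/*
 * Rough standalone sketch of the table-based mapping plus the "magic
 * vector" fallback discussed above.  Only dom0_vector_to_dom0_irq[]
 * is a name from the thread; everything else is made up.
 */
#include <stdio.h>

#define NR_DOM0_VECTORS   256
#define MAGIC_GSI_VECTOR  (NR_DOM0_VECTORS - 1)  /* reserved: means "pirq == gsi" */

static int dom0_vector_to_dom0_irq[NR_DOM0_VECTORS];
static int nr_vectors_used = 1;                   /* slot 0 left unused */

/* PHYSDEVOP_alloc_irq_vector side: hand dom0 a small per-domain
 * "vector" and remember which irq it stands for. */
static int alloc_dom0_vector(int irq)
{
    int v;

    for (v = 1; v < nr_vectors_used; v++)
        if (dom0_vector_to_dom0_irq[v] == irq)
            return v;                             /* already mapped */

    if (nr_vectors_used >= MAGIC_GSI_VECTOR)
        return MAGIC_GSI_VECTOR;                  /* table full: switch to pirq == gsi */

    v = nr_vectors_used++;
    dom0_vector_to_dom0_irq[v] = irq;
    return v;
}

/* ioapic_guest_write() side: translate whatever dom0 wrote into
 * rte.vector back into the pirq it stands for. */
static int vector_to_pirq(int rte_vector, int gsi)
{
    if (rte_vector == MAGIC_GSI_VECTOR)
        return gsi;                               /* non-legacy mode: pirq == gsi */
    if (rte_vector <= 0 || rte_vector >= nr_vectors_used)
        return -1;
    return dom0_vector_to_dom0_irq[rte_vector];   /* legacy (NetBSD) mode */
}

int main(void)
{
    int v = alloc_dom0_vector(300);               /* a GSI above 255 */
    printf("irq 300 -> vector %d -> pirq %d\n", v, vector_to_pirq(v, 300));
    return 0;
}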

There's no need to handle this in such a complex way; I think the following
patch should fix the potential issue.
For Linux dom0, we shouldn't depend on the value of rte.vector at all when
the GSI irq is >= 256: just make pirq equal to irq and then build the
pirq-to-irq mapping.

diff -r 3a71e070e3c5 xen/arch/x86/io_apic.c
--- a/xen/arch/x86/io_apic.c    Fri Sep 18 14:45:40 2009 +0100
+++ b/xen/arch/x86/io_apic.c    Tue Sep 22 18:03:50 2009 +0800
@@ -2195,9 +2195,13 @@ int ioapic_guest_write(unsigned long phy

     /* Since PHYSDEVOP_alloc_irq_vector is dummy, rte.vector is the pirq
        which corresponds to this ioapic pin, retrieve it for building
-       pirq and irq mapping.
-     */
-    pirq = rte.vector;
+       pirq and irq mapping, and this is only for NetBSD dom0. For Linux
+       dom0, pirq == irq at any time.
+     */
+    if (irq >= 256)
+        pirq = irq;
+    else
+        pirq = rte.vector; /* Make NetBSD dom0 work. */
     if(pirq < 0 || pirq >= dom0->nr_pirqs)
         return -EINVAL;
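
The reason the 256 cut-off is safe is that the IO-APIC RTE vector field is
only 8 bits wide, so rte.vector can never have carried a pirq of 256 or
more in the first place; once the GSI reaches 256 the pirq has to come from
the pin itself. A standalone restatement of the check the patch adds, where
rte_to_pirq() and nr_pirqs are made-up names for illustration:

/*
 * Restatement of the decision added to ioapic_guest_write(); only the
 * 8-bit RTE vector field and the irq >= 256 cut-off come from the patch.
 */
#include <stdio.h>

static int rte_to_pirq(unsigned int gsi, unsigned int rte_vector,
                       unsigned int nr_pirqs)
{
    unsigned int pirq;

    if (gsi >= 256)
        pirq = gsi;          /* Linux dom0: pirq == irq == GSI */
    else
        pirq = rte_vector;   /* NetBSD dom0: pirq was stashed in rte.vector */

    return pirq < nr_pirqs ? (int)pirq : -1;
}

int main(void)
{
    /* rte.vector is 8 bits, so GSI 300 would be truncated to 44 there;
     * the GSI itself has to be used instead once it exceeds 255. */
    printf("gsi 300 -> pirq %d\n", rte_to_pirq(300, 300 & 0xff, 1024));
    printf("gsi  40 -> pirq %d\n", rte_to_pirq(40, 40, 1024));
    return 0;
}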


> 
>> The legacy case is just to handle NetBSD, which throws non-GSIs at
>> the PHYSDEVOP_alloc_irq_vector interface, and I doubt it will have
>> worked with those mega-big systems at any time. So I don't think
>> we're dealing with a regression there.
> 
> Hmm, no, the regression is with (newer?) Linux, which should have been
> capable of dealing with interrupts coming in on IO-APIC pins above 256
> before that change.


