WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
Xen 
 
Home Products Support Community News
 
   
 

xen-devel

RE: [Xen-devel] another regression from IRQ handling changes

To: Jan Beulich <JBeulich@xxxxxxxxxx>, Keir Fraser <keir.fraser@xxxxxxxxxxxxx>
Subject: RE: [Xen-devel] another regression from IRQ handling changes
From: "Zhang, Xiantao" <xiantao.zhang@xxxxxxxxx>
Date: Tue, 22 Sep 2009 18:10:10 +0800
Accept-language: en-US
Acceptlanguage: en-US
Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Tue, 22 Sep 2009 03:11:13 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <4AB8BAC60200007800016374@xxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <4AB8AC9D020000780001631A@xxxxxxxxxxxxxxxxxx> <C6DE5419.1575F%keir.fraser@xxxxxxxxxxxxx> <4AB8BAC60200007800016374@xxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: Aco7apSspVc2SDYcQY6wBSWmzvvNqQAAXT0A
Thread-topic: [Xen-devel] another regression from IRQ handling changes
Jan Beulich wrote:
>>>> Keir Fraser <keir.fraser@xxxxxxxxxxxxx> 22.09.09 11:14 >>>
>> On 22/09/2009 09:53, "Jan Beulich" <JBeulich@xxxxxxxxxx> wrote:
>> 
>>>> If it wasn't broken before, it was probably broken by Xiantao's
>>>> follow-up patch to fix NetBSD dom0 (at least as much as possible,
>>>> to avoid a nasty regression with NetBSD). What we probably need to
>>>> do is have a 256-entry dom0_vector_to_dom0_irq[] array, and
>>>> allocate an entry from that for every fresh irq we see at
>>>> PHYSDEVOP_alloc_irq_vector. Then when the vector gets passed back
>>>> in on ioapic writes, we index into that array rather than using
>>>> naked rte.vector. 
>>> 
>>> Yeah, that's what I would view as the solution to get old
>>> functionality back. But my question also extended to possible
>>> solutions to get beyond 256 here, especially such that are also
>>> acceptable to the pv-ops Dom0, which I'm much less certain about.
>> 
>> Well, we could assume that if we see an irq greater than 256 at
>> PHYSDEVOP_alloc_irq_vector then the dom0 is dealing in GSIs, and in
>> that case the 'vector' we return and then gets passed to
>> ioapic_write is rather redundant. We can work out the GSI from the
>> ioapic rte that is being modified, after all. So perhaps we could
>> just flip into a non-legacy mode of operation in that case (perhaps
>> reserve a single magic 'vector' value to return from
>> PHYSDEVOP_alloc_irq_vector in this case, and if we see that value in
>> the ioapic write handler then we know to assume that guest pirq ==
>> gsi). 
> 
> I wouldn't base this on the passed in IRQ number, but instead on the
> number of IRQs mapped - if the proposed array doesn't have a spare
> slot anymore, switch to passing back the magic vector (and assume
> pirq == irq in ioapic_guest_write() - we could even add checking of
> that for all previously enabled IRQs this relation is true, and fail
> PHYSDEVOP_alloc_irq_vector if the array got exhausted and Dom0
> didn't use only GSIs before). But besides that detail your idea sounds
> fine at least from a Linux perspective.
> 
> Are you planning on getting this done, or should I?

Don't be so complex to handle it. I think the following patch should fix the 
potential issue. 
For linux dom0, it shouldn't depend on the value of rte.vector at all when GSI 
irq > 256, and just make pirq equal to irq and then build the pirq and irq 
mapping.  

diff -r 3a71e070e3c5 xen/arch/x86/io_apic.c
--- a/xen/arch/x86/io_apic.c    Fri Sep 18 14:45:40 2009 +0100
+++ b/xen/arch/x86/io_apic.c    Tue Sep 22 18:03:50 2009 +0800
@@ -2195,9 +2195,13 @@ int ioapic_guest_write(unsigned long phy

     /* Since PHYSDEVOP_alloc_irq_vector is dummy, rte.vector is the pirq
        which corresponds to this ioapic pin, retrieve it for building
-       pirq and irq mapping.
-     */
-    pirq = rte.vector;
+       pirq and irq mapping, and this is only for NetBSD dom0. For Linux
+       dom0, pirq == irq at any time.
+     */
+    if (irq >= 256)
+        pirq = irq;
+    else
+        pirq = rte.vector; /* Make NetBSD dom0 work. */
     if(pirq < 0 || pirq >= dom0->nr_pirqs)
         return -EINVAL;


> 
>> The legacy case is just to handle NetBSD, which throws non-GSIs at
>> the PHYSDEVOP_alloc_irq_vector interface, and I doubt it will have
>> worked with those mega-big systems at any time. So I don't think
>> we're dealing with a regression there.
> 
> Hmm, no, the regression is with (newer?) Linux, which should have been
> capable of dealing with interrupts coming in on IO-APIC pins above 256
> before that change.



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel