WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
Xen 
 
Home Products Support Community News
 
   
 

xen-devel

[Xen-devel] Re: Linux patch "translate Xen-provided PIRQs"

To: "Keir Fraser" <keir.fraser@xxxxxxxxxxxxx>
Subject: [Xen-devel] Re: Linux patch "translate Xen-provided PIRQs"
From: "Jan Beulich" <jbeulich@xxxxxxxxxx>
Date: Mon, 08 Sep 2008 12:17:55 +0100
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
Delivery-date: Mon, 08 Sep 2008 04:17:56 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <C4E83A6B.1CE8D%keir.fraser@xxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <C4E80558.1CE83%keir.fraser@xxxxxxxxxxxxx> <C4E83A6B.1CE8D%keir.fraser@xxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
>>> Keir Fraser <keir.fraser@xxxxxxxxxxxxx> 06.09.08 14:47 >>>
>On 6/9/08 10:00, "Keir Fraser" <keir.fraser@xxxxxxxxxxxxx> wrote:
>
>> I reverted this from the tree since it breaks the -xenU build and also mainly
>> because I thought it was responsible for breaking our automated tests.
>> Actually I'm not so sure about the latter, and the build breakage is easily
>> fixed. I'll do some more investigation on Monday and reinstate the patch 
>> (with
>> build fix) if it looks okay to do so.
>> 
>>  -- Keir
>
>Jan,
>
>I fixed the build failure easily enough by making the ifdef-else'ed
>definition of NR_PIRQS check for defined(MAX_IO_APICS).
>
>However, save/restore is definitely broken (I got the same result with both
>default -xen and -xenU configs). The appended backtrace is reported
>immediately upon resume (it's the only BUG_ON in restore_cpu_ipis()). I
>don't get this failure with the patch reverted (although automated tests are
>still failing save/restore for some unknown reason; but it works for me
>manually).

Pretty easily fixed - there was another place where the width of the event
channel part of the irq_info array was hard-coded to 16 bits, which I
overlooked. New patch below.

>I'm not personally bothered enough about this patch to fix it myself (I know
>that Jeremy has other approaches in pv_ops, which has to be what we move to
>during 3.4 development), so this is FYI if you want to fix it up for the
>2.6.18 tree.

So you really think this will not only be available, but also stable and at
feature parity with the current tree within about half a year? I've got
other significant changes to event channel handling pending that I would
only bother to make apply to 2.6.18 if there's really going to be any use
(and testing) for them in that tree. They mostly have to do with the
scalability issue of currently binding per-CPU IPIs and VIRQs to one
(kernel) IRQ for each CPU, whereas they should really use a single per-CPU
IRQ to bind all event channels resulting from the same IPI/VIRQ (Jeremy
indicated he had something similar in mind for pv-ops).

Jan

****************************************************

Previously, the kernel depended upon Xen's NR_IRQS to be no larger than
the kernel's NR_PIRQS.

As usual, written and tested on 2.6.27-rc5 and made apply to the 2.6.18
tree without further testing.

Signed-off-by: Jan Beulich <jbeulich@xxxxxxxxxx>

Index: head-2008-09-01/arch/i386/kernel/io_apic-xen.c
===================================================================
--- head-2008-09-01.orig/arch/i386/kernel/io_apic-xen.c 2008-08-22 
15:28:26.000000000 +0200
+++ head-2008-09-01/arch/i386/kernel/io_apic-xen.c      2008-09-03 
10:47:24.000000000 +0200
@@ -52,6 +52,7 @@
 
 #include <xen/interface/xen.h>
 #include <xen/interface/physdev.h>
+#include <xen/evtchn.h>
 
 /* Fake i8259 */
 #define make_8259A_irq(_irq)     (io_apic_irqs &= ~(1UL<<(_irq)))
@@ -1275,7 +1276,7 @@ static void ioapic_register_intr(int irq
        set_intr_gate(vector, interrupt[idx]);
 }
 #else
-#define ioapic_register_intr(_irq,_vector,_trigger) ((void)0)
+#define ioapic_register_intr(irq, vector, trigger) evtchn_register_pirq(irq)
 #endif
 
 static void __init setup_IO_APIC_irqs(void)
Index: head-2008-09-01/arch/x86_64/kernel/io_apic-xen.c
===================================================================
--- head-2008-09-01.orig/arch/x86_64/kernel/io_apic-xen.c       2008-08-22 
15:28:26.000000000 +0200
+++ head-2008-09-01/arch/x86_64/kernel/io_apic-xen.c    2008-09-03 
10:47:28.000000000 +0200
@@ -81,6 +81,7 @@ int sis_apic_bug; /* not actually suppor
 
 #include <xen/interface/xen.h>
 #include <xen/interface/physdev.h>
+#include <xen/evtchn.h>
 
 /* Fake i8259 */
 #define make_8259A_irq(_irq)     (io_apic_irqs &= ~(1UL<<(_irq)))
@@ -840,7 +841,7 @@ static void ioapic_register_intr(int irq
        set_intr_gate(vector, interrupt[idx]);
 }
 #else
-#define ioapic_register_intr(_irq,_vector,_trigger) ((void)0)
+#define ioapic_register_intr(irq, vector, trigger) evtchn_register_pirq(irq)
 #endif /* !CONFIG_XEN */
 
 static void __init setup_IO_APIC_irqs(void)
Index: head-2008-09-01/drivers/pci/msi-xen.c
===================================================================
--- head-2008-09-01.orig/drivers/pci/msi-xen.c  2008-09-03 13:17:09.000000000 
+0200
+++ head-2008-09-01/drivers/pci/msi-xen.c       2008-09-03 14:38:24.000000000 
+0200
@@ -17,6 +17,8 @@
 #include <linux/pci.h>
 #include <linux/proc_fs.h>
 
+#include <xen/evtchn.h>
+
 #include <asm/errno.h>
 #include <asm/io.h>
 #include <asm/smp.h>
@@ -231,13 +233,15 @@ static int msi_unmap_pirq(struct pci_dev
        int rc;
 
        unmap.domid = msi_get_dev_owner(dev);
-       unmap.pirq = pirq;
+       unmap.pirq = evtchn_get_xen_pirq(pirq);
 
        if ((rc = HYPERVISOR_physdev_op(PHYSDEVOP_unmap_pirq, &unmap)))
                printk(KERN_WARNING "unmap irq %x failed\n", pirq);
 
        if (rc < 0)
                return rc;
+
+       evtchn_map_pirq(pirq, 0);
        return 0;
 }
 
@@ -272,7 +276,7 @@ static int msi_map_pirq_to_vector(struct
        map_irq.domid = domid;
        map_irq.type = MAP_PIRQ_TYPE_MSI;
        map_irq.index = -1;
-       map_irq.pirq = pirq;
+       map_irq.pirq = pirq < 0 ? -1 : evtchn_get_xen_pirq(pirq);
        map_irq.bus = dev->bus->number;
        map_irq.devfn = dev->devfn;
        map_irq.entry_nr = entry_nr;
@@ -283,8 +287,12 @@ static int msi_map_pirq_to_vector(struct
 
        if (rc < 0)
                return rc;
+       /* This happens when MSI support is not enabled in Xen. */
+       if (rc == 0 && map_irq.pirq < 0)
+               return -ENOSYS;
 
-       return map_irq.pirq;
+       BUG_ON(map_irq.pirq <= 0);
+       return evtchn_map_pirq(pirq, map_irq.pirq);
 }
 
 static int msi_map_vector(struct pci_dev *dev, int entry_nr, u64 table_base)
Index: head-2008-09-01/drivers/xen/core/evtchn.c
===================================================================
--- head-2008-09-01.orig/drivers/xen/core/evtchn.c      2008-09-03 
13:17:09.000000000 +0200
+++ head-2008-09-01/drivers/xen/core/evtchn.c   2008-09-03 16:05:34.000000000 
+0200
@@ -83,13 +83,27 @@ enum {
        IRQT_VIRQ,
        IRQT_IPI,
        IRQT_LOCAL_PORT,
-       IRQT_CALLER_PORT
+       IRQT_CALLER_PORT,
+       _IRQT_COUNT
 };
 
+#define _IRQT_BITS 4
+#define _EVTCHN_BITS 12
+#define _INDEX_BITS (32 - _IRQT_BITS - _EVTCHN_BITS)
+
 /* Constructor for packed IRQ information. */
 static inline u32 mk_irq_info(u32 type, u32 index, u32 evtchn)
 {
-       return ((type << 24) | (index << 16) | evtchn);
+       BUILD_BUG_ON(_IRQT_COUNT > (1U << _IRQT_BITS));
+
+       BUILD_BUG_ON(NR_PIRQS > (1U << _INDEX_BITS));
+       BUILD_BUG_ON(NR_VIRQS > (1U << _INDEX_BITS));
+       BUILD_BUG_ON(NR_IPIS > (1U << _INDEX_BITS));
+       BUG_ON(index >> _INDEX_BITS);
+
+       BUILD_BUG_ON(NR_EVENT_CHANNELS > (1U << _EVTCHN_BITS));
+
+       return ((type << (32 - _IRQT_BITS)) | (index << _EVTCHN_BITS) | evtchn);
 }
 
 /* Convenient shorthand for packed representation of an unbound IRQ. */
@@ -102,17 +116,17 @@ static inline u32 mk_irq_info(u32 type, 
 
 static inline unsigned int evtchn_from_irq(int irq)
 {
-       return (u16)(irq_info[irq]);
+       return irq_info[irq] & ((1U << _EVTCHN_BITS) - 1);
 }
 
 static inline unsigned int index_from_irq(int irq)
 {
-       return (u8)(irq_info[irq] >> 16);
+       return (irq_info[irq] >> _EVTCHN_BITS) & ((1U << _INDEX_BITS) - 1);
 }
 
 static inline unsigned int type_from_irq(int irq)
 {
-       return (u8)(irq_info[irq] >> 24);
+       return irq_info[irq] >> (32 - _IRQT_BITS);
 }
 
 /* IRQ <-> VIRQ mapping. */
@@ -882,6 +896,60 @@ static struct irq_chip dynirq_chip = {
        .retrigger = resend_irq_on_evtchn,
 };
 
+void evtchn_register_pirq(int irq)
+{
+       irq_info[irq] = mk_irq_info(IRQT_PIRQ, irq, 0);
+}
+
+#ifndef CONFIG_X86_IO_APIC
+#undef IO_APIC_IRQ
+#define IO_APIC_IRQ(irq) ((irq) >= pirq_to_irq(16))
+#endif
+
+int evtchn_map_pirq(int irq, int xen_pirq)
+{
+       if (irq < 0) {
+               static DEFINE_SPINLOCK(irq_alloc_lock);
+
+               irq = pirq_to_irq(NR_PIRQS - 1);
+               spin_lock(&irq_alloc_lock);
+               do {
+                       if (!IO_APIC_IRQ(irq))
+                               continue;
+                       if (!index_from_irq(irq)) {
+                               BUG_ON(type_from_irq(irq) != IRQT_UNBOUND);
+                               irq_info[irq] = mk_irq_info(IRQT_PIRQ,
+                                                           xen_pirq, 0);
+                               break;
+                       }
+               } while (--irq);
+               spin_unlock(&irq_alloc_lock);
+               if (irq < pirq_to_irq(16))
+                       return -ENOSPC;
+       } else if (!xen_pirq) {
+               if (unlikely(type_from_irq(irq) != IRQT_PIRQ))
+                       return -EINVAL;
+               irq_info[irq] = IRQ_UNBOUND;
+               return 0;
+       } else if (type_from_irq(irq) != IRQT_PIRQ
+                  || index_from_irq(irq) != xen_pirq) {
+               printk(KERN_ERR "IRQ#%d is already mapped to %d:%u - "
+                               "cannot map to PIRQ#%u\n",
+                      irq, type_from_irq(irq), index_from_irq(irq), xen_pirq);
+               return -EINVAL;
+       }
+       return index_from_irq(irq) ? irq : -EINVAL;
+}
+
+int evtchn_get_xen_pirq(int irq)
+{
+       if (!IO_APIC_IRQ(irq))
+               return irq;
+       if (unlikely(type_from_irq(irq) != IRQT_PIRQ))
+               return 0;
+       return index_from_irq(irq);
+}
+
 static inline void pirq_unmask_notify(int pirq)
 {
        struct physdev_eoi eoi = { .irq = pirq };
@@ -914,7 +982,7 @@ static unsigned int startup_pirq(unsigne
        if (VALID_EVTCHN(evtchn))
                goto out;
 
-       bind_pirq.pirq  = irq;
+       bind_pirq.pirq = evtchn_get_xen_pirq(irq);
        /* NB. We are happy to share unless we are probing. */
        bind_pirq.flags = probing_irq(irq) ? 0 : BIND_PIRQ__WILL_SHARE;
        if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_pirq, &bind_pirq) != 0) {
@@ -929,7 +997,7 @@ static unsigned int startup_pirq(unsigne
 
        evtchn_to_irq[evtchn] = irq;
        bind_evtchn_to_cpu(evtchn, 0);
-       irq_info[irq] = mk_irq_info(IRQT_PIRQ, irq, evtchn);
+       irq_info[irq] = mk_irq_info(IRQT_PIRQ, bind_pirq.pirq, evtchn);
 
  out:
        unmask_evtchn(evtchn);
@@ -954,7 +1022,7 @@ static void shutdown_pirq(unsigned int i
 
        bind_evtchn_to_cpu(evtchn, 0);
        evtchn_to_irq[evtchn] = -1;
-       irq_info[irq] = IRQ_UNBOUND;
+       irq_info[irq] = mk_irq_info(IRQT_PIRQ, index_from_irq(irq), 0);
 }
 
 static void enable_pirq(unsigned int irq)
@@ -1210,7 +1278,7 @@ static int evtchn_resume(struct sys_devi
 
        /* No IRQ <-> event-channel mappings. */
        for (irq = 0; irq < NR_IRQS; irq++)
-               irq_info[irq] &= ~0xFFFF; /* zap event-channel binding */
+               irq_info[irq] &= ~((1U << _EVTCHN_BITS) - 1);
        for (evtchn = 0; evtchn < NR_EVENT_CHANNELS; evtchn++)
                evtchn_to_irq[evtchn] = -1;
 
Index: head-2008-09-01/include/asm-i386/mach-xen/irq_vectors.h
===================================================================
--- head-2008-09-01.orig/include/asm-i386/mach-xen/irq_vectors.h        
2008-09-03 13:17:09.000000000 +0200
+++ head-2008-09-01/include/asm-i386/mach-xen/irq_vectors.h     2008-09-03 
11:09:40.000000000 +0200
@@ -37,7 +37,13 @@
  */
 
 #define PIRQ_BASE              0
-#define NR_PIRQS               256
+#if !defined(MAX_IO_APICS)
+# define NR_PIRQS              (NR_VECTORS + 32 * NR_CPUS)
+#elif NR_CPUS < MAX_IO_APICS
+# define NR_PIRQS              (NR_VECTORS + 32 * NR_CPUS)
+#else
+# define NR_PIRQS              (NR_VECTORS + 32 * MAX_IO_APICS)
+#endif
 
 #define DYNIRQ_BASE            (PIRQ_BASE + NR_PIRQS)
 #define NR_DYNIRQS             256
Index: head-2008-09-01/include/xen/evtchn.h
===================================================================
--- head-2008-09-01.orig/include/xen/evtchn.h   2008-09-03 13:17:09.000000000 
+0200
+++ head-2008-09-01/include/xen/evtchn.h        2008-09-03 14:34:50.000000000 
+0200
@@ -117,6 +117,13 @@ asmlinkage void evtchn_do_upcall(struct 
 /* Entry point for notifications into the userland character device. */
 void evtchn_device_upcall(int port);
 
+/* Mark a PIRQ as unavailable for dynamic allocation. */
+void evtchn_register_pirq(int irq);
+/* Map a Xen-supplied PIRQ to a dynamically allocated one. */
+int evtchn_map_pirq(int irq, int xen_pirq);
+/* Look up a Xen-supplied PIRQ for a dynamically allocated one. */
+int evtchn_get_xen_pirq(int irq);
+
 void mask_evtchn(int port);
 void disable_all_local_evtchn(void);
 void unmask_evtchn(int port);



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel

<Prev in Thread] Current Thread [Next in Thread>