
Re: [PATCH v4 4/8] arm/irq: Migrate IRQs from dying CPUs



Hi,

On 12/11/2025 10:51, Mykyta Poturai wrote:
Move IRQs from dying CPU to the online ones.
Guest-bound IRQs are already handled by the scheduler in the process of
moving vCPUs to active pCPUs, so we only need to handle IRQs used by Xen
itself.

If an IRQ is to be migrated, its affinity is set to a mask of all online
CPUs. With the current GIC implementation, this means they are routed to a
random online CPU. This may cause extra moves if multiple cores are
disabled in sequence, but should prevent all interrupts from piling up
on CPU0 in case of repeated up-down cycles on different cores.

Wouldn't they eventually all move to CPU0 in the case of suspend/resume, or if all the CPUs but CPU0 are turned off and then back on? If so, shouldn't we try to rebalance the interrupts?
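For the record, one possible shape for such a rebalance, purely as an untested illustration and not a request for this series (the helper name below is made up): the existing cpu_callback() notifier in irq.c could, on the CPU-online path, reset the affinity of Xen-owned IRQs to cpu_online_map so the GIC gets a chance to spread them again.

    /* Purely illustrative: re-spread Xen-owned IRQs once a CPU comes back online. */
    static void rebalance_irqs(void)
    {
        unsigned int irq;

        for ( irq = NR_LOCAL_IRQS; irq < NR_IRQS; irq++ )
        {
            struct irq_desc *desc = irq_to_desc(irq);
            unsigned long flags;

            spin_lock_irqsave(&desc->lock, flags);
            /* Only touch IRQs Xen itself uses; guest IRQs follow their vCPUs. */
            if ( desc->action && !test_bit(_IRQ_GUEST, &desc->status) )
                irq_set_affinity(desc, &cpu_online_map);
            spin_unlock_irqrestore(&desc->lock, flags);
        }
    }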


IRQs from CPU 0 are never migrated, as dying CPU 0 means we are either
shutting down completely or entering system suspend.

I can't find a place where __cpu_disable() is called on CPU0. Do you have any pointer? In any case, I am not sure I want to bake that assumption in more places of the code.


This relies on all Xen-used IRQs currently being allocated during init
on CPU 0, and on setup_irq() using smp_processor_id() for the initial
affinity.

Looking at the SMMU driver, we seem to request IRQs at the time the device is attached. So are you sure about this?

This change is not strictly required for correct operation for now, but
it should future-proof CPU hotplug and system suspend support in case some
kind of IRQ balancing is implemented later.

Signed-off-by: Mykyta Poturai <mykyta_poturai@xxxxxxxx>

v3->v4:
* patch introduced
---
  xen/arch/arm/include/asm/irq.h |  2 ++
  xen/arch/arm/irq.c             | 39 ++++++++++++++++++++++++++++++++++
  xen/arch/arm/smpboot.c         |  2 ++
  3 files changed, 43 insertions(+)

diff --git a/xen/arch/arm/include/asm/irq.h b/xen/arch/arm/include/asm/irq.h
index 09788dbfeb..6e6e27bb80 100644
--- a/xen/arch/arm/include/asm/irq.h
+++ b/xen/arch/arm/include/asm/irq.h
@@ -126,6 +126,8 @@ bool irq_type_set_by_domain(const struct domain *d);
  void irq_end_none(struct irq_desc *irq);
  #define irq_end_none irq_end_none
+void evacuate_irqs(unsigned int from);
+
  #endif /* _ASM_HW_IRQ_H */
  /*
   * Local variables:
diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c
index 28b40331f7..b383d71930 100644
--- a/xen/arch/arm/irq.c
+++ b/xen/arch/arm/irq.c
@@ -158,6 +158,45 @@ static int init_local_irq_data(unsigned int cpu)
      return 0;
  }
+static void evacuate_irq(int irq, unsigned int from)

Any reason why the 'irq' is signed?

+{
+    struct irq_desc *desc = irq_to_desc(irq);
+    unsigned long flags;
+
+    /* Don't move irqs from CPU 0 as it is always last to be disabled */

Per above, I am not convinced that we should special case CPU 0. But if we do, then shouldn't this be part of evacuate_irqs() so we don't pointlessly go through all the IRQs?
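I.e. something along these lines (untested, only meant to show where the check could live; it reuses the names from this patch as-is and switches 'irq' to unsigned, per the question above):

    void evacuate_irqs(unsigned int from)
    {
        unsigned int irq;

        /* CPU 0 is assumed to be the last CPU to go down (full shutdown or
         * system suspend), so there is nothing useful to move away from it. */
        if ( from == 0 )
            return;

        for ( irq = NR_LOCAL_IRQS; irq < NR_IRQS; irq++ )
            evacuate_irq(irq, from);

        for ( irq = ESPI_BASE_INTID; irq < ESPI_MAX_INTID; irq++ )
            evacuate_irq(irq, from);
    }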

+    if ( from == 0 )
+        return;
+
+    ASSERT(!cpumask_empty(&cpu_online_map));
+    ASSERT(!cpumask_test_cpu(from, &cpu_online_map));
+
+    spin_lock_irqsave(&desc->lock, flags);
+    if ( likely(!desc->action) )
+        goto out;
+
+    if ( likely(test_bit(_IRQ_GUEST, &desc->status) ||
+                test_bit(_IRQ_MOVE_PENDING, &desc->status)) )
+        goto out;
+
+    if ( cpumask_test_cpu(from, desc->affinity) )
+        irq_set_affinity(desc, &cpu_online_map);

I think it would be worth explaining why we are routing to any CPU online rather than checking whether the affinity has other online CPUs.

Just to note, I don't have strong opinion either way. It mainly needs to be documented.
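FWIW, the "keep the existing affinity when possible" variant would roughly replace the unconditional switch to cpu_online_map with something like this (untested sketch; 'mask' is a local cpumask introduced here, everything else comes from the patch):

    cpumask_t mask;

    cpumask_and(&mask, desc->affinity, &cpu_online_map);

    /* Prefer the CPUs the IRQ was already allowed on; only fall back to
     * any online CPU if none of them is still online. */
    if ( !cpumask_empty(&mask) )
        irq_set_affinity(desc, &mask);
    else
        irq_set_affinity(desc, &cpu_online_map);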

+
+out:
+    spin_unlock_irqrestore(&desc->lock, flags);
+    return;
+}
+
+void evacuate_irqs(unsigned int from)
+{
+    int irq;
+
+    for ( irq = NR_LOCAL_IRQS; irq < NR_IRQS; irq++ )
+        evacuate_irq(irq, from);
+
+    for ( irq = ESPI_BASE_INTID; irq < ESPI_MAX_INTID; irq++ )

AFAICT, irq_to_desc() would not be able to cope with ESPI interrupts when CONFIG_GICV3_ESPI is not set. Has this been tested?
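If it hasn't, the ESPI loop probably wants to be compiled out when the support isn't there, e.g. (untested):

    #ifdef CONFIG_GICV3_ESPI
        for ( irq = ESPI_BASE_INTID; irq < ESPI_MAX_INTID; irq++ )
            evacuate_irq(irq, from);
    #endif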

+        evacuate_irq(irq, from);
+}
+
  static int cpu_callback(struct notifier_block *nfb, unsigned long action,
                          void *hcpu)
  {
diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
index 7f3cfa812e..46b24783dd 100644
--- a/xen/arch/arm/smpboot.c
+++ b/xen/arch/arm/smpboot.c
@@ -425,6 +425,8 @@ void __cpu_disable(void)
      smp_mb();

+    evacuate_irqs(cpu);

I think it would be worth explaining why evacuate_irqs() is called this late in the process.
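Even a short in-code comment would help, e.g. (the wording below is only my guess at the intended reasoning, please adjust if it is wrong):

    /*
     * Run this only after the CPU has been removed from cpu_online_map and
     * the update is visible to all CPUs (the smp_mb() above), so that
     * evacuate_irq() can never pick the dying CPU as a target.
     */
    evacuate_irqs(cpu);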

+
      /* Return to caller; eventually the IPI mechanism will unwind and the
       * scheduler will drop to the idle loop, which will call stop_cpu(). */
  }

Cheers,

--
Julien Grall




 

