
Re: [PATCH] xen/events: Fix Global and Domain VIRQ tracking


  • To: Jason Andryuk <jason.andryuk@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Oleksandr Tyshchenko <oleksandr_tyshchenko@xxxxxxxx>, Chris Wright <chrisw@xxxxxxxxxxxx>, Jeremy Fitzhardinge <jeremy@xxxxxxxxxxxxx>
  • From: Jürgen Groß <jgross@xxxxxxxx>
  • Date: Thu, 14 Aug 2025 09:05:29 +0200
  • Cc: stable@xxxxxxxxxxxxxxx, xen-devel@xxxxxxxxxxxxxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx
  • Delivery-date: Thu, 14 Aug 2025 07:05:35 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 13.08.25 17:03, Jason Andryuk wrote:
On 2025-08-12 15:00, Jason Andryuk wrote:
VIRQs come in three flavors: per-VCPU, per-domain, and global.  The existing
tracking of VIRQs is handled by the per-cpu variable virq_to_irq.

The issue is that bind_virq_to_irq() sets the per_cpu virq_to_irq at
registration time - typically CPU 0.  Later, the interrupt can migrate,
and info->cpu is updated.  When calling unbind_from_irq(), the per-cpu
virq_to_irq is cleared for a different cpu.  If bind_virq_to_irq() is
called again with CPU 0, the stale irq is returned.

This is what needs to be fixed: at migration time, the per_cpu virq_to_irq
entries of both the source and the target cpu need to be updated to reflect
that migration.
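
A minimal sketch of that migration-time update, assuming it would be called
from wherever info->cpu is changed (e.g. bind_evtchn_to_cpu()); the helper
name is illustrative only and not part of the posted patch:

    /* Sketch only: move the virq_to_irq entry when a VIRQ event channel
     * migrates from one vcpu to another.
     */
    static void virq_update_cpu_tracking(struct irq_info *info,
                                         unsigned int new_cpu)
    {
        unsigned int old_cpu = info->cpu;

        if (info->type != IRQT_VIRQ || old_cpu == new_cpu)
            return;

        /* Clear the stale entry on the source CPU ... */
        per_cpu(virq_to_irq, old_cpu)[virq_from_irq(info)] = -1;
        /* ... and record the mapping on the target CPU. */
        per_cpu(virq_to_irq, new_cpu)[virq_from_irq(info)] = info->irq;
    }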

Change the virq_to_irq tracking to use CPU 0 for per-domain and global
VIRQs.  As there can be at most one of each, there is no need for
per-vcpu tracking.  Also, per-domain and global VIRQs need to be
registered on CPU 0 and can later move, so this matches the expectation.

Fixes: e46cdb66c8fc ("xen: event channels")
Cc: stable@xxxxxxxxxxxxxxx
Signed-off-by: Jason Andryuk <jason.andryuk@xxxxxxx>
---
The Fixes: commit is the one that introduced the virq_to_irq per-cpu array.

This was found with the out-of-tree argo driver during suspend/resume.
On suspend, the per-domain VIRQ_ARGO is unbound.  On resume, the driver
attempts to bind VIRQ_ARGO.  The stale irq is returned, but the
WARN_ON(info == NULL || info->type != IRQT_VIRQ) in bind_virq_to_irq()
triggers for NULL info.  The bind fails and execution continues with the
driver trying to clean up by unbinding.  This eventually faults over the
NULL info.
---
  drivers/xen/events/events_base.c | 17 ++++++++++++++++-
  1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index 41309d38f78c..a27e4d7f061e 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -159,7 +159,19 @@ static DEFINE_MUTEX(irq_mapping_update_lock);
  static LIST_HEAD(xen_irq_list_head);
-/* IRQ <-> VIRQ mapping. */
+static bool is_per_vcpu_virq(int virq) {
+    switch (virq) {
+    case VIRQ_TIMER:
+    case VIRQ_DEBUG:
+    case VIRQ_XENOPROF:
+    case VIRQ_XENPMU:
+        return true;
+    default:
+        return false;
+    }
+}
+
+/* IRQ <-> VIRQ mapping.  Global/Domain virqs are tracked in cpu 0.  */
  static DEFINE_PER_CPU(int [NR_VIRQS], virq_to_irq) = {[0 ... NR_VIRQS-1] = -1};
  /* IRQ <-> IPI mapping */
@@ -974,6 +986,9 @@ static void __unbind_from_irq(struct irq_info *info, unsigned int irq)
          switch (info->type) {
          case IRQT_VIRQ:
+            if (!is_per_vcpu_virq(virq_from_irq(info)))
+                cpu = 0;
+
              per_cpu(virq_to_irq, cpu)[virq_from_irq(info)] = -1;
              break;
          case IRQT_IPI:

Thinking about it a little more, bind_virq_to_irq() should force cpu == 0 for per-domain and global VIRQs so that the property holds.  Also virq_to_irq
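
For illustration, such a check in bind_virq_to_irq() could reuse the
is_per_vcpu_virq() helper from the patch above (a sketch, not part of the
posted patch):

    /* Hypothetical addition near the top of bind_virq_to_irq(): per-domain
     * and global VIRQs are always registered and tracked on CPU 0, no matter
     * which cpu the caller passed in.
     */
    if (!is_per_vcpu_virq(virq))
        cpu = 0;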

In Xen's evtchn_bind_virq() there is:

    if ( type != VIRQ_VCPU && vcpu != 0 )
        return -EINVAL;

Making sure in Linux that there is never a violation of that restriction would
require always having an up-to-date table of all possible VIRQs and their
types, which I'd like to avoid.

I think it is the user of the VIRQ who is responsible for ensuring that cpu 0
is passed to bind_virq_to_irq(), as that user knows that such a restriction
applies to the VIRQ in question (or at least should know that).
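
For example, a driver binding a per-domain or global VIRQ would pass cpu 0
itself; VIRQ_CON_RING and the handler name below are just illustrative:

    /* Illustrative caller: bind a non-per-vcpu VIRQ on CPU 0, matching the
     * restriction Xen enforces in evtchn_bind_virq().
     */
    irq = bind_virq_to_irqhandler(VIRQ_CON_RING, 0 /* cpu */,
                                  my_virq_handler, 0, "my-virq", NULL);
    if (irq < 0)
        return irq;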

The VIRQs actually used in the kernel can of course get special handling, as
they are already known and should be used correctly.


Juergen



 

