[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 1/4] xen/arm: gic: Ensure we have an ISB between ack and do_IRQ()

On 10/24/18 10:38 AM, Andrii Anisov wrote:
Hello Julien,

Hi Andrii,

Thank you for the review.

On 23.10.18 21:17, Julien Grall wrote:
Devices that expose their interrupt status registers via system
registers (e.g. Statistical profiling, CPU PMU, DynamIQ PMU, arch timer,
vgic (although unused by Linux), ...)
I guess under vgic you mean GICH registers, is it correct?
I would speculate that GIC v2 registers are memory mapped, not exposed
via system registers.

vGIC is not only about GICv2. It also covers GICv3 that is accessible via system registers.

   rely on a context synchronising
operation on the CPU to ensure that the updated status register is
visible to the CPU when handling the interrupt. This usually happens as
a result of taking the IRQ exception in the first place, but there are
two race scenarios where this isn't the case.

For example, let's say we have two peripherals (X and Y), where Y uses a
system register for its interrupt status.

Case 1:
1. CPU takes an IRQ exception as a result of X raising an interrupt
2. Y then raises its interrupt line, but the update to its system
     register is not yet visible to the CPU
3. The GIC decides to expose Y's interrupt number first in the Ack
4. The CPU runs the IRQ handler for Y, but the status register is stale
But this scenario somehow explains a strange thing I saw during IRQ
latency investigation (optimization attempts) during last weeks.
The strange sequence is as following:
      1. CPU takes an IRQ exception from the guest context
      2. In `enter_hypervisor_head()` function (i.e. in `gic_clear_lrs()`
as for mainline) from some LR registers interrupt statuses are read as
      3. Performing further code, without returning to the guest (i.e.
inside vgic_vcpu_inject_irq()), it happens that we read status INVALID
from the LR we read PENDING before, in step 2. >
Please note that we are tailoring xen based on RELEASE-4.10.0.

As you tailor Xen, I need more details on your modification to be able to provide feedback on the oddness you encounter.

But you are using GICv2, right? If so, you are not using system registers for the vGIC. This is not covered by this patch.

Case 2:
1. CPU takes an IRQ exception as a result of X raising an interrupt
2. CPU reads the interrupt number for X from the Ack register and runs
     its IRQ handler
3. Y raises its interrupt line and the Ack register is updated, but
     again, the update to its system register is not yet visible to the
4. Since the GIC drivers poll the Ack register, we read Y's interrupt
     number and run its handler without a context synchronisation
     operation, therefore seeing the stale register value.

In either case, we run the risk of missing an IRQ. This patch solves the
problem by ensuring that we execute an ISB in the GIC drivers prior
to invoking the interrupt handler.

Based on Linux commit 39a06b67c2c1256bcf2361a1f67d2529f70ab206
"irqchip/gic: Ensure we have an ISB between ack and ->handle_irq".

Signed-off-by: Julien Grall <julien.grall@xxxxxxx>

      This patch is a candidate for backporting up to Xen 4.9.
   xen/arch/arm/gic.c | 2 ++
   1 file changed, 2 insertions(+)

diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 8d7e491060..305fbd66dd 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -388,12 +388,14 @@ void gic_interrupt(struct cpu_user_regs *regs, int is_fiq)
           if ( likely(irq >= 16 && irq < 1020) )
+            isb();
Taking in account that the first GICH accesses are from
`gic_clear_lrs()`, called from `enter_hypervisor_head`, I would suggest
moving isb() there.

               do_IRQ(regs, irq, is_fiq);
           else if ( is_lpi(irq) )
+            isb();


Julien Grall

Xen-devel mailing list



Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.