
Re: [PATCH 0/7] xen/events: bug fixes and some diagnostic aids

To: Julien Grall <julien@xxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx, linux-block@xxxxxxxxxxxxxxx, netdev@xxxxxxxxxxxxxxx, linux-scsi@xxxxxxxxxxxxxxx
From: Jürgen Groß <jgross@xxxxxxxx>
Date: Mon, 8 Feb 2021 10:41:00 +0100
Cc: Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, stable@xxxxxxxxxxxxxxx, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Jens Axboe <axboe@xxxxxxxxx>, Wei Liu <wei.liu@xxxxxxxxxx>, Paul Durrant <paul@xxxxxxx>, "David S. Miller" <davem@xxxxxxxxxxxxx>, Jakub Kicinski <kuba@xxxxxxxxxx>
Delivery-date: Mon, 08 Feb 2021 09:41:05 +0000
List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 08.02.21 10:11, Julien Grall wrote:

Hi Juergen,

On 07/02/2021 12:58, Jürgen Groß wrote:
On 06.02.21 19:46, Julien Grall wrote:
Hi Juergen,

On 06/02/2021 10:49, Juergen Gross wrote:
The first three patches are fixes for XSA-332. They avoid WARN splats
and a performance issue with interdomain events.
Thanks for helping to figure out the problem. Unfortunately, I still reliably see the WARN splat with the latest Linux master (1e0d27fce010) + your first 3 patches.
I am using Xen 4.11 (1c7d984645f9) and dom0 is forced to use the 2L events ABI.
After some debugging, I think I have an idea of what went wrong. The problem happens when the event is initially bound to vCPU0 and is then rebound to a different vCPU.
From the comment in xen_rebind_evtchn_to_cpu(), we are masking the event to prevent it being delivered on an unexpected vCPU. However, I believe the following can happen:
vCPU0                      | vCPU1
                           |
                           | Call xen_rebind_evtchn_to_cpu()
receive event X            |
                           | mask event X
                           | bind to vCPU1
<vCPU descheduled>         | unmask event X
                           |
                           | receive event X
                           |
                           | handle_edge_irq(X)
handle_edge_irq(X)         |  -> handle_irq_event()
                           |   -> set IRQD_IN_PROGRESS
  -> set IRQS_PENDING      |
                           |   -> evtchn_interrupt()
                           |   -> clear IRQD_IN_PROGRESS
                           |  -> IRQS_PENDING is set
                           |  -> handle_irq_event()
                           |   -> evtchn_interrupt()
                           |     -> WARN()
                           |
All the lateeoi handlers expect ONESHOT semantics, and evtchn_interrupt() doesn't tolerate any deviation.
I think the problem was introduced by 7f874a0447a9 ("xen/events: fix lateeoi irq acknowledgment") because the interrupt was disabled previously. Therefore we wouldn't do another iteration in handle_edge_irq().
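The interleaving above can be replayed with a toy, single-threaded C model. This is only a sketch under stated assumptions: every `toy_*` name is hypothetical, `in_progress` and `pending` merely stand in for IRQD_IN_PROGRESS and IRQS_PENDING, and the handler is a generic one-shot consumer rather than the real evtchn_interrupt(). It is not the kernel's handle_edge_irq() code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; all names are illustrative, not kernel APIs. */
struct toy_desc {
    bool in_progress;   /* stands in for IRQD_IN_PROGRESS */
    bool pending;       /* stands in for IRQS_PENDING */
    bool user_enabled;  /* one-shot state kept by the handler */
    int  handler_calls;
    int  warnings;      /* how often the handler would WARN() */
};

/* Stand-in for a one-shot lateeoi handler. */
static void toy_handler(struct toy_desc *d)
{
    d->handler_calls++;
    if (!d->user_enabled)
        d->warnings++;  /* delivered again before being re-enabled */
    d->user_enabled = false;
}

/* Edge flow on a CPU that finds the IRQ already being handled (vCPU0). */
static void toy_edge_irq_busy(struct toy_desc *d)
{
    if (d->in_progress)
        d->pending = true;  /* leave the work to the CPU in the handler */
}

/* Edge flow on the CPU that actually runs the handler (vCPU1). */
static void toy_edge_irq(struct toy_desc *d, bool stale_delivery)
{
    do {
        d->pending = false;
        d->in_progress = true;
        if (stale_delivery) {   /* vCPU0 resumes with its old event */
            toy_edge_irq_busy(d);
            stale_delivery = false;
        }
        toy_handler(d);
        d->in_progress = false;
    } while (d->pending);       /* second iteration re-runs the handler */
}

/* Returns how many would-be WARN()s one run of the flow produces. */
static int toy_warnings(bool stale_delivery)
{
    struct toy_desc d = { .user_enabled = true };

    toy_edge_irq(&d, stale_delivery);
    return d.warnings;
}

/* Usage: only the run with the stale delivery trips the one-shot check. */
static void toy_race_check(void)
{
    assert(toy_warnings(false) == 0);
    assert(toy_warnings(true) == 1);
}
```

In this model the warning appears only when the stale delivery sets `pending` while the handler is in progress, which matches the second evtchn_interrupt() call in the diagram.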
I think you picked the wrong commit for blaming, as this is just
the last patch of the three patches you were testing.
I actually found the right commit to blame, but I copied the information from the wrong shell :/. The bug was introduced by:
c44b849cee8c ("xen/events: switch user event channels to lateeoi model")
Aside from the handlers, I think it may impact the deferred EOI mitigation, because in theory a 3rd vCPU could join the party (let's say vCPU A migrates the event from vCPU B to vCPU C). So info->{eoi_cpu, irq_epoch, eoi_time} could possibly get mangled?
For a fix, we may want to consider holding evtchn_rwlock with write permission. Although, I am not 100% sure this is going to prevent everything.
It will make things worse, as it would violate the locking hierarchy
(xen_rebind_evtchn_to_cpu() is called with the IRQ-desc lock held).
Ah, right.
At first glance I think we'll need a 3rd masking state ("temporarily
masked") in the second patch in order to avoid a race with lateeoi.

In order to avoid the race you outlined above we need an "event is being
handled" indicator checked via test_and_set() semantics in
handle_irq_for_port() and reset only when calling clear_evtchn().
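As a rough illustration of the "event is being handled" indicator, here is a user-space C11 sketch. It assumes a per-event `atomic_flag`; the `toy_*` functions are hypothetical stand-ins for handle_irq_for_port() and clear_evtchn(), not the kernel functions themselves, and kernel code would use its own atomic bit operations rather than `<stdatomic.h>`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-event state; not the kernel's struct irq_info. */
struct toy_evtchn {
    atomic_flag being_handled;  /* "event is being handled" indicator */
    int handler_calls;
};

/*
 * Stand-in for handle_irq_for_port(): claim the event with test-and-set.
 * Returns true if this caller won the claim and ran the handler.
 */
static bool toy_handle_irq_for_port(struct toy_evtchn *e)
{
    if (atomic_flag_test_and_set(&e->being_handled))
        return false;   /* already claimed: drop the duplicate delivery */
    e->handler_calls++;
    return true;
}

/* Stand-in for clear_evtchn(): the only place the indicator is reset. */
static void toy_clear_evtchn(struct toy_evtchn *e)
{
    atomic_flag_clear(&e->being_handled);
}

/* Usage: a duplicate delivery is ignored until the event is cleared. */
static void toy_claim_check(void)
{
    struct toy_evtchn e = { .being_handled = ATOMIC_FLAG_INIT };
    bool first = toy_handle_irq_for_port(&e);   /* handled */
    bool stale = toy_handle_irq_for_port(&e);   /* ignored */

    toy_clear_evtchn(&e);
    bool next = toy_handle_irq_for_port(&e);    /* handled again */

    assert(first && !stale && next);
    assert(e.handler_calls == 2);
}
```

The point of the sketch is only that whichever vCPU loses the test-and-set never reaches the handler, so a one-shot handler sees a single invocation per cleared event.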
It feels like we are trying to work around the IRQ flow we are using (i.e. handle_edge_irq()).


I'm not really sure this is the main problem here. According to your
analysis the main problem is occurring when handling the event, not when
handling the IRQ: the event is being received on two vcpus.

Our problem isn't due to the IRQ still being pending, but due to it being
raised again, which should happen for a one-shot IRQ in the same way.

But maybe I'm misunderstanding your idea.


Juergen

