
Re: [PATCH v6 2/3] xen/evtchn: rework per event channel lock



On 18.11.20 14:19, Jan Beulich wrote:
On 09.11.2020 17:38, Juergen Gross wrote:
Currently the lock for a single event channel needs to be taken with
interrupts off, which causes deadlocks in some cases.

Rework the per event channel lock to be non-blocking for the case of
sending an event and to remove the need for disabling interrupts when
taking the lock.

The lock is needed to avoid races between event channel state changes
(creation, closing, binding) and normal operations (set pending,
[un]masking, priority changes).

Use a rwlock, but with some restrictions:

- Changing the state of an event channel (creation, closing, binding)
   needs to use write_lock(), with an ASSERT() that the lock is taken as
   writer only when the state of the event channel is appropriate either
   before or after the locked region (i.e. free or unbound).

- Sending an event mostly needs to use read_trylock(); if the lock
   cannot be obtained, the operation is omitted. This is needed because
   sending an event can happen with interrupts off (at least in some
   cases).

- Dumping the event channel state for debug purposes uses
   read_trylock(), too, in order to avoid blocking in case the lock is
   taken as writer for a long time.

- All other cases can use read_lock().
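
To make these rules concrete, here is a minimal sketch of the two main
paths (purely illustrative, not the code from the patch; "chn" and the
commented steps stand in for the real call sites in event_channel.c):

    /* Sender path: must not block and may run with interrupts off,
     * so only try to take the lock and drop the event on failure. */
    if ( !read_trylock(&chn->lock) )
        return;                     /* event is not sent */
    /* ... invoke the ->set_pending() hook ... */
    read_unlock(&chn->lock);

    /* State change path (creation, closing, binding): exclusive writer.
     * The port is expected to be free or unbound either before or after
     * this region, which is what the ASSERT() mentioned above checks. */
    write_lock(&chn->lock);
    /* ... update chn->state and related fields ... */
    write_unlock(&chn->lock);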

One of the implications is that racing invocations of ->set_pending()
are now possible for the same port. Beyond what I said in reply to
0/3 already, I'm afraid there are (latent) issues:

1) The update of ->pending (or basically any bitfield in struct
evtchn, or yet more generically any field getting updated in a read-
modify-write fashion) is no longer generally safe in any of the
hooks called with just a read lock held. ->pending itself is not an
issue now merely because it shares storage only with xen_consumer,
which won't get updated once a port has been bound.
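
To illustrate the sharing (the layout below is only an approximation of
the relevant part of struct evtchn):

    /* The two bitfields share a single byte, so updating either one is a
     * byte-wide read-modify-write which can clobber a concurrent update
     * of the other if both updaters hold the lock only as readers. */
    u8 xen_consumer:XEN_CONSUMER_BITS; /* only changes while the port is unbound */
    u8 pending:1;                      /* set from ->set_pending() under read lock */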

This is fragile.

We should put the pending indicator into a dedicated byte.

2) Of two racing sends, one may now complete without the port
actually having been fully recorded as linked in the FIFO code. This
is because the party losing the race of setting EVTCHN_FIFO_LINKED
will return early, without regard to whether the winner has made
enough progress. (Of course this is possible only with an
intermediate queue change, as only then does the lock become
available to the second of the senders early enough.)

No, I don't think this is limited to a queue change. If a caller of
evtchn_fifo_set_pending() is interrupted after setting
EVTCHN_FIFO_PENDING, and a second caller then makes it to setting
EVTCHN_FIFO_LINKED, the first caller won't even try to take the queue
lock, resulting in evtchn_check_pollers() being called before the
event has necessarily been put properly into the queue.
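
For reference, a condensed sketch of the relevant flow in
evtchn_fifo_set_pending() (simplified, with the masking/queue-selection
details abbreviated; only the helpers named here are taken from the
existing code):

    was_pending = guest_test_and_set_bit(d, EVTCHN_FIFO_PENDING, word);

    if ( !guest_test_bit(d, EVTCHN_FIFO_MASKED, word) &&
         !guest_test_and_set_bit(d, EVTCHN_FIFO_LINKED, word) )
    {
        spin_lock_irqsave(&q->lock, flags);
        /* ... link the event into the queue ... */
        spin_unlock_irqrestore(&q->lock, flags);
    }

    if ( !was_pending )
        evtchn_check_pollers(d, port);

With this flow, the loser of the LINKED test-and-set never takes
q->lock, which is the window described above.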

I'd suggest extending the fifo queue lock region in order to mitigate
this problem.
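
One possible shape of such an extension, purely as an illustration of
the idea (not an actual proposal for the follow-up patch):

    /* Take q->lock before the LINKED test-and-set, so that a sender
     * losing that race has to wait until the winner has finished linking
     * the event before it can go on to evtchn_check_pollers(). */
    spin_lock_irqsave(&q->lock, flags);
    if ( !guest_test_bit(d, EVTCHN_FIFO_MASKED, word) &&
         !guest_test_and_set_bit(d, EVTCHN_FIFO_LINKED, word) )
    {
        /* ... link the event into the queue ... */
    }
    spin_unlock_irqrestore(&q->lock, flags);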


I've gone through other functions called from this path and didn't
find any further race potential there, but I'm not entirely certain
I didn't miss anything.

I can prepare a patch if you agree with my ideas.


Juergen
