
Re: [PATCH v3 5/5] evtchn: don't call Xen consumer callback with per-channel lock held



Hi Jan,

On 03/12/2020 10:09, Jan Beulich wrote:
On 02.12.2020 22:10, Julien Grall wrote:
On 23/11/2020 13:30, Jan Beulich wrote:
While there don't look to be any problems with this right now, the lock
order implications from holding the lock can be very difficult to follow
(and may be easy to violate unknowingly). The present callbacks don't
(and no such callback should) have any need for the lock to be held.

However, vm_event_disable() frees the structures used by respective
callbacks and isn't otherwise synchronized with invocations of these
callbacks, so maintain a count of in-progress calls, for evtchn_close()
to wait to drop to zero before freeing the port (and dropping the lock).
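So, if I understand correctly, the idea is roughly the following (a sketch
only, not the actual code from the patch; the close-side wait loop in
particular is my guess at its shape):

    /* evtchn_send(): count the call, then invoke the callback unlocked. */
    if ( consumer_is_xen(rchn) )
    {
        atomic_inc(&rchn->u.interdomain.active_calls);
        evtchn_read_unlock(lchn);
        xen_notification_fn(rchn)(rd->vcpu[rchn->notify_vcpu_id], rport);
        atomic_dec(&rchn->u.interdomain.active_calls);
    }

    /* evtchn_close(): with the channel still locked, wait for in-flight
     * callbacks to drain before freeing the port (and dropping the lock). */
    while ( atomic_read(&chn->u.interdomain.active_calls) )
        cpu_relax();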

AFAICT, this callback is not the only place where the synchronization is
missing in the VM event code.

For instance, vm_event_put_request() can also race against
vm_event_disable().

So shouldn't we handle this issue properly in VM event?

I suppose that's a question to the VM event folks rather than me?

Yes. From my understanding of Tamas's e-mail, they are relying on the monitoring software to do the right thing.

I will refrain from commenting on this approach. However, given that the race is much wider than the event channel, I would recommend not adding more code in the event channel to deal with such a problem.

Instead, this should be fixed in the VM event code when someone has time to harden the subsystem.


---
Should we make this accounting optional, to be requested through a new
parameter to alloc_unbound_xen_event_channel(), or derived from other
than the default callback being requested?

Aside from the VM event, do you see any value for the other callers?

No (albeit I'm not entirely certain about vpl011_notification()'s
needs), hence the consideration. It's unnecessary overhead in
those cases.

I had another look and I think there is a small race in VPL011. It should be easy to fix (I will try to have a look later today).


@@ -781,9 +786,15 @@ int evtchn_send(struct domain *ld, unsig
           rport = lchn->u.interdomain.remote_port;
           rchn  = evtchn_from_port(rd, rport);
           if ( consumer_is_xen(rchn) )
+        {
+            /* Don't keep holding the lock for the call below. */
+            atomic_inc(&rchn->u.interdomain.active_calls);
+            evtchn_read_unlock(lchn);
               xen_notification_fn(rchn)(rd->vcpu[rchn->notify_vcpu_id], rport);
-        else
-            evtchn_port_set_pending(rd, rchn->notify_vcpu_id, rchn);
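(The tail of the hunk isn't quoted here; once the callback has returned, the
count presumably gets dropped again along the lines of

            atomic_dec(&rchn->u.interdomain.active_calls);

which is the atomic_dec() discussed below.)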

atomic_dec() doesn't contain any memory barrier, so we will want one
between xen_notification_fn() and atomic_dec() to avoid re-ordering.

Oh, indeed. But smp_mb() is too heavy-handed here - x86 doesn't
really need any barrier, yet would gain a full MFENCE that way.
Actually - looks like I forgot we gained smp_mb__before_atomic()
a little over half a year ago.

Ah yes, I forgot that atomic instructions are ordered on x86.
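So on the decrement side something like this (just a sketch, assuming the
atomic_dec() directly follows the callback) should be sufficient:

            xen_notification_fn(rchn)(rd->vcpu[rchn->notify_vcpu_id], rport);
            /* Order the callback's effects before the count drop. */
            smp_mb__before_atomic();
            atomic_dec(&rchn->u.interdomain.active_calls);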

Cheers,

--
Julien Grall



 

