[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v4 02/10] evtchn: bind-interdomain doesn't need to hold both domains' event locks





On 09/01/2021 15:41, Julien Grall wrote:
Hi Jan,

On 05/01/2021 13:09, Jan Beulich wrote:
The local domain's lock is needed for the port allocation, but for the
remote side the per-channel lock is sufficient. The per-channel locks
then need to be acquired slightly earlier, though.

I was expecting is little bit more information in the commit message because there are a few changes in behavior with this change:

 1) AFAICT, evtchn_allocate_port() rely on rchn->state to be protected by the rd->event_lock. Now that you dropped the rd->event_lock, rchn->state may be accessed while it is updated in evtchn_bind_interdomain(). The port cannot go back to ECS_FREE here, but I think the access needs to be switched to {read, write}_atomic() or ACCESS_ONCE.

  2) xsm_evtchn_interdomain() is now going to be called without the rd->event_lock. Can you confirm that the lock is not needed by XSM?

Actually, I think there is a bigger issue. evtchn_close() will check chn1->state with just d1->event_lock held (IOW, there chn1->lock is not held).

If the remote domain happen to close the unbound port at the same time the local domain bound it, then you may end up in the following situation:


evtchn_bind_interdomain()        | evtchn_close()
                                 |
                                 |  switch ( chn1->state )
                                 |  case ECS_UNBOUND:
                                 |      /* nothing to do */
   double_evtchn_lock()          |
   rchn->state = ECS_INTERDOMAIN |
   double_evtchn_unlock()        |
                                 |  evtchn_write_lock(chn1)
                                 |  evtchn_free(d1, chn1)
                                 |  evtchn_write_unlock(chn1)

When the local domain will try to close the port, it will hit the BUG_ON(chn2->state != ECS_INTERDOMAIN) because the remote port were already freed.

I think this can be solved by acquiring the event lock earlier on in evtchn_close(). Although, this may become a can of worms as it would be more complex to prevent lock inversion because chn1->lock and chn2->lock.

Cheers,

--
Julien Grall



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.