[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v6 2/3] xen/evtchn: rework per event channel lock



On 10.11.2020 10:47, Julien Grall wrote:
> Hi,
> 
> On 10/11/2020 07:39, Jan Beulich wrote:
>> On 09.11.2020 18:46, Julien Grall wrote:
>>> Hi,
>>>
>>> On 09/11/2020 16:48, Jan Beulich wrote:
>>>> On 09.11.2020 17:38, Juergen Gross wrote:
>>>>> Currently the lock for a single event channel needs to be taken with
>>>>> interrupts off, which causes deadlocks in some cases.
>>>>>
>>>>> Rework the per event channel lock to be non-blocking for the case of
>>>>> sending an event and removing the need for disabling interrupts for
>>>>> taking the lock.
>>>>>
>>>>> The lock is needed for avoiding races between event channel state
>>>>> changes (creation, closing, binding) against normal operations (set
>>>>> pending, [un]masking, priority changes).
>>>>>
>>>>> Use a rwlock, but with some restrictions:
>>>>>
>>>>> - Changing the state of an event channel (creation, closing, binding)
>>>>>     needs to use write_lock(), with ASSERT()ing that the lock is taken as
>>>>>     writer only when the state of the event channel is either before or
>>>>>     after the locked region appropriate (either free or unbound).
>>>>>
>>>>> - Sending an event needs to use read_trylock() mostly, in case of not
>>>>>     obtaining the lock the operation is omitted. This is needed as
>>>>>     sending an event can happen with interrupts off (at least in some
>>>>>     cases).
>>>>>
>>>>> - Dumping the event channel state for debug purposes is using
>>>>>     read_trylock(), too, in order to avoid blocking in case the lock is
>>>>>     taken as writer for a long time.
>>>>>
>>>>> - All other cases can use read_lock().
>>>>>
>>>>> Fixes: e045199c7c9c54 ("evtchn: address races with evtchn_reset()")
>>>>> Signed-off-by: Juergen Gross <jgross@xxxxxxxx>
>>>>> ---
>>>>> V4:
>>>>> - switch to rwlock
>>>>> - add ASSERT() to verify correct write_lock() usage
>>>>>
>>>>> V3:
>>>>> - corrected a copy-and-paste error (Jan Beulich)
>>>>> - corrected unlocking in two cases (Jan Beulich)
>>>>> - renamed evtchn_read_trylock() (Jan Beulich)
>>>>> - added some comments and an ASSERT() for evtchn_write_lock()
>>>>> - set EVENT_WRITE_LOCK_INC to INT_MIN
>>>>>
>>>>> V2:
>>>>> - added needed barriers
>>>>>
>>>>> Signed-off-by: Juergen Gross <jgross@xxxxxxxx>
>>>>
>>>> Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx>
>>>>
>>>> I'll give it overnight for others to possibly comment, but I'd
>>>> like to get this in tomorrow if at all possible.
>>>
>>> IIRC, Citrix originally reported the issues. I would like to give them
>>> an opportunity to test the patches and confirm this effectively fixed
>>> there issues.
>>
>> I would consider waiting longer, but I'd like to get staging unblocked.
> 
> So the plan is to have the patches sitting in staging for a few days and 
> then backport?

Right, the usual thing - wait at least until a push has happened.
Unless we learn of problems with the change, I definitely want to
have this in for 4.14.1.

>> Just like with 52e1fc47abc3a0123 ("evtchn/Flask: pre-allocate node on
>> send path") we can always decide to revert / fix up afterwards. The
>> patch here surely is an improvement, even if it was to turn out it
>> doesn't fix all issues yes. I'd appreciate if you would confirm you
>> don't object to the patch going in - I'll wait a few more hours,
>> perhaps until around noon.
> I would suggest to wait until noon and if you don't get any answer, then 
> merge it.

Okay.

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.