[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] x86/flush: use APIC ALLBUT destination shorthand when possible



On 08.01.2020 14:30, Roger Pau Monné  wrote:
> On Fri, Jan 03, 2020 at 01:55:51PM +0100, Jan Beulich wrote:
>> On 03.01.2020 13:34, Roger Pau Monné wrote:
>>> On Fri, Jan 03, 2020 at 01:08:20PM +0100, Jan Beulich wrote:
>>>> On 24.12.2019 13:44, Roger Pau Monne wrote:
>>>> Further a question on lock nesting: Since the commit message
>>>> doesn't say anything in this regard, did you check there are no
>>>> TLB flush invocations with the get_cpu_maps() lock held?
>>>
>>> The CPU maps lock is a recursive one, so it should be fine to attempt
>>> a TLB flush with the lock already held.
>>
>> When already held by the same CPU - sure. It being a recursive
>> one (which I paid attention to when writing my earlier reply)
>> doesn't make it (together with any other one) immune against
>> ABBA deadlocks, though.
> 
> There's no possibility of a deadlock here because get_cpu_maps does a
> trylock, so if another cpu is holding the lock the flush will just
> fallback to not using the shorthand.

Well, with the _exact_ arrangements (flush_lock used only in one
place, and cpu_add_remove_lock only used in ways similar to how it
is used now), there's no such risk, I agree. But there's nothing
at all making sure this doesn't change. Hence, as said, at the very
least this needs reasoning about in the description (or a code
comment).

>>>> Even if
>>>> you did and even if there are none, I think the function should
>>>> then get a comment attached to the effect of this lock order
>>>> inversion risk. (For example, it isn't obvious to me that no user
>>>> of stop_machine() would ever want to do any kind of TLB flushing.)
>>>>
>>>> Overall I wonder whether your goal couldn't be achieved without
>>>> the extra locking and without the special conditions.
>>>
>>> Hm, this so far has worked fine on all the boxes that I've tried.
>>> I'm happy to change it to a simpler approach, but I think the
>>> conditions and locking are required for this to work correctly.
>>
>> Which might then indicate said pre-existing use of cpu_online_map
>> to be a (latent?) problem.
> 
> Hm, maybe it could be a problem when offlining a CPU. I assume this is
> not an issue so far because there are no global TLB flushes with a
> mask contaning a CPU that is being offlined.
> 
> Regarding the patch itself, do you think the shorthand logic should be
> added to send_IPI_mask, or would you rather only use the shorthand for
> the TLB flushes?

If it could be generalize to all IPI sending, this would of course
seem preferable.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.