[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v2] x86/cpu: Sync any remaining RCU callbacks after CPU up/down



On 25/02/2020 12:46, Igor Druzhinin wrote:
> On 25/02/2020 12:40, Jan Beulich wrote:
>> On 25.02.2020 13:37, Igor Druzhinin wrote:
>>> On 25/02/2020 12:22, Jan Beulich wrote:
>>>> On 21.02.2020 20:26, Igor Druzhinin wrote:
>>>>> On 21/02/2020 16:29, Jan Beulich wrote:
>>>>>> On 19.02.2020 18:25, Igor Druzhinin wrote:
>>>>>>> --- a/xen/arch/x86/sysctl.c
>>>>>>> +++ b/xen/arch/x86/sysctl.c
>>>>>>> @@ -78,8 +78,11 @@ static void l3_cache_get(void *arg)
>>>>>>>  long cpu_up_helper(void *data)
>>>>>>>  {
>>>>>>>      unsigned int cpu = (unsigned long)data;
>>>>>>> -    int ret = cpu_up(cpu);
>>>>>>> +    int ret;
>>>>>>>  
>>>>>>> +    /* Flush potentially scheduled RCU work from preceding CPU offline 
>>>>>>> */
>>>>>>> +    rcu_barrier();
>>>>>>> +    ret = cpu_up(cpu);
>>>>>>>      if ( ret == -EBUSY )
>>>>>>>      {
>>>>>>>          /* On EBUSY, flush RCU work and have one more go. */
>>>>>>> @@ -104,7 +107,11 @@ long cpu_up_helper(void *data)
>>>>>>>  long cpu_down_helper(void *data)
>>>>>>>  {
>>>>>>>      int cpu = (unsigned long)data;
>>>>>>> -    int ret = cpu_down(cpu);
>>>>>>> +    int ret;
>>>>>>> +
>>>>>>> +    /* Flush potentially scheduled RCU work from preceding CPU online 
>>>>>>> */
>>>>>>> +    rcu_barrier();
>>>>>>> +    ret = cpu_down(cpu);
>>>>>>>      if ( ret == -EBUSY )
>>>>>>>      {
>>>>>>>          /* On EBUSY, flush RCU work and have one more go. */
>>>>>>>
>>>>>>
>>>>>> There are more calls to cpu_up() / cpu_down(), most notably in
>>>>>> core_parking.c. Wouldn't it be better to make the barrier part
>>>>>> of the two functions? This would the also cover non-x86 in
>>>>>> case an arch wants to support offlining/onlining of CPUs.
>>>>>
>>>>> Those functions are called from early init code and rcu_barrier() is
>>>>> an expensive operation. I think it's better if caller is responsible
>>>>> for syncing the state. This is the reason I moved rcu_barrier() in front
>>>>> of cpu_up/down.
>>>>
>>>> Well, there are two aspects here: One is to avoid the overhead where
>>>> it's not needed. The other is, as observed on this patch, that by
>>>> the chosen approach callers which in fact need amending may be
>>>> forgotten. To find middle grounds, perhaps the solution is to have
>>>> variants of cpu_{up,down}() which invoke the barrier, and which
>>>> would be used by all runtime invocations?
>>>>
>>>> The other question of course is why early init code is special in
>>>> this regard. If it indeed was, perhaps the barrier invocation could
>>>> also be tied to certain SYS_STATE_* values?
>>>
>>> It's not special - in fact it starts after RCU is initialized. The issue
>>> is, as you said, unnecessary overhead.
>>
>> Well, if it's unnecessary overhead, then it is special in some way.
>> After all at runtime the overhead isn't unnecessary. Is it perhaps
>> just that currently we don't _happen_ to have anything that would
>> make use of an RCU barrier necessary in this case? In which case it
>> would be a problem waiting to bite us down the road?
> 
> I agree to a certain extent that it might be prudent to avoid special
> casing even if we currently know that the case is safe. Let me see
> if overhead is tolerable at CPU bring up on our largest system available
> (448 CPUs).

I didn't see any significant slowdown in boot on 448 CPUs with latest
version of RCU series from Juergen. Will send v3 shortly.

Igor

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.