Re: [Xen-devel] [PATCH v6 1/4] xen/rcu: don't use stop_machine_run() for rcu_barrier()
On 16/03/2020 16:01, Jürgen Groß wrote:
> On 16.03.20 16:24, Igor Druzhinin wrote:
>> On 13/03/2020 13:06, Juergen Gross wrote:
>>> Today rcu_barrier() is calling stop_machine_run() to synchronize all
>>> physical cpus in order to ensure all pending rcu calls have finished
>>> when returning.
>>>
>>> As stop_machine_run() is using tasklets this requires scheduling of
>>> idle vcpus on all cpus imposing the need to call rcu_barrier() on idle
>>> cpus only in case of core scheduling being active, as otherwise a
>>> scheduling deadlock would occur.
>>>
>>> There is no need at all to do the syncing of the cpus in tasklets, as
>>> rcu activity is started in __do_softirq() called whenever softirq
>>> activity is allowed. So rcu_barrier() can easily be modified to use
>>> softirq for synchronization of the cpus no longer requiring any
>>> scheduling activity.
>>>
>>> As there already is an rcu softirq, reuse that for the synchronization.
>>>
>>> Remove the barrier element from struct rcu_data as it isn't used.
>>>
>>> Finally switch rcu_barrier() to return void as it now can never fail.
>>>
>>> Partially-based-on-patch-by: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx>
>>> Signed-off-by: Juergen Gross <jgross@xxxxxxxx>
>>> ---
>>> V2:
>>> - add recursion detection
>>>
>>> V3:
>>> - fix races (Igor Druzhinin)
>>>
>>> V5:
>>> - rename done_count to pending_count (Jan Beulich)
>>> - fix race (Jan Beulich)
>>>
>>> V6:
>>> - add barrier (Julien Grall)
>>> - add ASSERT() (Julien Grall)
>>> - hold cpu_map lock until end of rcu_barrier() (Julien Grall)
>>> ---
>>> xen/common/rcupdate.c | 95 +++++++++++++++++++++++++++++++++-------------
>>> xen/include/xen/rcupdate.h | 2 +-
>>> 2 files changed, 69 insertions(+), 28 deletions(-)
>>>
>>> diff --git a/xen/common/rcupdate.c b/xen/common/rcupdate.c
>>> index 03d84764d2..ed9083d2b2 100644
>>> --- a/xen/common/rcupdate.c
>>> +++ b/xen/common/rcupdate.c
>>> @@ -83,7 +83,6 @@ struct rcu_data {
>>> struct rcu_head **donetail;
>>> long blimit; /* Upper limit on a processed batch */
>>> int cpu;
>>> - struct rcu_head barrier;
>>> long last_rs_qlen; /* qlen during the last resched */
>>> /* 3) idle CPUs handling */
>>> @@ -91,6 +90,7 @@ struct rcu_data {
>>> bool idle_timer_active;
>>> bool process_callbacks;
>>> + bool barrier_active;
>>> };
>>> /*
>>> @@ -143,51 +143,85 @@ static int qhimark = 10000;
>>> static int qlowmark = 100;
>>> static int rsinterval = 1000;
>>> -struct rcu_barrier_data {
>>> - struct rcu_head head;
>>> - atomic_t *cpu_count;
>>> -};
>>> +/*
>>> + * rcu_barrier() handling:
>>> + * cpu_count holds the number of cpus required to finish barrier handling.
>>> + * pending_count is initialized to nr_cpus + 1.
>>> + * Cpus are synchronized via softirq mechanism. rcu_barrier() is regarded to
>>> + * be active if pending_count is not zero. In case rcu_barrier() is called on
>>> + * multiple cpus it is enough to check for pending_count being not zero on entry
>>> + * and to call process_pending_softirqs() in a loop until pending_count drops to
>>> + * zero, before starting the new rcu_barrier() processing.
>>> + * In order to avoid hangs when rcu_barrier() is called multiple times on the
>>> + * same cpu in fast sequence and a slave cpu couldn't drop out of the
>>> + * barrier handling fast enough a second counter pending_count is needed.
>>> + * The rcu_barrier() invoking cpu will wait until pending_count reaches 1
>>> + * (meaning that all cpus have finished processing the barrier) and then will
>>> + * reset pending_count to 0 to enable entering rcu_barrier() again.
>>> + */
>>> +static atomic_t cpu_count = ATOMIC_INIT(0);
>>> +static atomic_t pending_count = ATOMIC_INIT(0);
>>> static void rcu_barrier_callback(struct rcu_head *head)
>>> {
>>> - struct rcu_barrier_data *data = container_of(
>>> - head, struct rcu_barrier_data, head);
>>> - atomic_inc(data->cpu_count);
>>> + smp_wmb(); /* Make all previous writes visible to other cpus. */
>>> + atomic_dec(&cpu_count);
>>> }
>>> -static int rcu_barrier_action(void *_cpu_count)
>>> +static void rcu_barrier_action(void)
>>> {
>>> - struct rcu_barrier_data data = { .cpu_count = _cpu_count };
>>> -
>>> - ASSERT(!local_irq_is_enabled());
>>> - local_irq_enable();
>>> + struct rcu_head head;
>>> /*
>>> * When callback is executed, all previously-queued RCU work on this CPU
>>> - * is completed. When all CPUs have executed their callback, data.cpu_count
>>> - * will have been incremented to include every online CPU.
>>> + * is completed. When all CPUs have executed their callback, cpu_count
>>> + * will have been decremented to 0.
>>> */
>>> - call_rcu(&data.head, rcu_barrier_callback);
>>> + call_rcu(&head, rcu_barrier_callback);
>>> - while ( atomic_read(data.cpu_count) != num_online_cpus() )
>>> + while ( atomic_read(&cpu_count) )
>>> {
>>> process_pending_softirqs();
>>> cpu_relax();
>>> }
>>> - local_irq_disable();
>>> -
>>> - return 0;
>>> + atomic_dec(&pending_count);
>>> }
>>> -/*
>>> - * As rcu_barrier() is using stop_machine_run() it is allowed to be used in
>>> - * idle context only (see comment for stop_machine_run()).
>>> - */
>>> -int rcu_barrier(void)
>>> +void rcu_barrier(void)
>>> {
>>> - atomic_t cpu_count = ATOMIC_INIT(0);
>>> - return stop_machine_run(rcu_barrier_action, &cpu_count, NR_CPUS);
>>> + unsigned int n_cpus;
>>> +
>>> + ASSERT(!in_irq() && local_irq_is_enabled());
>>> +
>>> + for ( ;; )
>>> + {
>>> + if ( !atomic_read(&pending_count) && get_cpu_maps() )
>>> + {
>>
>> If the whole action happens while the cpu maps are held, why do you
>> need to check pending_count first? I think the logic of this loop
>> could be simplified if this were taken into account.
>
> get_cpu_maps() can be successful on multiple cpus (it's a read_lock()).
> Testing pending_count avoids hammering on the cache lines.
I see - the logic was changed recently. I'm currently testing this
version of the patch.
Igor
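
For readers following the exchange above, here is a minimal user-space sketch of the entry check being discussed: read the counter first, and only then try the shared lock. It is not the Xen code. The names map_readers, map_read_trylock(), try_begin_barrier(), end_barrier() and lock_attempts are hypothetical stand-ins, C11 atomics replace Xen's atomic_t, and the compare-and-swap used to claim the barrier is an assumption, since the quoted hunk ends before that step.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the cpu maps lock: >= 0 is the reader count. */
static atomic_int map_readers = 0;
/* Stand-in for pending_count: non-zero means a barrier is in progress. */
static atomic_int pending = 0;
/* Instrumentation only: how often the lock was actually attempted. */
static atomic_int lock_attempts = 0;

/* Read trylock: several callers can succeed at once, like a read_lock(). */
static bool map_read_trylock(void)
{
    int cur = atomic_load(&map_readers);

    while (cur >= 0)
        if (atomic_compare_exchange_weak(&map_readers, &cur, cur + 1))
            return true;
    return false;
}

static void map_read_unlock(void)
{
    atomic_fetch_sub(&map_readers, 1);
}

/* Try to become the barrier initiator for n_cpus cpus. */
static bool try_begin_barrier(int n_cpus)
{
    int expected = 0;

    /* Plain read first: no write to the lock's cache line while busy. */
    if (atomic_load(&pending) != 0)
        return false;

    atomic_fetch_add(&lock_attempts, 1);
    if (!map_read_trylock())
        return false;

    /* Assumption: claim the barrier atomically, as readers can race here. */
    if (atomic_compare_exchange_strong(&pending, &expected, n_cpus + 1))
        return true; /* read lock stays held until end_barrier() */

    map_read_unlock();
    return false;
}

/* Finish the barrier: allow new callers in and drop the read lock. */
static void end_barrier(void)
{
    assert(atomic_load(&pending) != 0);
    atomic_store(&pending, 0);
    map_read_unlock();
}
```

In this sketch a caller that finds pending non-zero returns without touching the lock at all, which is the cache-line argument made in the reply. Because the read lock does not exclude other readers, the counter is what actually serializes barrier initiators.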
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel