[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 2/2] xen/rcu: don't use stop_machine_run() for rcu_barrier()



On Mon, Feb 17, 2020 at 02:17:23PM +0100, Jürgen Groß wrote:
> On 17.02.20 13:49, Roger Pau Monné wrote:
> > On Mon, Feb 17, 2020 at 01:32:59PM +0100, Jürgen Groß wrote:
> > > On 17.02.20 13:17, Roger Pau Monné wrote:
> > > > On Mon, Feb 17, 2020 at 01:11:59PM +0100, Jürgen Groß wrote:
> > > > > On 17.02.20 12:49, Julien Grall wrote:
> > > > > > Hi Juergen,
> > > > > > 
> > > > > > On 17/02/2020 07:20, Juergen Gross wrote:
> > > > > > > +void rcu_barrier(void)
> > > > > > >     {
> > > > > > > -    atomic_t cpu_count = ATOMIC_INIT(0);
> > > > > > > -    return stop_machine_run(rcu_barrier_action, &cpu_count, 
> > > > > > > NR_CPUS);
> > > > > > > +    if ( !atomic_cmpxchg(&cpu_count, 0, num_online_cpus()) )
> > > > > > 
> > > > > > What does prevent the cpu_online_map to change under your feet?
> > > > > > Shouldn't you grab the lock via get_cpu_maps()?
> > > > > 
> > > > > Oh, indeed.
> > > > > 
> > > > > This in turn will require a modification of the logic to detect 
> > > > > parallel
> > > > > calls on multiple cpus.
> > > > 
> > > > If you pick my patch to turn that into a rw lock you shouldn't worry
> > > > about parallel calls I think, but the lock acquisition can still fail
> > > > if there's a CPU plug/unplug going on:
> > > > 
> > > > https://lists.xenproject.org/archives/html/xen-devel/2020-02/msg00940.html
> > > 
> > > Thanks, but letting rcu_barrier() fail is a no go, so I still need to
> > > handle that case (I mean the case failing to get the lock). And handling
> > > of parallel calls is not needed to be functional correct, but to avoid
> > > not necessary cpu synchronization (each parallel call detected can just
> > > wait until the master has finished and then return).
> > > 
> > > BTW - The recursive spinlock today would allow for e.g. rcu_barrier() to
> > > be called inside a CPU plug/unplug section. Your rwlock is removing that
> > > possibility. Any chance that could be handled?
> > 
> > While this might be interesting for the rcu stuff, it certainly isn't
> > for other pieces also relying on the cpu maps lock.
> > 
> > Ie: get_cpu_maps must fail when called by send_IPI_mask if there's a
> > CPU plug/unplug operation going on, even if it's on the same pCPU
> > that's holding the lock in write mode.
> 
> Sure? How is cpu_down() working then?

send_IPI_mask failing to acquire the cpu maps lock prevents it from
using the APIC shorthand, which is what we want in that case.

> It is calling stop_machine_run()
> which is using send_IPI_mask()...

Xen should avoid using the APIC shorthand in that case, which I don't
think it's happening now, as the lock is recursive.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.