|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH] x86/flush: use APIC ALLBUT destination shorthand when possible
On Fri, Jan 03, 2020 at 01:08:20PM +0100, Jan Beulich wrote:
> On 24.12.2019 13:44, Roger Pau Monne wrote:
> > @@ -227,14 +233,47 @@ void flush_area_mask(const cpumask_t *mask, const
> > void *va, unsigned int flags)
> > if ( (flags & ~FLUSH_ORDER_MASK) &&
> > !cpumask_subset(mask, cpumask_of(cpu)) )
> > {
> > + bool cpus_locked = false;
> > +
> > spin_lock(&flush_lock);
> > cpumask_and(&flush_cpumask, mask, &cpu_online_map);
> > cpumask_clear_cpu(cpu, &flush_cpumask);
> > flush_va = va;
> > flush_flags = flags;
> > - send_IPI_mask(&flush_cpumask, INVALIDATE_TLB_VECTOR);
> > +
> > + /*
> > + * Prevent any CPU hot{un}plug while sending the IPIs if we are to
> > use
> > + * a shorthand, also refuse to use a shorthand if not all CPUs are
> > + * online or have been parked.
> > + */
> > + if ( system_state > SYS_STATE_smp_boot && !cpu_overflow &&
> > + (cpus_locked = get_cpu_maps()) &&
> > + (park_offline_cpus ||
>
> Why is it relevant whether we park offline CPUs, or whether we've
> even brought up all of the ones a system has? An IPI, in particular
> a broadcast one, shouldn't have any issue getting delivered if some
> of the nominal recipients don't listen, should it? (The use of
> cpu_online_map that was already there above is a sign - but not a
> proof, as it may itself be buggy - that the set of online CPUs
> fluctuating behind this function's back ought to not be a problem.)
I've tried it myself, and if not all CPUs are onlined when the
shorthand is used the box would just reboot. This matches the
description at:
https://lwn.net/Articles/793065/
Of the Linux shorthand implementation.
> Further a question on lock nesting: Since the commit message
> doesn't say anything in this regard, did you check there are no
> TLB flush invocations with the get_cpu_maps() lock held?
The CPU maps lock is a recursive one, so it should be fine to attempt
a TLB flush with the lock already held.
> Even if
> you did and even if there are none, I think the function should
> then get a comment attached to the effect of this lock order
> inversion risk. (For example, it isn't obvious to me that no user
> of stop_machine() would ever want to do any kind of TLB flushing.)
>
> Overall I wonder whether your goal couldn't be achieved without
> the extra locking and without the special conditions.
Hm, this so far has worked fine on all the boxes that I've tried.
I'm happy to change it to a simpler approach, but I think the
conditions and locking are required for this to work correctly.
Thanks, Roger.
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |