[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] x86/flush: use APIC ALLBUT destination shorthand when possible


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Fri, 3 Jan 2020 13:34:09 +0100
  • Authentication-results: esa2.hc3370-68.iphmx.com; dkim=none (message not signed) header.i=none; spf=None smtp.pra=roger.pau@xxxxxxxxxx; spf=Pass smtp.mailfrom=roger.pau@xxxxxxxxxx; spf=None smtp.helo=postmaster@xxxxxxxxxxxxxxx
  • Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx, Wei Liu <wl@xxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Delivery-date: Fri, 03 Jan 2020 12:34:37 +0000
  • Ironport-sdr: xwl6nRZo2x/gxTcO/aL5yNSBdVamV22zW0QvTImsbC20CN9x4aSCOIDrzgM5B2eLzCTjDxn50/ obI7MjRi8udhUMU07sZTeGqh4UzkSr3Yno8FCc0VAnQ5uUEpBfMPwlOlr5keIrkD4y/8TsLGjU LeOL5OdDFiweaSfEkYr4Vc6XQU83eMWOBI5q4q0roNJLseC7G2s1NkgkDfTx338G9bhYgmjY5A hB7ndYrLSsCk9EsnZsGpwFvTBARCSEIIkM13LaHqS7IRc8jMgjzUiyZyTLJCXr8C/RWZiDCWhb 9PM=
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Fri, Jan 03, 2020 at 01:08:20PM +0100, Jan Beulich wrote:
> On 24.12.2019 13:44, Roger Pau Monne wrote:
> > @@ -227,14 +233,47 @@ void flush_area_mask(const cpumask_t *mask, const 
> > void *va, unsigned int flags)
> >      if ( (flags & ~FLUSH_ORDER_MASK) &&
> >           !cpumask_subset(mask, cpumask_of(cpu)) )
> >      {
> > +        bool cpus_locked = false;
> > +
> >          spin_lock(&flush_lock);
> >          cpumask_and(&flush_cpumask, mask, &cpu_online_map);
> >          cpumask_clear_cpu(cpu, &flush_cpumask);
> >          flush_va      = va;
> >          flush_flags   = flags;
> > -        send_IPI_mask(&flush_cpumask, INVALIDATE_TLB_VECTOR);
> > +
> > +        /*
> > +         * Prevent any CPU hot{un}plug while sending the IPIs if we are to 
> > use
> > +         * a shorthand, also refuse to use a shorthand if not all CPUs are
> > +         * online or have been parked.
> > +         */
> > +        if ( system_state > SYS_STATE_smp_boot && !cpu_overflow &&
> > +             (cpus_locked = get_cpu_maps()) &&
> > +             (park_offline_cpus ||
> 
> Why is it relevant whether we park offline CPUs, or whether we've
> even brought up all of the ones a system has? An IPI, in particular
> a broadcast one, shouldn't have any issue getting delivered if some
> of the nominal recipients don't listen, should it? (The use of
> cpu_online_map that was already there above is a sign - but not a
> proof, as it may itself be buggy - that the set of online CPUs
> fluctuating behind this function's back ought to not be a problem.)

I've tried it myself, and if not all CPUs are onlined when the
shorthand is used the box would just reboot. This matches the
description at:

https://lwn.net/Articles/793065/

Of the Linux shorthand implementation.

> Further a question on lock nesting: Since the commit message
> doesn't say anything in this regard, did you check there are no
> TLB flush invocations with the get_cpu_maps() lock held?

The CPU maps lock is a recursive one, so it should be fine to attempt
a TLB flush with the lock already held.

> Even if
> you did and even if there are none, I think the function should
> then get a comment attached to the effect of this lock order
> inversion risk. (For example, it isn't obvious to me that no user
> of stop_machine() would ever want to do any kind of TLB flushing.)
> 
> Overall I wonder whether your goal couldn't be achieved without
> the extra locking and without the special conditions.

Hm, this so far has worked fine on all the boxes that I've tried.
I'm happy to change it to a simpler approach, but I think the
conditions and locking are required for this to work correctly.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.