[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] x86/smp: use APIC ALLBUT destination shorthand when possible


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Thu, 16 Jan 2020 16:05:12 +0100
  • Authentication-results: esa5.hc3370-68.iphmx.com; dkim=none (message not signed) header.i=none; spf=None smtp.pra=roger.pau@xxxxxxxxxx; spf=Pass smtp.mailfrom=roger.pau@xxxxxxxxxx; spf=None smtp.helo=postmaster@xxxxxxxxxxxxxxx
  • Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx, Wei Liu <wl@xxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Delivery-date: Thu, 16 Jan 2020 15:05:53 +0000
  • Ironport-sdr: FaT+88Sp5+RBgknpAQIC9ycf3Ra575USBjhkdfYd/DW99mZRjZvoNVdcO1jkC5bsbnjFZN29xR ScgizHFiLpSEkJSUY8vDydGY/PwCrdOXW/qWI0ubOEfPMFoslgAPo9YLI8zxsj6XycMGTX2Ble y5+I/pE05y4/IxSBSdP9DKCW0ormAOGeRbSyAVLE3KLqriDm3WltP32sH+30KutSdEeAbdpzbZ 6gIqSxmu4U0kf80zvppVZX/4HPJNezfl94dfyl8PSFaskN3hNP/BgfdCxJn73XycQ6St5lnBuF w2M=
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Thu, Jan 16, 2020 at 11:44:51AM +0100, Jan Beulich wrote:
> On 09.01.2020 17:22, Roger Pau Monne wrote:
> > If the IPI destination mask matches the mask of online CPUs use the
> > APIC ALLBUT destination shorthand in order to send an IPI to all CPUs
> > on the system except the current one. This can only be safely used
> > when no CPU hotplug or unplug operations are taking place, no offline
> > CPUs or those have been onlined and parked and finally when all CPUs
> > in the system have been accounted for (ie: the number of CPUs doesn't
> > exceed NR_CPUS and APIC IDs are below MAX_APICS).
> > 
> > This is specially beneficial when using the PV shim, since using the
> > shorthand avoids performing an APIC register write (or multiple ones
> > if using xAPIC mode) for each destination when doing a global TLB
> > flush.
> > 
> > The lock time of flush_lock on a 32 vCPU guest using the shim without
> > the shorthand is:
> > 
> > Global lock flush_lock: addr=ffff82d0804b21c0, lockval=f602f602, not locked
> >   lock:228455938(79406065573135), block:205908580(556416605761539)
> > 
> > Average lock time: 347577ns
> > 
> > While the same guest using the shorthand:
> > 
> > Global lock flush_lock: addr=ffff82d0804b41c0, lockval=d9c4d9bc, cpu=12
> >   lock:1890775(416719148054), block:1663958(2500161282949)
> > 
> > Average lock time: 220395ns
> > 
> > Approximately a 1/3 improvement in the lock time.
> > 
> > Note that this requires locking the CPU maps (get_cpu_maps) which uses
> > a trylock. This is currently safe as all users of cpu_add_remove_lock
> > do a trylock, but will need reevaluating if non-trylock users appear.
> 
> All of this looks okay to me, but I have a number of mechanical
> (hopefully not too nitpicky) comments.
> 
> > Also there's some code movement of __prepare_ICR and
> > __default_send_IPI_shortcut, which is a non-functional change but I
> > didn't feel like it should be split to a separate patch.
> 
> This may better be split out - see below for why.

Done in a pre-patch.

> 
> > --- a/xen/arch/x86/acpi/boot.c
> > +++ b/xen/arch/x86/acpi/boot.c
> > @@ -103,6 +103,7 @@ acpi_parse_x2apic(struct acpi_subtable_header *header, 
> > const unsigned long end)
> >                            processor->lapic_flags & ACPI_MADT_ENABLED
> >                            ? KERN_WARNING "WARNING: " : KERN_INFO,
> >                            processor->local_apic_id, processor->uid);
> > +           cpu_overflow = true;
> 
> I don't think this is a good name. Seeing that we have "disabled_cpus",
> how about "unaccounted_cpus" or some such? (This could still be a boolean
> for the time being, if preferred to not be a true count.)

Done, left it as a boolean though.

> Thinking about it, what about the period of time between a CPU having
> got physically hot-added (and hence known at the hardware level) and
> it actually getting brought up for the first time? IOW - do you
> perhaps need to exclude use of the shortcut also when disabled_cpus
> is non-zero?

Yes, I guess there's no signal to Xen that a new CPU is going to be
physically hot-added, and hence as long as there are empty slots the
scenario you describe above is possible.

I don't think there's also anyway to figure out if a system is capable
of physical CPU hotplug, so as long as there are disabled_cpus we
shouldn't use the shortcut (that's kind of a shame, because I think
there are many systems reporting disabled CPUs in MADT).

> > @@ -23,6 +24,31 @@
> >  #include <irq_vectors.h>
> >  #include <mach_apic.h>
> >  
> > +static inline int __prepare_ICR(unsigned int shortcut, int vector)
> > +{
> > +    return APIC_DM_FIXED | shortcut | vector;
> > +}
> 
> I think __prepare_ICR2() should be moved together with it, and I also
> think movement like this should include correcting the name (by
> dropping at least one of the underscores). As per recent comments
> elsewhere "inline" may also want dropping at this occasion.

I've dropped both underscores and the inline attribute in a pre-patch.

> > +static void __default_send_IPI_shortcut(unsigned int shortcut, int vector,
> > +                                        unsigned int dest)
> 
> The renaming should go even farther here: There's nothing "default"
> about this function - there's not second, non-default implementation.
> Just like other similar functions it should just be
> send_IPI_shortcut().

Ack, done in the pre-patch.

> > @@ -30,7 +56,40 @@
> >  
> >  void send_IPI_mask(const cpumask_t *mask, int vector)
> >  {
> > -    alternative_vcall(genapic.send_IPI_mask, mask, vector);
> > +    bool cpus_locked = false;
> > +
> > +    /*
> > +     * Prevent any CPU hot{un}plug while sending the IPIs if we are to use
> > +     * a shorthand, also refuse to use a shorthand if not all CPUs are
> > +     * online or have been parked.
> > +     */
> > +    if ( system_state > SYS_STATE_smp_boot && !cpu_overflow &&
> > +         /* NB: get_cpu_maps lock requires enabled interrupts. */
> > +         local_irq_is_enabled() && (cpus_locked = get_cpu_maps()) &&
> > +         (park_offline_cpus ||
> > +          cpumask_equal(&cpu_online_map, &cpu_present_map)) )
> 
> On the first and second pass I thought this contradicts the description.
> To disambiguate (and assuming I understand it correctly), how about
> saying something like "This can only be safely used when no CPU hotplug
> or unplug operations are taking place, there are no offline CPUs (unless
> those have been onlined and parked) and finally ..."?

I'm not sure I understand what should come after the finally ...
Taking your suggestion I would write:

This can only be safely used when no CPU hotplug or unplug operations
are taking place, there are no offline CPUs (unless those have been
onlined and parked), there are no disabled CPUs and all possible CPUs
in the system have been accounted for.

> 
> > +    {
> > +        cpumask_copy(this_cpu(scratch_cpumask), &cpu_online_map);
> > +        cpumask_clear_cpu(smp_processor_id(), this_cpu(scratch_cpumask));
> 
>         cpumask_andnot(this_cpu(scratch_cpumask), &cpu_online_map,
>                        cpumask_of(smp_processor_id()));
> 
> ?
> 
> > +    }
> > +    else
> > +    {
> > +        if ( cpus_locked )
> > +        {
> > +            put_cpu_maps();
> > +            cpus_locked = false;
> > +        }
> > +        cpumask_clear(this_cpu(scratch_cpumask));
> 
> Is there a guarantee that the function won't be called with an
> empty mask? All backing functions look to gracefully deal with
> this case, yet ...
> 
> > +    }
> > +
> > +    if ( cpumask_equal(mask, this_cpu(scratch_cpumask)) )> +        
> > __default_send_IPI_shortcut(APIC_DEST_ALLBUT, vector,
> > +                                    APIC_DEST_PHYSICAL);
> 
> ... you'd make this an "all-but" message then. Adding a
> !cpumask_empty() check would seem a little expensive, so how
> about you copy cpumask_of(smp_processor_id()) above and add
> !cpumask_test_cpu(smp_processor_id(), ...) here?

A cpumask_empty check at the top of the function would be easier to
parse, but it could incur in more overhead, I've implemented what you
describe.

> > +    else
> > +        alternative_vcall(genapic.send_IPI_mask, mask, vector);
> 
> Is there no reason at all to also check here whether APIC_DEST_ALL
> could be used? Oh, I see - the backing functions all exclude the
> local CPU. I wonder why e.g. flush_area_mask() then clears the
> CPU off the mask it passes on. And with this behavior the
> single cpumask_equal() check above is too restrictive - you'll
> want to check whether mask matches the calculated all-but one or
> cpu_online_map. I.e. perhaps you want
> 
>         cpumask_or(this_cpu(scratch_cpumask), mask,
>                    cpumask_of(smp_processor_id()));
> 
> and then
> 
>     if ( cpumask_equal(this_cpu(scratch_cpumask), &cpu_online_map) )
> 
> ?

Oh, OK, in which case most of the comments above are moot if we go
this route. What I have now if I properly parsed your comments is:

    bool cpus_locked = false;
    cpumask_t *scratch = this_cpu(scratch_cpumask);

    /*
     * This can only be safely used when no CPU hotplug or unplug operations
     * are taking place, there are no offline CPUs (unless those have been
     * onlined and parked), there are no disabled CPUs and all possible CPUs in
     * the system have been accounted for.
     */
    if ( system_state > SYS_STATE_smp_boot &&
         !unaccounted_cpus && !disabled_cpus &&
         /* NB: get_cpu_maps lock requires enabled interrupts. */
         local_irq_is_enabled() && (cpus_locked = get_cpu_maps()) &&
         (park_offline_cpus ||
          cpumask_equal(&cpu_online_map, &cpu_present_map)) )
        cpumask_or(scratch, mask, cpumask_of(smp_processor_id()));
    else
    {
        if ( cpus_locked )
        {
            put_cpu_maps();
            cpus_locked = false;
        }
        cpumask_clear(scratch);
    }

    if ( cpumask_equal(scratch, &cpu_online_map) )
        send_IPI_shortcut(APIC_DEST_ALLBUT, vector, APIC_DEST_PHYSICAL);
    else
        alternative_vcall(genapic.send_IPI_mask, mask, vector);

    if ( cpus_locked )
        put_cpu_maps();

Can assert this matches your expectations? I think it fixes your
comments about empty masks and a mask containing the current pCPU ID.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.