Re: [PATCH v3 1/2] x86/mwait-idle: enable interrupts before C1 on Xeons
On Thu, Jan 27, 2022 at 04:13:21PM +0100, Jan Beulich wrote:
> From: Artem Bityutskiy <artem.bityutskiy@xxxxxxxxxxxxxxx>
>
> Enable local interrupts before requesting C1 on the last two generations
> of Intel Xeon platforms: Sky Lake, Cascade Lake, Cooper Lake, Ice Lake.
> This decreases average C1 interrupt latency by about 5-10%, as measured
> with the 'wult' tool.
>
> The '->enter()' function of the driver enters C-states with local
> interrupts disabled by executing the 'monitor' and 'mwait' pair of
> instructions. If an interrupt happens, the CPU exits the C-state and
> continues executing instructions after 'mwait'. It does not jump to
> the interrupt handler, because local interrupts are disabled. The
> cpuidle subsystem enables interrupts a bit later, after doing some
> housekeeping.
>
> With this patch, we enable local interrupts before requesting C1. In
> this case, if the CPU wakes up because of an interrupt, it will jump
> to the interrupt handler right away. The cpuidle housekeeping will be
> done after the pending interrupt(s) are handled.
>
> Enabling interrupts before entering a C-state has measurable impact
> for faster C-states, like C1. Deeper, but slower C-states like C6 do
> not really benefit from this sort of change, because their latency is
> a lot higher compared to the delay added by cpuidle housekeeping.
>
> This change was also tested with cyclictest and dbench. In case of Ice
> Lake, the average cyclictest latency decreased by 5.1%, and the average
> 'dbench' throughput increased by about 0.8%. Both tests were run for 4
> hours with only C1 enabled (all other idle states, including 'POLL',
> were disabled). CPU frequency was pinned to HFM, and uncore frequency
> was pinned to the maximum value. The other platforms had similar
> single-digit percentage improvements.
>
> It is worth noting that this patch affects 'cpuidle' statistics a tiny
> bit. Before this patch, C1 residency did not include the interrupt
> handling time, but with this patch, it will include it. This is similar
> to what happens in case of the 'POLL' state, which also runs with
> interrupts enabled.
>
> Suggested-by: Len Brown <len.brown@xxxxxxxxx>
> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@xxxxxxxxxxxxxxx>
> [Linux commit: c227233ad64c77e57db738ab0e46439db71822a3]
>
> We don't have a pointer into cpuidle_state_table[] readily available.
> To compensate, propagate the flag into struct acpi_processor_cx.
>
> Unlike Linux we want to
> - disable IRQs again after MWAITing, as subsequently invoked functions
> assume so,
> - avoid enabling IRQs if cstate_restore_tsc() is not a no-op, to avoid
> interfering with, in particular, the time rendezvous.
>
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
Acked-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> ---
> RFC: I'm not entirely certain that we want to take this, i.e. whether
> we're as much worried about interrupt latency.
I would assume taking this would make it easier for you to pick up
further patches.
> RFC: I was going back and forth between putting the local_irq_enable()
> ahead of or after cpu_is_haltable().
> ---
> v3: Propagate flag to struct acpi_processor_cx. Don't set flag when TSC
> may stop while in a C-state.
> v2: New.
>
> --- a/xen/arch/x86/cpu/mwait-idle.c
> +++ b/xen/arch/x86/cpu/mwait-idle.c
> @@ -108,6 +108,11 @@ static const struct cpuidle_state {
>
> #define CPUIDLE_FLAG_DISABLED 0x1
> /*
> + * Enable interrupts before entering the C-state. On some platforms and for
> + * some C-states, this may measurably decrease interrupt latency.
> + */
> +#define CPUIDLE_FLAG_IRQ_ENABLE 0x8000
> +/*
> * Set this flag for states where the HW flushes the TLB for us
> * and so we don't need cross-calls to keep it consistent.
> * If this flag is set, SW flushes the TLB, so even if the
> @@ -539,7 +544,7 @@ static struct cpuidle_state __read_mostl
> static struct cpuidle_state __read_mostly skx_cstates[] = {
> {
> .name = "C1",
> - .flags = MWAIT2flg(0x00),
> + .flags = MWAIT2flg(0x00) | CPUIDLE_FLAG_IRQ_ENABLE,
> .exit_latency = 2,
> .target_residency = 2,
> },
> @@ -561,7 +566,7 @@ static struct cpuidle_state __read_mostl
> static const struct cpuidle_state icx_cstates[] = {
> {
> .name = "C1",
> - .flags = MWAIT2flg(0x00),
> + .flags = MWAIT2flg(0x00) | CPUIDLE_FLAG_IRQ_ENABLE,
> .exit_latency = 1,
> .target_residency = 1,
> },
> @@ -842,9 +847,15 @@ static void mwait_idle(void)
>
> update_last_cx_stat(power, cx, before);
>
> - if (cpu_is_haltable(cpu))
> + if (cpu_is_haltable(cpu)) {
> + if (cx->irq_enable_early)
> + local_irq_enable();
> +
> mwait_idle_with_hints(cx->address, MWAIT_ECX_INTERRUPT_BREAK);
>
> + local_irq_disable();
> + }
> +
> after = alternative_call(cpuidle_get_tick);
>
> cstate_restore_tsc();
> @@ -1335,6 +1346,11 @@ static int mwait_idle_cpu_init(struct no
> cx->latency = cpuidle_state_table[cstate].exit_latency;
> cx->target_residency =
> cpuidle_state_table[cstate].target_residency;
> + if ((cpuidle_state_table[cstate].flags &
> + CPUIDLE_FLAG_IRQ_ENABLE) &&
> + /* cstate_restore_tsc() needs to be a no-op */
> + boot_cpu_has(X86_FEATURE_NONSTOP_TSC))
> + cx->irq_enable_early = true;
>
> dev->count++;
> }
> --- a/xen/include/xen/cpuidle.h
> +++ b/xen/include/xen/cpuidle.h
> @@ -42,6 +42,7 @@ struct acpi_processor_cx
> u8 idx;
> u8 type; /* ACPI_STATE_Cn */
> u8 entry_method; /* ACPI_CSTATE_EM_xxx */
> + bool irq_enable_early;
Should you use a bit field here and limit it to :1, in
anticipation of more flags being added at a later point?
Thanks, Roger.