[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH] x86/idle: prevent entering C6 with in service interrupts on Intel
On 07.05.2020 15:22, Roger Pau Monne wrote: > Apply a workaround for Intel errata CLX30: "A Pending Fixed Interrupt > May Be Dispatched Before an Interrupt of The Same Priority Completes". > > It's not clear which models are affected, as the errata is listed in > the "Second Generation Intel Xeon Scalable Processors" specification > update, but the issue has been seen as far back as Nehalem processors. > Apply the workaround to all Intel processors, the condition can be > relaxed later. > > Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx> > --- > docs/misc/xen-command-line.pandoc | 8 ++++++++ > xen/arch/x86/acpi/cpu_idle.c | 22 +++++++++++++++++++++- > xen/arch/x86/cpu/mwait-idle.c | 3 +++ > xen/include/asm-x86/cpuidle.h | 1 + > 4 files changed, 33 insertions(+), 1 deletion(-) > > diff --git a/docs/misc/xen-command-line.pandoc > b/docs/misc/xen-command-line.pandoc > index ee12b0f53f..6e868a2185 100644 > --- a/docs/misc/xen-command-line.pandoc > +++ b/docs/misc/xen-command-line.pandoc > @@ -652,6 +652,14 @@ Specify the size of the console debug trace buffer. By > specifying `cpu:` > additionally a trace buffer of the specified size is allocated per cpu. > The debug trace feature is only enabled in debugging builds of Xen. > > +### disable-c6-isr > +> `= <boolean>` > + > +> Default: `true for Intel CPUs` > + > +Workaround for Intel errata CLX30. Prevent entering C6 idle states with in > +service local APIC interrupts. Enabled by default for all Intel CPUs. > + > ### dma_bits > > `= <integer>` > > diff --git a/xen/arch/x86/acpi/cpu_idle.c b/xen/arch/x86/acpi/cpu_idle.c > index b83446e77d..5023fea148 100644 > --- a/xen/arch/x86/acpi/cpu_idle.c > +++ b/xen/arch/x86/acpi/cpu_idle.c > @@ -573,6 +573,25 @@ static bool errata_c6_eoi_workaround(void) > return (fix_needed && cpu_has_pending_apic_eoi()); > } > > +static int8_t __read_mostly disable_c6_isr = -1; > +boolean_param("disable-c6-isr", disable_c6_isr); > + > +/* > + * Errata CLX30: A Pending Fixed Interrupt May Be Dispatched Before an > + * Interrupt of The Same Priority Completes. > + * > + * Prevent entering C6 if there are pending lapic interrupts, or else the > + * processor might dispatch further pending interrupts before the first one > has > + * been completed. > + */ > +bool errata_c6_isr_workaround(void) > +{ > + if ( unlikely(disable_c6_isr == -1) ) > + disable_c6_isr = boot_cpu_data.x86_vendor == X86_VENDOR_INTEL; > + > + return disable_c6_isr && cpu_has_pending_apic_eoi(); This check being the same as in errata_c6_eoi_workaround(), why don't you simply extend that function? (See also below.) Also both variable and command line option descriptor could go inside the function, to limit their scopes. > @@ -676,7 +695,8 @@ static void acpi_processor_idle(void) > return; > } > > - if ( (cx->type == ACPI_STATE_C3) && errata_c6_eoi_workaround() ) > + if ( (cx->type == ACPI_STATE_C3) && > + (errata_c6_eoi_workaround() || errata_c6_isr_workaround()) ) > cx = power->safe_state; I realize you only add to existing code, but I'm afraid this existing code cannot be safely added to. Already prior to your change there was a curious mismatch of C3 and c6 on this line, and I don't see how ACPI_STATE_C3 correlates with "core C6" state. Now this may have been the convention for Nehalem/Westmere systems, but already the mwait-idle entries for these CPU models have 4 entries (albeit such that they match this scheme). As a result I think this at the very least needs to be >=, not ==, even more so now that you want to cover all Intel CPUs. Obviously (I think) it is a mistake that mwait-idle doesn't also activate this workaround, which imo is another reason to stick to just errata_c6_eoi_workaround(). > --- a/xen/arch/x86/cpu/mwait-idle.c > +++ b/xen/arch/x86/cpu/mwait-idle.c > @@ -770,6 +770,9 @@ static void mwait_idle(void) > return; > } > > + if (cx->type == ACPI_STATE_C3 && errata_c6_isr_workaround()) > + cx = power->safe_state; Here it becomes even more relevant I think that == not be used, as the static tables list deeper C-states; it's just that the SKX table, which also gets used for CLX afaict, has no entries beyond C6-SKX. Others with deeper ones presumably have the deeper C-states similarly affected (or not) by this erratum. For reference, mwait_idle_cpu_init() has hint = flg2MWAIT(cpuidle_state_table[cstate].flags); state = MWAIT_HINT2CSTATE(hint) + 1; ... cx->type = state; i.e. the value you compare against is derived from the static table entries. For Nehalem/Westmere this means that what goes under ACPI_STATE_C3 is indeed C6-NHM, and this looks to match for all the non-Atoms, but for none of the Atoms. Now Atoms could easily be unaffected, but (just to take an example) if C6-SNB was affected, surely C7-SNB would be, too. Jan
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |