[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH] x86/idle: prevent entering C6 with in service interrupts on Intel
On Fri, May 08, 2020 at 03:46:08PM +0200, Jan Beulich wrote: > On 07.05.2020 15:22, Roger Pau Monne wrote: > > diff --git a/xen/arch/x86/acpi/cpu_idle.c b/xen/arch/x86/acpi/cpu_idle.c > > index b83446e77d..5023fea148 100644 > > --- a/xen/arch/x86/acpi/cpu_idle.c > > +++ b/xen/arch/x86/acpi/cpu_idle.c > > @@ -573,6 +573,25 @@ static bool errata_c6_eoi_workaround(void) > > return (fix_needed && cpu_has_pending_apic_eoi()); > > } > > > > +static int8_t __read_mostly disable_c6_isr = -1; > > +boolean_param("disable-c6-isr", disable_c6_isr); > > + > > +/* > > + * Errata CLX30: A Pending Fixed Interrupt May Be Dispatched Before an > > + * Interrupt of The Same Priority Completes. > > + * > > + * Prevent entering C6 if there are pending lapic interrupts, or else the > > + * processor might dispatch further pending interrupts before the first > > one has > > + * been completed. > > + */ > > +bool errata_c6_isr_workaround(void) > > +{ > > + if ( unlikely(disable_c6_isr == -1) ) > > + disable_c6_isr = boot_cpu_data.x86_vendor == X86_VENDOR_INTEL; > > + > > + return disable_c6_isr && cpu_has_pending_apic_eoi(); > > This check being the same as in errata_c6_eoi_workaround(), > why don't you simply extend that function? (See also below.) > Also both variable and command line option descriptor could > go inside the function, to limit their scopes. Since this is actually a superset (as it covers all Intel CPUs), I should delete the previous (more restricted) workaround matching logic. > > @@ -676,7 +695,8 @@ static void acpi_processor_idle(void) > > return; > > } > > > > - if ( (cx->type == ACPI_STATE_C3) && errata_c6_eoi_workaround() ) > > + if ( (cx->type == ACPI_STATE_C3) && > > + (errata_c6_eoi_workaround() || errata_c6_isr_workaround()) ) > > cx = power->safe_state; > > I realize you only add to existing code, but I'm afraid this > existing code cannot be safely added to. Already prior to > your change there was a curious mismatch of C3 and c6 on this > line, and I don't see how ACPI_STATE_C3 correlates with > "core C6" state. Now this may have been the convention for > Nehalem/Westmere systems, but already the mwait-idle entries > for these CPU models have 4 entries (albeit such that they > match this scheme). As a result I think this at the very > least needs to be >=, not ==, even more so now that you want > to cover all Intel CPUs. Hm, I think this is because AFAICT acpi_processor_idle only understands up to ACPI_STATE_C3, passing a type > ACPI_STATE_C3 will just cause it to fallback to C0. I've adjusted the comparison to use >= instead, as a safety imporvement in case we add support for more states to acpi_processor_idle. > Obviously (I think) it is a mistake that mwait-idle doesn't > also activate this workaround, which imo is another reason to > stick to just errata_c6_eoi_workaround(). I assumed the previous workaround didn't apply to any of the CPUs supported by the mwait-idle driver, since it seems to only support recent-ish models. > > --- a/xen/arch/x86/cpu/mwait-idle.c > > +++ b/xen/arch/x86/cpu/mwait-idle.c > > @@ -770,6 +770,9 @@ static void mwait_idle(void) > > return; > > } > > > > + if (cx->type == ACPI_STATE_C3 && errata_c6_isr_workaround()) > > + cx = power->safe_state; > > Here it becomes even more relevant I think that == not be > used, as the static tables list deeper C-states; it's just > that the SKX table, which also gets used for CLX afaict, has > no entries beyond C6-SKX. Others with deeper ones presumably > have the deeper C-states similarly affected (or not) by this > erratum. > > For reference, mwait_idle_cpu_init() has > > hint = flg2MWAIT(cpuidle_state_table[cstate].flags); > state = MWAIT_HINT2CSTATE(hint) + 1; > ... > cx->type = state; > > i.e. the value you compare against is derived from the static > table entries. For Nehalem/Westmere this means that what goes > under ACPI_STATE_C3 is indeed C6-NHM, and this looks to match > for all the non-Atoms, but for none of the Atoms. Now Atoms > could easily be unaffected, but (just to take an example) if > C6-SNB was affected, surely C7-SNB would be, too. Yes, I've adjusted this to use cx->type >= 3 instead. I have to admit I'm confused by the name of the states being C6-* while the cstate value reported by Xen will be 3 (I would instead expect the type to be 6 in order to match the name). Thanks, Roger.
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |