[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v2 5/6] x86/smp: use a dedicated scratch cpumask in send_IPI_mask
On Tue, Feb 18, 2020 at 01:29:56PM +0000, Andrew Cooper wrote: > On 18/02/2020 11:46, Roger Pau Monné wrote: > > On Tue, Feb 18, 2020 at 11:35:37AM +0000, Andrew Cooper wrote: > >> > >> On 18/02/2020 11:22, Roger Pau Monné wrote: > >>> On Tue, Feb 18, 2020 at 11:21:12AM +0000, Andrew Cooper wrote: > >>>> On 18/02/2020 11:10, Roger Pau Monné wrote: > >>>>> On Tue, Feb 18, 2020 at 10:53:45AM +0000, Andrew Cooper wrote: > >>>>>> On 17/02/2020 18:43, Roger Pau Monne wrote: > >>>>>>> @@ -67,7 +68,20 @@ static void send_IPI_shortcut(unsigned int > >>>>>>> shortcut, int vector, > >>>>>>> void send_IPI_mask(const cpumask_t *mask, int vector) > >>>>>>> { > >>>>>>> bool cpus_locked = false; > >>>>>>> - cpumask_t *scratch = this_cpu(scratch_cpumask); > >>>>>>> + cpumask_t *scratch = this_cpu(send_ipi_cpumask); > >>>>>>> + unsigned long flags; > >>>>>>> + > >>>>>>> + if ( in_mc() || in_nmi() ) > >>>>>>> + { > >>>>>>> + /* > >>>>>>> + * When in #MC or #MNI context Xen cannot use the per-CPU > >>>>>>> scratch mask > >>>>>>> + * because we have no way to avoid reentry, so do not use > >>>>>>> the APIC > >>>>>>> + * shorthand. > >>>>>>> + */ > >>>>>>> + alternative_vcall(genapic.send_IPI_mask, mask, vector); > >>>>>>> + return; > >>>>>> The set of things you can safely do in an NMI/MCE handler is small, and > >>>>>> does not include sending IPIs. (In reality, if you're using x2apic, it > >>>>>> is safe to send an IPI because there is no risk of clobbering ICR2 > >>>>>> behind your outer context's back). > >>>>>> > >>>>>> However, if we escalate from NMI/MCE context into crash context, then > >>>>>> anything goes. In reality, we only ever send NMIs from the crash path, > >>>>>> and that is not permitted to use a shorthand, making this code dead. > >>>>> This was requested by Jan, as safety measure > >>>> That may be, but it doesn't mean it is correct. If execution ever > >>>> enters this function in NMI/MCE context, there is a real, > >>>> state-corrupting bug, higher up the call stack. > >>> Ack, then I guess we should just BUG() here if ever called from #NMI > >>> or #MC context? > >> Well. There is a reason I suggested removing it, and not using BUG(). > >> > >> If NMI/MCE context escalates to crash context, we do need to send NMIs. > >> It won't be this function specifically, but it will be part of the > >> general IPI infrastructure. > >> > >> We definitely don't want to get into the game of trying to clobber each > >> of the state variables, so the only thing throwing BUG()'s around in > >> this area will do is make the crash path more fragile. > > I see, panicking in such context will just clobber the previous crash > > happened in NMI/MC context. > > > > So you would rather keep the current version of falling back to the > > usage of the non-shorthand IPI sending routine instead of panicking? > > > > What about: > > > > if ( in_mc() || in_nmi() ) > > { > > /* > > * When in #MC or #MNI context Xen cannot use the per-CPU scratch mask > > * because we have no way to avoid reentry, so do not use the APIC > > * shorthand. The only IPI that should be sent from such context > > * is a #NMI to shutdown the system in case of a crash. > > */ > > if ( vector == APIC_DM_NMI ) > > alternative_vcall(genapic.send_IPI_mask, mask, vector); > > else > > BUG(); > > > > return; > > } > > How do you intent to test it? > > It might be correct now[*] but it doesn't protect against someone > modifying code, violating the constraint, and this going unnoticed > because the above codepath will only be entered in exceptional > circumstances. Sods law says that code inside that block is first going > to be tested in a customer environment. > > ASSERT()s would be less bad, but any technical countermeasures, however > well intentioned, get in the way of the crash path functioning when it > matters most. OK, so what about: if ( in_mc() || in_nmi() ) { bool x2apic = current_local_apic_mode() == APIC_MODE_X2APIC; unsigned int icr2; /* * When in #MC or #MNI context Xen cannot use the per-CPU scratch mask * because we have no way to avoid reentry, so do not use the APIC * shorthand. The only IPI that should be sent from such context * is a #NMI to shutdown the system in case of a crash. */ ASSERT(vector == APIC_DM_NMI); if ( !x2apic ) icr2 = apic_read(APIC_ICR2); alternative_vcall(genapic.send_IPI_mask, mask, vector); if ( !x2apic ) apic_write(APIC_ICR2, icr2); return; } I'm unsure as to whether the assert is actually helpful, but would like to settle this before sending a new version. Thanks, Roger. _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |