Xen project Mailing List

Re: [Xen-devel] [PATCH v9 15/15] microcode: block #NMI handling when loading an ucode

Date: Fri, 30 Aug 2019 09:30:54 +0200

Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx, Ashok Raj <ashok.raj@xxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>

Delivery-date: Fri, 30 Aug 2019 07:31:12 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 30.08.2019 08:33, Chao Gao wrote:
> On Thu, Aug 29, 2019 at 02:22:47PM +0200, Jan Beulich wrote:
>> On 19.08.2019 03:25, Chao Gao wrote:
>>> @@ -481,12 +478,28 @@ static int do_microcode_update(void *patch)
>>>      return ret;
>>>  }
>>>
>>> +static int microcode_nmi_callback(const struct cpu_user_regs *regs, int cpu)
>>> +{
>>> +    /* The first thread of a core is to load an update. Don't block it. */
>>> +    if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) ||
>>> +         loading_state != LOADING_CALLIN )
>>> +        return 0;
>>> +
>>> +    cpumask_set_cpu(cpu, &cpu_callin_map);
>>> +
>>> +    while ( loading_state != LOADING_EXIT )
>>> +        cpu_relax();
>>> +
>>> +    return 0;
>>> +}
>>
>> By returning 0 you tell do_nmi() to continue processing the NMI.
>> Since you can't tell whether a non-IPI NMI has surfaced at about
>> the same time this is generally the right thing imo, but how do
>> you prevent unknown_nmi_error() from getting entered when do_nmi()
>> ends up setting handle_unknown to true? (The question is mostly
>> rhetorical, but there's a disconnect between do_nmi() checking
>> "cpu == 0" and the control thread running on
>> cpumask_first(&cpu_online_map), i.e. you introduce a well hidden
>> dependency on CPU 0 never going offline. IOW my request is to at
>> least make this less well hidden, such that it can be noticed if
>> and when someone endeavors to remove said limitation.)
>
> Seems the issue is that we couldn't send IPI NMI to BSP, otherwise
> unknown_nmi_error() would be trigger. And loading ucode after
> rendezvousing all CPUs in NMI handler expects all CPUs to receive IPI
> NMI. So this approach always has such issue.

Not really, I don't think: If both sides agreed (explicitly!) on which
CPU leads this effort, then it would be clear that the one CPU
handling NMIs coming from the platform should not be sent an NMI, and
hence it should be this one to lead the effort. FAOD - my remark
really was because of the new hidden(!) dependency you introduce on
CPU 0 always being this "special" CPU. I don't expect you to change
the code, but I'd like you to make the currently hidden dependency
explicit.

> Considering self_nmi is called at another place, could we provide a
> way to temporarily suppress or (force) ignore unknown nmi error?

I'm afraid any attempt at doing so will leave room for missing an
actual (platform) NMI.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel
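
For illustration only, here is a minimal standalone sketch of what "making the dependency explicit" could look like. It is not Xen code and not part of the patch. PLATFORM_NMI_CPU, cpumask_model_t, mask_first(), leader_is_platform_nmi_cpu() and should_send_rendezvous_nmi() are all hypothetical names, and the 64-bit mask merely models cpu_online_map. The idea is that the CPU handling platform NMIs is named in one place, the leader choice is checked against it, and that CPU is never a target of the rendezvous NMI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: exactly one CPU handles NMIs coming from the platform.
 * In the thread this is the CPU that do_nmi() checks for with "cpu == 0".
 * Naming it here is what turns the hidden dependency into an explicit one.
 */
#define PLATFORM_NMI_CPU 0u

/* Hypothetical model of a CPU mask: bit n set means CPU n is online. */
typedef uint64_t cpumask_model_t;

/* Models cpumask_first(): returns the lowest-numbered CPU in the mask. */
static unsigned int mask_first(cpumask_model_t mask)
{
    unsigned int cpu = 0;

    assert(mask != 0); /* an empty mask has no first CPU */

    while ( !(mask & 1) )
    {
        mask >>= 1;
        cpu++;
    }

    return cpu;
}

/*
 * The control thread runs on the first online CPU. This check fails as
 * soon as that CPU is no longer the one handling platform NMIs, e.g.
 * because CPU 0 went offline.
 */
static bool leader_is_platform_nmi_cpu(cpumask_model_t online)
{
    return mask_first(online) == PLATFORM_NMI_CPU;
}

/*
 * The CPU handling platform NMIs leads the effort and must not be sent
 * the rendezvous NMI; every other online CPU (cpu < 64 assumed) is.
 */
static bool should_send_rendezvous_nmi(unsigned int cpu,
                                       cpumask_model_t online)
{
    return ((online >> cpu) & 1) && cpu != PLATFORM_NMI_CPU;
}
```

In real code a check along the lines of leader_is_platform_nmi_cpu() would presumably sit next to the place where the control thread is selected, so that anyone lifting the "CPU 0 never goes offline" limitation runs into it.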
