
Re: [Xen-devel] [PATCH v8 16/16] microcode: block #NMI handling when loading an ucode

On 07.08.2019 09:59, Chao Gao wrote:
On Mon, Aug 05, 2019 at 12:11:01PM +0000, Jan Beulich wrote:
On 01.08.2019 12:22, Chao Gao wrote:
@@ -439,12 +440,37 @@ static int do_microcode_update(void *patch)
       return ret;
+static int microcode_nmi_callback(const struct cpu_user_regs *regs, int cpu)
+{
+    bool print = false;
+
+    /* The first thread of a core is to load an update. Don't block it. */
+    if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
+        return 0;
+    if ( loading_state == LOADING_ENTERED )
+    {
+        cpumask_set_cpu(cpu, &cpu_callin_map);
+        printk(XENLOG_DEBUG "CPU%u enters %s\n", smp_processor_id(), __func__);

Here and ...

+        print = true;
+    }
+    while ( loading_state == LOADING_ENTERED )
+        rep_nop();
+    if ( print )
+        printk(XENLOG_DEBUG "CPU%u exits %s\n", smp_processor_id(), __func__);

... here - why smp_processor_id() when you can use "cpu"? And what
use is __func__ here?

The rep_nop() above also presumably wants to be cpu_relax() again.

But on the whole I was really hoping for more aggressive disabling
of NMI handling, more like (but of course not quite as heavy as)
the crash path wiring the IDT entry to trap_nop().

I agree with you that it should be more aggressive. This patch is
problematic considering there is a lot of code before reaching this
callback (especially SPEC_CTRL_ENTRY_FROM_INTR_IST, which may write
to MSRs).
In my mind, we have two options to solve the issue:
1. Wire the IDT entry to trap_nop() like the crash path.

As said, this is not directly an option - at the very least a thread
should record the fact that there was an NMI, such that it can be
handled after the ucode update has completed.

2. Enhance this patch: a thread which is not going to load an update
is forced to send an NMI to itself to enter the callback, and then
busy-loops until loading of the ucode has completed on all cores.
With this method, no NMI delivery or MSR write would happen on these
threads during the microcode update.

This sounds doable at first glance; iirc the CPU would latch another
NMI while NMIs are blocked, i.e. while there has not yet been an IRET
after the last one was raised. There would still be some ambiguity in
case the self-NMI and an actual one arrived at about the same time,
but I guess we need to live with this small window.


Xen-devel mailing list