
Re: [Xen-devel] [PATCH v8 14/16] x86/microcode: Synchronize late microcode loading


  • To: Chao Gao <chao.gao@xxxxxxxxx>
  • From: Jan Beulich <JBeulich@xxxxxxxx>
  • Date: Mon, 5 Aug 2019 13:28:43 +0000
  • Cc: Kevin Tian <kevin.tian@xxxxxxxxx>, Ashok Raj <ashok.raj@xxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Jun Nakajima <jun.nakajima@xxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Thomas Gleixner <tglx@xxxxxxxxxxxxx>, Borislav Petkov <bp@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Delivery-date: Mon, 05 Aug 2019 13:30:55 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-topic: [PATCH v8 14/16] x86/microcode: Synchronize late microcode loading

On 05.08.2019 15:16, Chao Gao wrote:
> On Mon, Aug 05, 2019 at 10:43:16AM +0000, Jan Beulich wrote:
>> On 01.08.2019 12:22, Chao Gao wrote:
>>> @@ -283,37 +345,105 @@ static int microcode_update_cpu(const struct microcode_patch *patch)
>>>        return err;
>>>    }
>>>    
>>> -static long do_microcode_update(void *patch)
>>> +static int do_microcode_update(void *patch)
>>>    {
>>> -    unsigned int cpu;
>>> +    unsigned int cpu = smp_processor_id();
>>> +    unsigned int cpu_nr = num_online_cpus();
>>> +    int ret;
>>>    
>>> -    /* store the patch after a successful loading */
>>> -    if ( !microcode_update_cpu(patch) && patch )
>>> +    /* Mark that ucode loading is in progress */
>>> +    cmpxchg(&loading_state, LOADING_EXITED, LOADING_ENTERED);
>>
>> Why cmpxchg(), especially when you don't check whether the store
>> has actually happened? (Same further down then.)
> 
> loading_state is set to "LOADING_EXITED" by the caller. So the first CPU
> coming here would store "LOADING_ENTERED" to it. And we don't need
> special handling for the CPU that sets the state, so the return value
> isn't checked.
> 
> Here are my considerations about how to set the state:
> 1) We cannot set the state in the caller. Because we would use this
> state to block #NMI handling. Setting the state in the caller may
> break stop_machine_run mechanism with the patch 16.
> 
> 2) The first CPU reaching here should set the state (meaning we cannot
> choose one particular CPU, e.g. the BSP, to do it). With this, #NMI
> handling is disabled system-wide before any CPU calls in. Otherwise, if
> there is an exception on a CPU, it may still be in #NMI handling when its
> sibling thread starts loading a ucode.
> 
> 3) A simple assignment on every CPU is problematic in some cases.
> For example, if one CPU comes in after the other CPUs have timed out and
> left, it might set the state back to "LOADING_ENTERED" and be blocked in
> #NMI handling forever with patch 16.
> 
> That's why I chose to use cmpxchg().

But if the expected incoming state is _not_ the one actually found
in memory, then the cmpxchg() will silently degenerate to a no-op,
without anyone noticing other than the system misbehaving in an
unpredictable way.

On the whole, seeing your explanation above, there is merit to
using cmpxchg(). But at the very least this background needs to be
put in the description; perhaps better in a code comment next to
the variable definition.

> Regarding the cmpxchg() in error-handling below, it can be replaced by
> a simple assignment. But I am not sure whether it is better to use
> cmpxchg() considering cache line bouncing.

I don't think MOV would cause heavier cacheline bouncing than CMPXCHG.

>>> +    cpumask_set_cpu(cpu, &cpu_callin_map);
>>> +    ret = wait_for_condition(wait_cpu_callin, (void *)(unsigned long)cpu_nr,
>>> +                             MICROCODE_CALLIN_TIMEOUT_US);
>>> +    if ( ret )
>>>        {
>>> -        spin_lock(&microcode_mutex);
>>> -        microcode_update_cache(patch);
>>> -        spin_unlock(&microcode_mutex);
>>> -        patch = NULL;
>>> +        cmpxchg(&loading_state, LOADING_ENTERED, LOADING_ABORTED);
>>> +        return ret;
>>>        }
>>>    
>>> -    if ( microcode_ops->end_update )
>>> -        microcode_ops->end_update();
>>> +    /*
>>> +     * Load microcode update on only one logical processor per core, or in
>>> +     * AMD's term, one core per compute unit. The one with the lowest thread
>>> +     * id among all siblings is chosen to perform the loading.
>>> +     */
>>> +    if ( (cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu))) )
>>> +    {
>>> +        static unsigned int panicked = 0;
>>> +        bool monitor;
>>> +        unsigned int done;
>>> +        unsigned long tick = 0;
>>>    
>>> -    cpu = cpumask_next(smp_processor_id(), &cpu_online_map);
>>> -    if ( cpu < nr_cpu_ids )
>>> -        return continue_hypercall_on_cpu(cpu, do_microcode_update, patch);
>>> +        ret = microcode_ops->apply_microcode(patch);
>>> +        if ( !ret )
>>> +        {
>>> +            unsigned int cpu2;
>>>    
>>> -    /* Free the patch if no CPU has loaded it successfully. */
>>> -    if ( patch )
>>> -        microcode_free_patch(patch);
>>> +            atomic_inc(&cpu_updated);
>>> +            /* Propagate revision number to all siblings */
>>> +            for_each_cpu(cpu2, per_cpu(cpu_sibling_mask, cpu))
>>> +                per_cpu(cpu_sig, cpu2).rev = this_cpu(cpu_sig).rev;
>>> +        }
>>>    
>>> -    return 0;
>>> +        /*
>>> +         * The first CPU reaching here will monitor the progress and emit
>>> +         * warning message if the duration is too long (e.g. >1 second).
>>> +         */
>>> +        monitor = !atomic_inc_return(&cpu_out);
>>> +        if ( monitor )
>>> +            tick = rdtsc_ordered();
>>> +
>>> +        /* Wait for all cores or compute units to finish the update */
>>> +        done = atomic_read(&cpu_out);
>>> +        while ( panicked && done != nr_cores )
>>
>> Don't you mean "!panicked" here, or || instead of && ? Otherwise I don't
>> see how the loop would ever be entered.
> 
> Sorry. It should be !panicked.

I.e. you mean to exit the loop on all CPUs once one of them has
invoked panic()? I did append the alternative for a reason - it
seems to me that keeping CPUs in this loop after a panic() might
be better.

Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel
