[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v4 6/6] x86/microcode: Synchronize late microcode loading



On Thu, Nov 29, 2018 at 12:43:25PM +0800, Chao Gao wrote:
> On Wed, Nov 28, 2018 at 04:22:09PM +0100, Roger Pau Monné wrote:
> >On Wed, Nov 28, 2018 at 01:34:16PM +0800, Chao Gao wrote:
> >> This patch ports microcode improvement patches from linux kernel.
> >> 
> >> Before you read any further: the early loading method is still the
> >> preferred one and you should always do that. The following patch is
> >> improving the late loading mechanism for long running jobs and cloud use
> >> cases.
> >> 
> >> Gather all cores and serialize the microcode update on them by doing it
> >> one-by-one to make the late update process as reliable as possible and
> >> avoid potential issues caused by the microcode update.
> >> 
> >> Signed-off-by: Chao Gao <chao.gao@xxxxxxxxx>
> >> Tested-by: Chao Gao <chao.gao@xxxxxxxxx>
> >> [linux commit: a5321aec6412b20b5ad15db2d6b916c05349dbff]
> >> [linux commit: bb8c13d61a629276a162c1d2b1a20a815cbcfbb7]
> >
> >If this patch is the squash of two Linux commits, please post the
> >ported versions of the two commits separately.
> 
> I don't understand this one.

You reference two Linux commits above, why is this done?

I assume this is because you are porting two Linux commits to Xen, in
which case I think that should be done in two different patches, or a
note needs to be added to why you merge two Linux commits into a
single Xen patch.

> >> @@ -311,13 +350,45 @@ int 
> >> microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
> >>      if ( ret <= 0 )
> >>      {
> >>          printk("No valid or newer microcode found. Update abort!\n");
> >> -        return -EINVAL;
> >> +        ret = -EINVAL;
> >> +        goto put;
> >>      }
> >>  
> >> -    info->error = 0;
> >> -    info->cpu = cpumask_first(&cpu_online_map);
> >> +    atomic_set(&info->cpu_in, 0);
> >> +    atomic_set(&info->cpu_out, 0);
> >> +
> >> +    /* Calculate the number of online CPU core */
> >> +    nr_cores = 0;
> >> +    for_each_online_cpu(cpu)
> >> +        if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
> >> +            nr_cores++;
> >> +
> >> +    printk("%d cores are to update its microcode\n", nr_cores);
> >>  
> >> -    return continue_hypercall_on_cpu(info->cpu, do_microcode_update, 
> >> info);
> >> +    /*
> >> +     * We intend to disable interrupt for long time, which may lead to
> >> +     * watchdog timeout.
> >> +     */
> >> +    watchdog_disable();
> >> +    /*
> >> +     * Late loading dance. Why the heavy-handed stop_machine effort?
> >> +     *
> >> +     * - HT siblings must be idle and not execute other code while the 
> >> other
> >> +     *   sibling is loading microcode in order to avoid any negative
> >> +     *   interactions cause by the loading.
> >
> >Well, the HT siblings will be executing code, since they are in a
> >while loop waiting for the non-siblings cores to finish updating.
> 
> Strictly speaking, you are right. The 'idle' I think means no other
> workload on the cpu except microcode loading (for a HT sibling which
> isn't chosen to do the update, means waiting for the completion of
> the other sibling).

Could you clarify the comment then?

By workload you mean that no other microcode loading should be
attempted from a HT sibling?

Is there a set of instructions or functionality that cannot be used by
HT siblings while performing a microcode load?

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.