
Re: [Xen-devel] Live-Patch application failure in core-scheduling mode

On 08.02.20 13:19, Andrew Cooper wrote:
On 07/02/2020 08:42, Jürgen Groß wrote:

Without it being entirely clear that there's no alternative to
it, I don't think I'd be fine with re-introduction of
continue_hypercall_on_cpu(0, ...) into ucode loading.

I don't see a viable alternative.

Sorry to interject in the middle of a conversation, but I'd like to make
something very clear.

continue_hypercall_on_cpu(0, ...) is, and has always been, fundamentally
broken for microcode updates.  It causes real crashes on real systems,
and that is why the mechanism was replaced.

Changing back to it is going to break customer systems.

It is necessary to have the full system quiesced in practice, because
for a given piece of microcode, we don't know whether it's a cross-thread
load (the common case which most people are familiar with), whether it
is a cross-core load (yes - it turns out this does exist - it
highlighted a bug in testing), and whether there is an uncore/pcode/etc
update included as well.

I haven't come across a cross-socket load yet (and it likely doesn't
exist, given some aspects of loading which I think would be prohibitive
in this case), but there really are systems where loading microcode on
core 0 will flush and reload the MSROMs on all other cores in the
package, under the feet of whatever else is going on there.  This
includes making things like MSR_SPEC_CTRL disappear transiently.

We don't necessarily need to use stop_machine(), or use it exactly like
we currently do, but we do need a global rendezvous.

Did you look at the patch?

It uses continue_hypercall_on_cpu(0, ...) to call stop_machine_run()
from a tasklet. So there is a global rendezvous. It's just the start
of the rendezvous which is moved into a tasklet. That's all.
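
To illustrate the shape of that argument, here is a minimal user-space
sketch, not Xen code. POSIX threads stand in for CPUs, and every name in
it (struct rendezvous, cpu_thread, run_rendezvous, NR_CPUS) is
hypothetical and chosen only for this example. The deferred work on
"CPU 0" merely starts the rendezvous; the critical update itself runs
only once every other "CPU" has parked, which is the property a global
rendezvous is meant to provide.

```c
/* User-space analogy only: POSIX threads stand in for CPUs. Not Xen code. */
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

#define NR_CPUS 4  /* hypothetical CPU count for the sketch */

struct rendezvous {
    atomic_int started;    /* set by the deferred work item on "CPU 0" */
    atomic_int parked;     /* CPUs that have stopped their normal work */
    atomic_int done;       /* set once the critical update has finished */
    int parked_at_update;  /* CPUs parked at the moment the update ran */
};

struct cpu_arg {
    struct rendezvous *r;
    int cpu;
};

static void *cpu_thread(void *p)
{
    struct cpu_arg *a = p;
    struct rendezvous *r = a->r;

    if (a->cpu == 0) {
        /* Deferred work on CPU 0: this only *starts* the rendezvous. */
        atomic_store(&r->started, 1);
    } else {
        /* Other CPUs carry on until they notice the request. */
        while (!atomic_load(&r->started))
            sched_yield();
    }

    /* Every CPU, including CPU 0, checks in. */
    atomic_fetch_add(&r->parked, 1);

    if (a->cpu == 0) {
        /* Wait until the whole system is quiesced. */
        while (atomic_load(&r->parked) != NR_CPUS)
            sched_yield();
        /* Stand-in for the critical update (e.g. a microcode load). */
        r->parked_at_update = atomic_load(&r->parked);
        atomic_store(&r->done, 1);
    } else {
        /* Stay parked until the update has completed. */
        while (!atomic_load(&r->done))
            sched_yield();
    }
    return NULL;
}

/* Runs one rendezvous; returns how many CPUs were parked during the update. */
int run_rendezvous(void)
{
    struct rendezvous r;
    struct cpu_arg args[NR_CPUS];
    pthread_t tid[NR_CPUS];

    atomic_init(&r.started, 0);
    atomic_init(&r.parked, 0);
    atomic_init(&r.done, 0);
    r.parked_at_update = 0;

    for (int i = 0; i < NR_CPUS; i++) {
        args[i].r = &r;
        args[i].cpu = i;
        int rc = pthread_create(&tid[i], NULL, cpu_thread, &args[i]);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < NR_CPUS; i++)
        pthread_join(tid[i], NULL);

    return r.parked_at_update;
}
```

In this sketch run_rendezvous() returns NR_CPUS, because the update step
cannot run before every thread has checked in. Whether the real tasklet
path in Xen preserves the same guarantee under core scheduling is exactly
the question being debated in this thread; the sketch does not settle it.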

