
Re: [Xen-devel] [PATCH] misc/xenmicrocode: Upload /lib/firmware/<some blob> to the hypervisor



On Thu, Jan 29, 2015 at 08:15:16PM +0100, Borislav Petkov wrote:
> On Thu, Jan 29, 2015 at 04:21:05AM +0100, Luis R. Rodriguez wrote:
> > How close?
> 
> As close as we can get but not closer - see the thing about updating
> microcode on Intel hyperthreaded logical cores in the other mail.
> 
> We probably can do it in parallel if needed. But it hasn't been needed
> until now.
> 
> > I've reviewed the implementation a bit more on the Xen side. For early boot
> > things look similar to what is done upstream on the kernel. For the run time
> > update here's what Xen does in detail, elaborating a bit more on Andrew's
> > summary of how it works.
> > 
> > The XENPF_microcode_update hypercall calls the general Xen
> > microcode_update(), which will do microcode_ops->start_update() (only
> > AMD has this op, and it does svm_host_osvw_reset()) and finally
> > continues the hypercall by calling do_microcode_update() on
> > cpumask_first(&cpu_online_map) *always*. The mechanism that Xen uses
> > to continue the hypercall is continue_hypercall_on_cpu(); if this
> > returns 0 then it is guaranteed to run *at some point in the future*
> > on the given CPU. If preemption is enabled this could also mean the
> > hypercall was preempted, and it can be preempted later on the other
> > CPU. This in turn will do the same call but on the next CPU, using
> > continuation, until it reaches the end of the CPU mask.
> > The do_microcode_update() call itself calls ops->cpu_request_microcode()
> > on each iteration, which in turn should also do ops->apply_microcode()
> > once a microcode buffer in the file that fits is found. The buffers are
> > kept in case of suspend / resume.
> 
> Yah, this is mostly fine except the preemption thing. If the guests get
> to see an inconsistent state with a subset of the cores updated and the
> rest not, then that is bad.
> 
> Not to mention the case when we have to late-update problematic
> microcode which has to happen in parallel on each core. I haven't seen
> one so far but we should be prepared.
> 
> > There is no tight loop here or locking of what other CPUs do while one
> > is running work to update microcode. Tons of things can happen in
> > between, so some refinements seem desirable, and likely this
> > implementation does differ quite significantly from the Linux kernel's
> > legacy 'rescan' interface.
> 
> Well, we're not very strict there either but that works so far. We'll
> change it if the need arises.
> 
> > Given this review, it seems folks should use xenmicrocode keeping in
> > mind the above algorithm, and support wise folks should be ready to
> > consider upgrades on microcode and possible issues / caveats from
> > vendors on a case by case basis.
> 
> Right.
> 
> > From what I gather some folks have even considered tainting kernels
> > when the sysfs rescan interface is used. I do wonder if this is worth
> > doing on Xen for this tool given the possible issues here... or am I
> > just paranoid about this? It seems like this might be a more severe
> > issue for Xen as-is.
> 
> That would not be unnecessary.
> 
> So, I would try to do the application of the microcode in the hypervisor
> as tight as possible. Maybe the hypercall could hand in the microcode
> blob only and the hypervisor can "schedule" an update later, after
> having frozen the guests.

OK I like this for now.

> In any case, we should strive for a parallel late update, simultaneously
> on each core with no interruption. The kernel doesn't do that either but
> it'll probably have to, one day.

OK.
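
To make "parallel, simultaneously on each core" a bit more concrete, here
is a toy rendezvous sketch in Python, with threads standing in for cores.
It is only an illustration of the idea: parallel_update() and the
revision list are made-up names, this is not Xen or kernel code, and
Python threads obviously do not model real CPUs.

```python
import threading

def parallel_update(revisions, new_rev):
    """Toy model: every "core" waits at a rendezvous, then all update.

    Hypothetical helper for illustration only; not Xen or Linux code.
    """
    n = len(revisions)
    barrier = threading.Barrier(n)

    def worker(cpu):
        barrier.wait()             # nobody starts until every core has arrived
        revisions[cpu] = new_rev   # stand-in for applying the microcode
        barrier.wait()             # nobody leaves until every core is done

    threads = [threading.Thread(target=worker, args=(cpu,)) for cpu in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return revisions

result = parallel_update([1, 1, 1, 1], 2)
```

The second barrier is the part that matters for guests: no core goes
back to normal work until all of them carry the new revision.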

> I don't know whether this is possible at all in xen and whether doing
> a simple sequential update method now and improving it later is easier
> than doing it right and in parallel from the get-go. I'm talking
> hypothetically here, I have no idea what actually is possible and doable
> in xen.

The implementation of the microcode update right now is done serially
by continuing the hypercall on each CPU. To do what you say seems
possible but would obviously require quite a few changes to
the existing solution.
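
As a rough illustration of why the serial approach can expose a mixed
state, here is a toy model in Python. It is not Xen code: ToyHost and
serial_update() are made-up names, and the loop only mimics the chain of
per-CPU continuations, recording what a guest could see between steps.

```python
from dataclasses import dataclass, field

@dataclass
class ToyHost:
    """Toy stand-in for a host; hypothetical, not Xen code."""
    revisions: list                                # microcode revision per online CPU
    observed: list = field(default_factory=list)   # states visible between steps

    def serial_update(self, new_rev):
        # Mimics the chain: start on the first online CPU, then "continue"
        # on the next one until the end of the CPU mask is reached.
        for cpu in range(len(self.revisions)):
            self.revisions[cpu] = new_rev
            # Nothing stops guests from running here, between two steps.
            self.observed.append(tuple(self.revisions))

host = ToyHost(revisions=[1, 1, 1, 1])
host.serial_update(2)
mixed = [s for s in host.observed if len(set(s)) > 1]
print(len(mixed))  # 3: every step but the last leaves a mix of old and new
```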

For now I've just tried quiescing the domains before doing the
microcode update; I will send that out next as an RFC.
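
The idea, sketched as a toy model in Python (not the actual patch;
ToyDomain, quiesced() and update_quiesced() are made-up names): pause
every domain, walk the CPUs serially as before, and only unpause once
all of them are updated, so no running guest sees a partial update.

```python
from contextlib import contextmanager

class ToyDomain:
    """Hypothetical stand-in for a guest domain."""
    def __init__(self, name):
        self.name = name
        self.paused = False

@contextmanager
def quiesced(domains):
    # Pause all domains, and always unpause them again on the way out.
    for d in domains:
        d.paused = True
    try:
        yield
    finally:
        for d in domains:
            d.paused = False

def update_quiesced(revisions, new_rev, domains):
    visible = []  # intermediate states seen while any guest was running
    with quiesced(domains):
        for cpu in range(len(revisions)):
            revisions[cpu] = new_rev
            if not all(d.paused for d in domains):
                visible.append(tuple(revisions))
    return visible

domains = [ToyDomain("dom1"), ToyDomain("dom2")]
revisions = [1, 1, 1, 1]
visible = update_quiesced(revisions, 2, domains)
```

The update itself is still serial here; quiescing only hides the
intermediate states from the guests.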

  Luis
