
Re: [Xen-devel] [RFC v2] misc/xenmicrocode: Upload /lib/firmware/<some blob> to the hypervisor

On Fri, Jan 30, 2015 at 08:37:33PM +0000, Andrew Cooper wrote:
> On 30/01/15 19:51, Luis R. Rodriguez wrote:
> > On Fri, Jan 30, 2015 at 02:23:48PM +0000, Andrew Cooper wrote:
> >> On 30/01/15 01:14, Luis R. Rodriguez wrote:
> >>> From: "Luis R. Rodriguez" <mcgrof@xxxxxxxx>
> >>>
> >>> There are several ways that a Xen system can update the
> >>> CPU microcode on a pvops kernel [0] now, the preferred way
> >>> is through the early microcode update mechanism. At run
> >>> time folks should use this new xenmicrocode tool and use
> >>> the same CPU microcode file as present on /lib/firmware.
> >>>
> >>> Some distributions may use the historic sysfs rescan interface.
> >>> Users of that mechanism should be aware that the interface is
> >>> not available when running under Xen, and should therefore check
> >>> for its presence before use; as an alternative, this xenmicrocode
> >>> tool can be used on privileged domains.
> >>>
> >>> Folks wishing to update CPU microcode at run time should be
> >>> aware that not all CPU microcode can be updated on a system
> >>> and should take care to ensure that only known-to-work and
> >>> supported CPU microcode updates are used [0]. To avoid issues
> >>> with delays during the hypercall / microcode update, this
> >>> implementation quiesces all domains prior to updating the
> >>> microcode, and once done unpauses only the domains it quiesced.
> >>>
> >>> [0] http://wiki.xenproject.org/wiki/XenParavirtOps/microcode_update
> >>>
> >>> Based on original work by: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
> >>> Cc: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
> >>> Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> >>> Cc: Borislav Petkov <bp@xxxxxxx>
> >>> Cc: Takashi Iwai <tiwai@xxxxxxx>
> >>> Cc: Olaf Hering <ohering@xxxxxxx>
> >>> Cc: Jan Beulich <JBeulich@xxxxxxxx>
> >>> Cc: Jason Douglas <jdouglas@xxxxxxxx>
> >>> Cc: Juergen Gross <jgross@xxxxxxxx>
> >>> Cc: Michal Marek <mmarek@xxxxxxx>
> >>> Cc: Henrique de Moraes Holschuh <hmh@xxxxxxxxxx>
> >>> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
> >>> Signed-off-by: Luis R. Rodriguez <mcgrof@xxxxxxxx>
> >>> ---
> >>>
> >>> Just wrote this, haven't tested it. This does some quiescing prior
> >>> to the microcode update. The quiescing is done by pausing all
> >>> domains. Once the microcode update is done we only unpause the
> >>> domains we quiesced as part of our work. Let me know if this is
> >>> on the right track to help avoid the issues noted on the list.
> >> There is also a TOCTOU race with your paused check, which itself is
> >> buggy, as you should unconditionally pause all VMs (userspace pause
> >> refcounting has been fixed for a little while now).
> > Also, __domain_pause_by_systemcontroller() currently has a limit of
> > 255 guests. Are the refcounting fixes you describe sufficient to
> > remove that limitation from __domain_pause_by_systemcontroller()?
> The limit is the number of concurrent userspace refs taken on an
> individual domain.  I.e. you can plausibly have 255 different bits of
> the toolstack each taking a pause reference for a specific reason.
> 255 was chosen as an arbitrary limit, used to prevent buggy
> toolstacks from overflowing the refcounts used by the scheduler by
> issuing the pause domain hypercall 4 billion times.
> >
> > My implementation uses 1024 but has no check against nb_domain (the
> > number of domains), so that needs fixing as well, but I figure we
> > should review the above first too. Artificial limits bother me; I
> > went with 1024 as I saw that limit used elsewhere, though I'm not
> > sure whether it was a stack concern or something else.
> >
> >> However, xenmicrocode (even indirectly via xc_microcode_update()) is
> >> not in a position to safely pause all domains, as there is no
> >> interlock preventing a new domain from being created.  Even if all
> >> domains do get successfully paused, 
> > I did think about this, and figured we could use it as a segue into
> > a discussion about how we want to implement this sort of
> > interlocking, or see whether anything is already available for it.
> > Also, are there other future users that could benefit from it? If
> > so, perhaps we can wrap the requirements up together.
> >
> >> the idle loops are substantially less trivial than
> >> ideal.
> > Interesting, can you elaborate on the possible issues that might creep
> > up on the idle loops while a guest is paused during a microcode update?
> > Any single issue would suffice, just curious.
> >
> > Do we need something more intricate than pause, then? Using suspend
> > seemed rather grotesque to shove down a guest's throat. If pause /
> > suspend do not suffice, perhaps some new artificial, temporary
> > quiesce state is needed to ensure integrity here, which would
> > address some of the idle-loop concerns you highlighted.
> >
> >> The toolstack should not hack around hypervisor bugs, and indeed is not
> >> capable of doing so.
> > Agreed. I figured I'd at least do what I can with what is available
> > and use this as a discussion of what the Right Thing To Do (TM)
> > is in the future.
> The right thing to do is to fix the hypercall implementation, at which
> point the utility below is fine and xc_microcode_update() can be a thin
> wrapper around the hypercall.
> The actions Xen needs to take are:
> - Copy the buffer into Xen.
> - Scan the buffer for the correct patch.
> - Rendezvous all online cpus in an IPI to apply the patch, and keep
>   the processors in until all have completed the patch.
> The IPI itself probably wants common rendezvous code, and a system
> specific call for application.  The system specific call will need to
> adhere to the requirements in the relevant manual.  Care will have to be
> taken to avoid deadlocking with the time calibration rendezvous, and
> facilities such as the watchdog might temporarily need pausing.
> If you feel up to all that, then please go ahead.  If not, I will
> attempt to find some copious free time.

You can have a crack at that. Let me know when the above is ready and I'll
respin. I'd try it myself, but it seems you would spend considerably less
time than me doing the above.

