
Re: [Xen-devel] [PATCH v6 00/12] improve late microcode loading



On 3/19/19 3:22 PM, Brian Woods wrote:
> On 3/11/19 2:57 AM, Chao Gao wrote:
>> Major changes in version 6:
>>   - run wbinvd before updating microcode (patch 10)
>>   - add a userspace tool for late microcode update (patch 1)
>>   - scale time to wait by the number of remaining CPUs to respond
>>   - remove 'cpu' parameters from some related callbacks and functions
>>   - save a ucode patch only if its supported CPU is allowed to mix with
>>     the current CPU.
>>
>> Changes in version 5:
>>   - support parallel microcode updates for all cores (see patch 8)
>>   - Address Roger's comments on the last version.
>>
>> The intention of this series is to make the late microcode loading
>> more reliable by rendezvousing all cpus in stop_machine context.
>> This idea comes from Ashok. I am porting his Linux patch to Xen
>> (see patch 10 and 11 for more details).
>>
>> This series makes five changes:
>>   1. Patch 1: a userspace tool for late microcode update
>>   2. Patch 2-9: introduce a global microcode cache and some cleanup
>>   3. Patch 10: write back and invalidate caches before updating microcode
>>   4. Patch 11: synchronize late microcode loading
>>   5. Patch 12: support parallel microcode updates on different cores
>>
>> Currently, late microcode loading does a lot of things, including
>> parsing the microcode blob, checking the signature/revision and
>> performing the update. Putting all of them into stop_machine context is
>> a bad idea because of the complexity (one issue I observed is that a
>> memory allocation triggered an assertion in stop_machine context). To
>> simplify the load process, I move microcode parsing out of it. The
>> microcode blob is parsed and a global microcode cache is built on a
>> single CPU before rendezvousing all CPUs to update microcode. The other
>> CPUs just get and load a suitable microcode from the global cache.
>> With this global cache, it is safe to put the simplified load process
>> into stop_machine context.
>>
>> Regarding the changes on the AMD side, I didn't test them due to
>> lack of hardware. Could you help to test this series on an AMD machine?
>> At least two basic tests are needed:
>> * do a microcode update after system bootup
>> * don't bring all pCPUs up at bootup, by specifying the maxcpus option
>>    on the Xen command line, then do a microcode update and online all
>>    offlined CPUs via 'xen-hptool'.
>>
>> Chao Gao (12):
>>    misc/xenmicrocode: Upload a microcode blob to the hypervisor
>>    microcode/intel: use union to get fields without shifting and masking
>>    microcode/intel: extend microcode_update_match()
>>    microcode: introduce a global cache of ucode patch
>>    microcode: only save compatible ucode patches
>>    microcode: remove struct ucode_cpu_info
>>    microcode: remove pointless 'cpu' parameter
>>    microcode: split out apply_microcode() from cpu_request_microcode()
>>    microcode: remove struct microcode_info
>>    microcode/intel: Writeback and invalidate caches before updating
>>      microcode
>>    x86/microcode: Synchronize late microcode loading
>>    microcode: update microcode on cores in parallel
>>
>>   tools/libxc/include/xenctrl.h   |   1 +
>>   tools/libxc/xc_misc.c           |  20 +++
>>   tools/misc/Makefile             |   4 +
>>   tools/misc/xenmicrocode.c       |  89 ++++++++++
>>   xen/arch/x86/acpi/power.c       |   2 +-
>>   xen/arch/x86/apic.c             |   2 +-
>>   xen/arch/x86/microcode.c        | 380 +++++++++++++++++++++++++++-------------
>>   xen/arch/x86/microcode_amd.c    | 236 ++++++++++++-------------
>>   xen/arch/x86/microcode_intel.c  | 206 +++++++++++++---------
>>   xen/arch/x86/smpboot.c          |   5 +-
>>   xen/arch/x86/spec_ctrl.c        |   2 +-
>>   xen/include/asm-x86/microcode.h |  40 +++--
>>   xen/include/asm-x86/processor.h |   3 +-
>>   13 files changed, 639 insertions(+), 351 deletions(-)
>>   create mode 100644 tools/misc/xenmicrocode.c
>>
> 
> Sorry for the delay.  These patches fail on F17h.  I'm looking into 
> where it fails now.

Bisecting points to the commit "microcode: introduce a global cache of
ucode patch".

The failing commit fails with:
(XEN) [00000085227df312] microcode: CPU0 update from revision 0x8001207 to 0xffff8304 failed
(XEN) [00000085240578ec] traps.c:1574: GPF (0000): ffff82d080426c88 [probe_cpuid_faulting+0xe/0xa2] -> ffff82d0803818b2

That microcode revision is WAY off.  It should be 0x8001227 and not
0xffff8304.  I don't think I'll be able to do much on it before the end
of today, but let me know what information you need or if there's
anything I should be looking at in particular.
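
For context on what the bisected commit is about, here is a minimal
sketch of the "parse once, look up later" idea from the cover letter.
It is not the code from the series: struct ucode_patch, cache_save(),
cache_find(), the fixed-size array and the signature value mentioned
below are hypothetical simplifications, chosen only to show that the
per-CPU step can be a read-only lookup with no parsing or allocation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified patch descriptor; not Xen's real layout. */
struct ucode_patch {
    uint32_t cpu_sig;   /* CPU signature the patch applies to */
    uint32_t rev;       /* microcode revision carried by the patch */
};

#define CACHE_SLOTS 4

/* Global cache, filled on a single CPU before the rendezvous. */
static struct ucode_patch cache[CACHE_SLOTS];
static size_t cache_used;

/*
 * Save a parsed patch, keeping only the newest one per signature.
 * Returns 0 on success and -1 if the (assumed fixed-size) cache is full.
 */
static int cache_save(const struct ucode_patch *p)
{
    for (size_t i = 0; i < cache_used; i++) {
        if (cache[i].cpu_sig == p->cpu_sig) {
            if (p->rev > cache[i].rev)
                cache[i] = *p;
            return 0;
        }
    }

    if (cache_used == CACHE_SLOTS)
        return -1;

    cache[cache_used++] = *p;
    return 0;
}

/*
 * Lookup each CPU would do inside the rendezvous: read-only, and it
 * only returns a patch that is newer than the running revision.
 */
static const struct ucode_patch *cache_find(uint32_t cpu_sig,
                                            uint32_t cur_rev)
{
    for (size_t i = 0; i < cache_used; i++) {
        if (cache[i].cpu_sig == cpu_sig && cache[i].rev > cur_rev)
            return &cache[i];
    }

    return NULL;
}
```

With a patch of revision 0x8001227 saved for an illustrative signature
0x800f12, cache_find(0x800f12, 0x8001207) returns that patch, and
cache_find(0x800f12, 0x8001227) returns NULL because nothing newer is
cached.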

Thanks,
Brian