[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH V2] x86, amd_ucode: Safeguard against #GP

To: Aravind Gopalakrishnan <aravind.gopalakrishnan@xxxxxxx>, boris ostrovsky <boris.ostrovsky@xxxxxxxxxx>
From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
Date: Fri, 30 May 2014 17:21:25 +0100
Cc: keir@xxxxxxx, JBeulich@xxxxxxxx, xen-devel@xxxxxxxxxxxxx
Delivery-date: Fri, 30 May 2014 16:21:51 +0000
List-id: Xen developer discussion <xen-devel.lists.xen.org>


On 30/05/2014 17:01, Aravind Gopalakrishnan wrote:

On 5/28/2014 12:56 PM, boris ostrovsky wrote:

On 5/28/2014 11:16 AM, Aravind Gopalakrishnan wrote:
On 5/27/2014 6:47 PM, Andrew Cooper wrote:
On 27/05/2014 19:24, Aravind Gopalakrishnan wrote:
When HW tries to load a corrupted patch, it generates #GP
and hangs the system. Use wrmsr_safe instead so that we
fail to load microcode gracefully.

Also, massage error handling around apply_microcode to keep
in tune with error handling style of other parts of the code.

Example on a Fam15h system-
(XEN) microcode: CPU0 collect_cpu_info: patch_id=0x6000626
(XEN) microcode: CPU0 size 7870, block size 2586 offset 76 equivID
0x6012 rev 0x6000637
(XEN) microcode: CPU0 found a matching microcode update with version
0x6000637 (current=0x6000626)
(XEN) traps.c:3073: GPF (0000): ffff82d08016f682 -> ffff82d08022d9f8
(XEN) microcode: CPU0 update from revision 0x6000637 to 0x6000626failed
^^^^^^^^^^^^^^^^^^^^^^
As shown, the log message above has the two revisions reversed.Fix this
Changes in V2:
     - Do not ignore return value from wrmsr_safe
     - Flip revision numbers as shown above
Signed-off-by: Aravind Gopalakrishnan<aravind.gopalakrishnan@xxxxxxx>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>
I thought we had identified that the hangs were to do with your use of
'noreboot' on the Xen command line.
Hmm. Yeah.. I figured using wrmsr_safe allows user to just boot intodom0 without
having to run through reboot loops. (lazy alternative I guess)
Nevermind then. Thanks for the comments (Jan and Andrew). Will keepin mind for the future.
I don't understand --- the fact that you had noreboot option meantthat your system wouldn't reboot (duh!) when a patch is corrupted(aka "it will hang"). But I'd think we don't want a reboot neither--- we want to safely skip the patch (and possibly backlist it).


So, allowing the reboot to happen in turn allows entry to grub.
Where we can simply remove 'ucode=' option and this is same as
'skipping' the patch using wrmsr_safe right?

Except this is now explicitly done by the sysadmin
while wrmsr_safe just works without anyone doing some extra work;
and printing a log message informs user that update went wrong
for whatever reason.

My understanding from earlier comments is that Andrew (and Jan)
would rather not see a change from wrmsrl to wrmsr_safe if it
is needless because there is already a way someone can circumvent the
corrupt patch: just don't provide it on the grub menu.

Thanks,
-Aravind.


Sorry - I think my comment confused the issue.  Let me retry.

Originally, the bug was described approximately as "A corrupt ucode willcause a GP fault, causing the server to hang".

The unhandled #GP fault certainly should be wrapped with wrmsr_safe(),and an error/warning presented to the user. In the case that a baducode is discovered, it should be discarded and the server allowed toboot. It is substantially more useful for the server to come up and say"I couldn't load that bit of microcode you wanted me to", than to sit ina reboot loop because you made a typo in the bootloader config, and haveto get someone in the datacenter to poke the physical server.

My objection was to the wording of the comment alone. Unhandled #GPfaults do not "hang" Xen unless you ask for them to behave in that way,given "noreboot" on the command line.


~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

Follow-Ups:
- Re: [Xen-devel] [PATCH V2] x86, amd_ucode: Safeguard against #GP
  - From: Aravind Gopalakrishnan

References:
- [Xen-devel] [PATCH V2] x86, amd_ucode: Safeguard against #GP
  - From: Aravind Gopalakrishnan
- Re: [Xen-devel] [PATCH V2] x86, amd_ucode: Safeguard against #GP
  - From: Andrew Cooper
- Re: [Xen-devel] [PATCH V2] x86, amd_ucode: Safeguard against #GP
  - From: Aravind Gopalakrishnan
- Re: [Xen-devel] [PATCH V2] x86, amd_ucode: Safeguard against #GP
  - From: boris ostrovsky
- Re: [Xen-devel] [PATCH V2] x86, amd_ucode: Safeguard against #GP
  - From: Aravind Gopalakrishnan

Prev by Date: Re: [Xen-devel] [v4][PATCH 5/5] xen, gfx passthrough: add opregion mapping
Next by Date: Re: [Xen-devel] [PATCH V2] x86, amd_ucode: Safeguard against #GP
Previous by thread: Re: [Xen-devel] [PATCH V2] x86, amd_ucode: Safeguard against #GP
Next by thread: Re: [Xen-devel] [PATCH V2] x86, amd_ucode: Safeguard against #GP
Index(es):
- Date
- Thread

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.