Re: [PATCH] x86: extend coverage of HLE "bad page" workaround



On 26/05/2020 14:35, Jan Beulich wrote:
> On 26.05.2020 13:17, Andrew Cooper wrote:
>> On 26/05/2020 07:49, Jan Beulich wrote:
>>> Respective Core Gen10 processor lines are affected, too.
>>>
>>> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
>>>
>>> --- a/xen/arch/x86/mm.c
>>> +++ b/xen/arch/x86/mm.c
>>> @@ -6045,6 +6045,8 @@ const struct platform_bad_page *__init g
>>>      case 0x000506e0: /* errata SKL167 / SKW159 */
>>>      case 0x000806e0: /* erratum KBL??? */
>>>      case 0x000906e0: /* errata KBL??? / KBW114 / CFW103 */
>>> +    case 0x000a0650: /* erratum Core Gen10 U/H/S 101 */
>>> +    case 0x000a0660: /* erratum Core Gen10 U/H/S 101 */
>> This is mired in complexity.
>>
>> The enumeration of MSR_TSX_CTRL (from the TAA fix, but architectural
>> moving forwards on any TSX-enabled CPU) includes a confirmation that HLE
>> no longer exists/works.  This applies to IceLake systems, but possibly
>> not their initial release configuration (hence, via a later microcode
>> update).
>>
>> HLE is also disabled in microcode on all older parts for errata reasons,
>> so in practice it doesn't exist anywhere now.
>>
>> I think it is safe to drop this workaround, and this does seem a more
>> simple option than encoding which microcode turned HLE off (which sadly
>> isn't covered by the spec updates, as even when turned off, HLE is still
>> functioning according to its spec of "may speed things up, may do
>> nothing"), or the interactions with the CPUID hiding capabilities of
>> MSR_TSX_CTRL.
> I'm afraid I don't fully follow: For one, does what you say imply HLE is
> no longer enumerated in CPUID?

No - sadly not.  For reasons of "not repeating the Haswell/Broadwell
microcode fiasco", the HLE bit will continue to exist and be set. 
(Although on CascadeLake and later, you can turn it off with MSR_TSX_CTRL.)
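
As a rough sketch of how the CPUID hiding interacts with the HLE bit:
the helper below models what software would observe, assuming the
documented layout of MSR_TSX_CTRL (RTM_DISABLE in bit 0,
TSX_CPUID_CLEAR in bit 1) and of CPUID leaf 7 EBX (HLE in bit 4, RTM in
bit 11).  The DEMO_* names and demo_hle_visible() are made up for
illustration and are not Xen code.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID.(EAX=7,ECX=0):EBX feature bits. */
#define DEMO_CPUID7_EBX_HLE        (UINT32_C(1) << 4)
#define DEMO_CPUID7_EBX_RTM        (UINT32_C(1) << 11)

/* MSR_TSX_CTRL control bits. */
#define DEMO_TSX_CTRL_RTM_DISABLE  (UINT64_C(1) << 0)
#define DEMO_TSX_CTRL_CPUID_CLEAR  (UINT64_C(1) << 1)

/*
 * Hypothetical helper: given the HLE/RTM bits the part would report with
 * no hiding in effect, and the current MSR_TSX_CTRL value, say whether
 * the HLE bit is visible in CPUID.
 */
static inline bool demo_hle_visible(uint32_t raw_leaf7_ebx, uint64_t tsx_ctrl)
{
    /* TSX_CPUID_CLEAR hides both HLE and RTM from CPUID. */
    if ( tsx_ctrl & DEMO_TSX_CTRL_CPUID_CLEAR )
        return false;

    /* RTM_DISABLE on its own leaves the enumeration untouched. */
    return (raw_leaf7_ebx & DEMO_CPUID7_EBX_HLE) != 0;
}
```

Either way the result says nothing about whether HLE actually elides
anything; it only reflects enumeration.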

It was always a weird CPUID bit.  You were supposed to put
XACQUIRE/XRELEASE prefixes on your legacy locking, and it would be a nop
on old hardware and go faster on newer hardware.
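
For illustration, a minimal sketch of such a lock, assuming GCC's x86
__ATOMIC_HLE_ACQUIRE / __ATOMIC_HLE_RELEASE hint flags (which make the
compiler emit the XACQUIRE/XRELEASE prefixes).  Where the flags are not
defined the sketch falls back to plain atomics.  demo_lock_t and the
demo_* functions are hypothetical names, not anything in Xen.

```c
#include <assert.h>
#include <stdbool.h>

/* Fall back to ordinary atomics where the HLE hints are unavailable. */
#ifndef __ATOMIC_HLE_ACQUIRE
# define __ATOMIC_HLE_ACQUIRE 0
#endif
#ifndef __ATOMIC_HLE_RELEASE
# define __ATOMIC_HLE_RELEASE 0
#endif

/* Hypothetical lock type, for illustration only. */
typedef struct {
    int locked;
} demo_lock_t;

static inline bool demo_trylock(demo_lock_t *l)
{
    /* XACQUIRE-prefixed XCHG; hardware without HLE ignores the prefix. */
    return __atomic_exchange_n(&l->locked, 1,
                               __ATOMIC_ACQUIRE | __ATOMIC_HLE_ACQUIRE) == 0;
}

static inline void demo_lock(demo_lock_t *l)
{
    while ( !demo_trylock(l) )
    {
        /* Wait with plain loads so retries don't hammer the cache line. */
        while ( __atomic_load_n(&l->locked, __ATOMIC_RELAXED) )
            ;
    }
}

static inline void demo_unlock(demo_lock_t *l)
{
    /* Releasing a lock that isn't held is a caller bug. */
    assert(__atomic_load_n(&l->locked, __ATOMIC_RELAXED));

    /* XRELEASE-prefixed store; again a plain store on older hardware. */
    __atomic_store_n(&l->locked, 0,
                     __ATOMIC_RELEASE | __ATOMIC_HLE_RELEASE);
}
```

Note that nothing here checks CPUID: the same binary runs everywhere,
and the prefixes only have an effect where elision is implemented and
enabled.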

There is nothing runtime code needs to look at the HLE bit for, except
perhaps for UI reporting purposes.

> But then this
> erratum does not have the usual text effectively meaning that a ucode
> update is or will be available to address the issue; instead it says
> that BIOS or VMM can reserve the respective address range.

This is not surprising at all.  Turning off HLE was an unrelated
activity, and I bet the link went unnoticed.

> This - assuming the alternative you describe is indeed viable - then is surely
> a much more intrusive workaround than needed. Which I wouldn't assume
> they would suggest in such a case.

My suggestion was to drop the workaround, not to complicate it with a
microcode revision matrix.

~Andrew