Re: [PATCH 5/5] x86/ucode: Relax digest check when Entrysign is fixed in firmware
On 23/10/2025 8:05 am, Jan Beulich wrote:
> On 22.10.2025 23:19, Andrew Cooper wrote:
>> On 21/10/2025 10:47 am, Jan Beulich wrote:
>>> On 20.10.2025 15:19, Andrew Cooper wrote:
>>>> +void __init amd_check_entrysign(void)
>>>> +{
>>>> + unsigned int curr_rev;
>>>> + uint8_t fixed_rev;
>>>> +
>>>> + if ( boot_cpu_data.vendor != X86_VENDOR_AMD ||
>>>> + boot_cpu_data.family < 0x17 ||
>>>> + boot_cpu_data.family > 0x1a )
>>>> + return;
>>>> +
>>>> + /*
>>>> + * Table taken from Linux, which is the only known source of
>>>> + * information about client revisions.
>>>> + */
>>>> + curr_rev = this_cpu(cpu_sig).rev;
>>>> + switch ( curr_rev >> 8 )
>>>> + {
>>>> + case 0x080012: fixed_rev = 0x6f; break;
>>>> + case 0x080082: fixed_rev = 0x0f; break;
>>> In your reply you mentioned a "general off-by-1" when comparing with Linux,
>>> but I'm having trouble understanding how both can be correct. Leaving aside the
>>> 1st line (for which you sent a Linux patch anyway), how can our
>>> "(uint8_t)curr_rev >= fixed_rev" (i.e. "(uint8_t)curr_rev >= 0x0f") further
>>> below be correct at the same time as Linux's "return cur_rev <= 0x800820f"
>>> (indicating to the caller whether a SHA check is needed) is also correct?
>>> We say 0x0f is okay, while they demand a SHA check for that revision.
>>>
>>> In any event, whatever (legitimate) off-by-1 it is that I'm failing to spot,
>>> I think this would want explaining in the comment above.
>> What you've spotted is the off-by-one error.
>>
>> Linux is written as "curr <= last-vuln-rev" in order to do the digest check.
>>
>> Xen wants "cur >= first-fixed-rev"; I renamed the variable and forgot to
>> adjust the table to compensate. I've already fixed it in v2, so this
>> line now reads fixed_rev = 0x0a.
> Now I'm even more confused. If Linux uses 0x0f for last-vuln-rev, how would
> 0x0a be first-fixed-rev?
Sorry, that was a typo in my email. I've got 0x10 locally.
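To spell out the off-by-one, here is a minimal sketch of the two conventions. The function names are hypothetical and this is not the actual Linux or Xen code; it only assumes that both compare revisions within the same family/model/stepping prefix.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Linux-style convention (hypothetical helper): a digest check is needed
 * while the revision is at or below the last vulnerable one.
 */
static bool needs_digest_check_linux(uint32_t cur_rev, uint32_t last_vuln_rev)
{
    return cur_rev <= last_vuln_rev;
}

/*
 * Xen-style convention (hypothetical helper): Entrysign is fixed once the
 * low 8 bits of the revision reach the first fixed revision.
 */
static bool entrysign_fixed_xen(uint32_t cur_rev, uint8_t fixed_rev)
{
    return (uint8_t)cur_rev >= fixed_rev;
}
```

With last-vuln-rev 0x0800820f, the two only agree when first-fixed-rev is last-vuln-rev + 1, i.e. 0x10. With fixed_rev = 0x0f, revision 0x0800820f would be reported as fixed by the Xen-style check while the Linux-style check still demands a digest check for it.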
>
>>>> + case 0x083010: fixed_rev = 0x7c; break;
>>>> + case 0x086001: fixed_rev = 0x0e; break;
>>>> + case 0x086081: fixed_rev = 0x08; break;
>>>> + case 0x087010: fixed_rev = 0x34; break;
>>>> + case 0x08a000: fixed_rev = 0x0a; break;
>>>> + case 0x0a0010: fixed_rev = 0x7a; break;
>>>> + case 0x0a0011: fixed_rev = 0xda; break;
>>>> + case 0x0a0012: fixed_rev = 0x43; break;
>>>> + case 0x0a0082: fixed_rev = 0x0e; break;
>>>> + case 0x0a1011: fixed_rev = 0x53; break;
>>>> + case 0x0a1012: fixed_rev = 0x4e; break;
>>>> + case 0x0a1081: fixed_rev = 0x09; break;
>>>> + case 0x0a2010: fixed_rev = 0x2f; break;
>>>> + case 0x0a2012: fixed_rev = 0x12; break;
>>>> + case 0x0a4041: fixed_rev = 0x09; break;
>>>> + case 0x0a5000: fixed_rev = 0x13; break;
>>>> + case 0x0a6012: fixed_rev = 0x0a; break;
>>>> + case 0x0a7041: fixed_rev = 0x09; break;
>>>> + case 0x0a7052: fixed_rev = 0x08; break;
>>>> + case 0x0a7080: fixed_rev = 0x09; break;
>>>> + case 0x0a70c0: fixed_rev = 0x09; break;
>>>> + case 0x0aa001: fixed_rev = 0x16; break;
>>>> + case 0x0aa002: fixed_rev = 0x18; break;
>>>> + case 0x0b0021: fixed_rev = 0x46; break;
>>>> + case 0x0b1010: fixed_rev = 0x46; break;
>>>> + case 0x0b2040: fixed_rev = 0x31; break;
>>>> + case 0x0b4040: fixed_rev = 0x31; break;
>>>> + case 0x0b6000: fixed_rev = 0x31; break;
>>>> + case 0x0b7000: fixed_rev = 0x31; break;
>>> Without at least brief model related comments this looks extremely opaque.
>>> Linux, as a minimal reference, at least has cpuid_to_ucode_rev() and the
>>> accompanying union zen_patch_rev.
>> We have other tables like this in Xen. Linux has even more.
> The one in amd-patch-digests.c I'm aware of. Oh, and tsa_calculations().
> But ...
>
>> These case labels are family/model/steppings, but not in the same format
>> as CPUID.1.EAX, and also not in the same format as patch->processor_id.
> ... none of them explaining what these numbers really mean isn't helpful.
> I didn't question them earlier because I assumed them to be all "magic".
> Now that I learned how they're encoded, I thought it might be (have been)
> nice if they weren't left as "entirely magic".
Well - they are about as magic as numbers get.
It's just a convention that AMD uses when choosing the (otherwise
arbitrary) patch_id, and I'm not aware of it being written down
anywhere. Using the entrysign vulnerability, AIUI you can choose an
arbitrary 32bit value here.
Linux says it's from Fam17h onwards, but the pattern works from Fam12h,
and Fam10h was definitely different.
I've got no idea how long it will continue. For one, the 8-bit ucode
revision is proving to be a limiting factor on some CPUs, and e.g. one
of the 3 F/M/S encodings (patch->processor_id) will run out when we hit
Zen15 CPUs at the current rate that AMD are using Family numbers.
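For reference, a rough sketch of how the patch_id convention decodes. The field layout is an assumption based on Linux's union zen_patch_rev as I understand it (bits 16-19 treated as reserved), and the struct and function names are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical container for the decoded fields. */
struct zen_patch_fields {
    unsigned int family;   /* 0xf + extended family (bits 31:24) */
    unsigned int model;    /* (ext_model << 4) | base model */
    unsigned int stepping; /* bits 11:8 */
    unsigned int rev;      /* 8-bit ucode revision, bits 7:0 */
};

/* Decode a patch_id under the assumed layout; bits 19:16 are ignored. */
static struct zen_patch_fields decode_patch_id(uint32_t patch_id)
{
    struct zen_patch_fields f;

    f.rev      = patch_id & 0xff;
    f.stepping = (patch_id >> 8) & 0xf;
    f.model    = (((patch_id >> 20) & 0xf) << 4) | ((patch_id >> 12) & 0xf);
    f.family   = 0xf + ((patch_id >> 24) & 0xff);

    return f;
}
```

Under that assumption, the case labels in the table are patch_id >> 8, so 0x0a0011 with revision 0xda reads as Fam19h, model 0x01, stepping 1, and 0x0a0012 is the same model at stepping 2.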
>>> Background of my remark is that I would
>>> have expected there to be more models per Zen<N>, seeing in particular how
>>> many different BKDGs / PPRs and RGs there are. Many RGs in particular say
>>> they apply to a range of models, yet no similar ranges are covered here
>>> (unless my deciphering attempts went wrong).
>> PPRs/RGs are generally per block of 0x10 models and all steppings
>> therewith. This is quite often one production CPU and a handful of
>> preproduction steppings, but e.g. Milan and MilanX are two production
>> CPUs that share the same PPR/RG, as they differ only by stepping.
>>
>> Preproduction CPUs probably won't have a fix (other than the final two
>> rows which are A0 stepping of something presumably trying to get out of
>> the door when Entrysign was found.) The list does look to be the right
>> order of magnitude for the production CPUs.
> Sure, and my question wasn't towards steppings of individual models. My
> question was towards models of individual families, where the docs
> suggest far more exist than this table would cover. I guess that while
> talking mainly of steppings, you really (also) meant to say that most of
> the model numbers weren't used in practice (for production CPUs) either?
AMD's numbering space is very sparse. From a block of 0x10 (or in some
cases 8) model numbers, it's uncommon to see anything other than 0 or 1.
~Andrew