Re: [PATCH 5/5] x86/ucode: Relax digest check when Entrysign is fixed in firmware
On 22.10.2025 23:19, Andrew Cooper wrote:
> On 21/10/2025 10:47 am, Jan Beulich wrote:
>> On 20.10.2025 15:19, Andrew Cooper wrote:
>>> +void __init amd_check_entrysign(void)
>>> +{
>>> + unsigned int curr_rev;
>>> + uint8_t fixed_rev;
>>> +
>>> + if ( boot_cpu_data.vendor != X86_VENDOR_AMD ||
>>> + boot_cpu_data.family < 0x17 ||
>>> + boot_cpu_data.family > 0x1a )
>>> + return;
>>> +
>>> + /*
>>> + * Table taken from Linux, which is the only known source of information
>>> + * about client revisions.
>>> + */
>>> + curr_rev = this_cpu(cpu_sig).rev;
>>> + switch ( curr_rev >> 8 )
>>> + {
>>> + case 0x080012: fixed_rev = 0x6f; break;
>>> + case 0x080082: fixed_rev = 0x0f; break;
>> In your reply you mentioned a "general off-by-1" when comparing with Linux,
>> but I'm having trouble understanding how both can be correct. Leaving aside the
>> 1st line (for which you sent a Linux patch anyway), how can our
>> "(uint8_t)curr_rev >= fixed_rev" (i.e. "(uint8_t)curr_rev >= 0x0f") further
>> below be correct at the same time as Linux's "return cur_rev <= 0x800820f"
>> (indicating to the caller whether a SHA check is needed) is also correct?
>> We say 0x0f is okay, while they demand a SHA check for that revision.
>>
>> In any event, whatever (legitimate) off-by-1 it is that I'm failing to spot,
>> I think this would want explaining in the comment above.
>
> What you've spotted is the off-by-one error.
>
> Linux is written as "curr <= last-vuln-rev" in order to do the digest check.
>
> Xen wants "cur >= first-fixed-rev"; I renamed the variable and forgot to
> adjust the table to compensate. I've already fixed it in v2, so this
> line now reads fixed_rev = 0x0a.
Now I'm even more confused. If Linux uses 0x0f for last-vuln-rev, how would
0x0a be first-fixed-rev?
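
To make the two conventions concrete, here is a minimal sketch, not taken
from either code base. The helper names are hypothetical, and the 0x0f value
is simply the one quoted above for the 0x080082 row, taken at face value.
With a "last vulnerable" bound the digest check is needed while
cur <= bound; with a "first fixed" bound the firmware counts as fixed once
the low byte is >= bound. The two only agree if first-fixed is
last-vuln + 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: Linux-style test, true while a digest check is needed. */
static bool needs_digest_check(uint32_t cur_rev, uint32_t last_vuln_rev)
{
    return cur_rev <= last_vuln_rev;
}

/* Hypothetical helper: Xen-style test on the low byte of the revision. */
static bool firmware_is_fixed(uint32_t cur_rev, uint8_t first_fixed)
{
    return (uint8_t)cur_rev >= first_fixed;
}

/* Walk the boundary for the 0x080082 group, assuming 0x0f is last-vulnerable. */
static void check_boundary(void)
{
    const uint32_t last_vuln = 0x0800820f;

    /* Revision 0x0f: Linux still demands the digest check ... */
    assert(needs_digest_check(0x0800820f, last_vuln));
    /* ... so a matching first-fixed bound has to be 0x10, not 0x0f. */
    assert(!firmware_is_fixed(0x0800820f, 0x10));
    assert(firmware_is_fixed(0x08008210, 0x10));
    assert(!needs_digest_check(0x08008210, last_vuln));

    /* Reusing 0x0f unchanged as first-fixed disagrees with Linux at 0x0f. */
    assert(firmware_is_fixed(0x0800820f, 0x0f));
}
```

Note that the byte comparison is only meaningful within one "curr_rev >> 8"
group, which is what the switch in the patch selects on.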
>>> + case 0x083010: fixed_rev = 0x7c; break;
>>> + case 0x086001: fixed_rev = 0x0e; break;
>>> + case 0x086081: fixed_rev = 0x08; break;
>>> + case 0x087010: fixed_rev = 0x34; break;
>>> + case 0x08a000: fixed_rev = 0x0a; break;
>>> + case 0x0a0010: fixed_rev = 0x7a; break;
>>> + case 0x0a0011: fixed_rev = 0xda; break;
>>> + case 0x0a0012: fixed_rev = 0x43; break;
>>> + case 0x0a0082: fixed_rev = 0x0e; break;
>>> + case 0x0a1011: fixed_rev = 0x53; break;
>>> + case 0x0a1012: fixed_rev = 0x4e; break;
>>> + case 0x0a1081: fixed_rev = 0x09; break;
>>> + case 0x0a2010: fixed_rev = 0x2f; break;
>>> + case 0x0a2012: fixed_rev = 0x12; break;
>>> + case 0x0a4041: fixed_rev = 0x09; break;
>>> + case 0x0a5000: fixed_rev = 0x13; break;
>>> + case 0x0a6012: fixed_rev = 0x0a; break;
>>> + case 0x0a7041: fixed_rev = 0x09; break;
>>> + case 0x0a7052: fixed_rev = 0x08; break;
>>> + case 0x0a7080: fixed_rev = 0x09; break;
>>> + case 0x0a70c0: fixed_rev = 0x09; break;
>>> + case 0x0aa001: fixed_rev = 0x16; break;
>>> + case 0x0aa002: fixed_rev = 0x18; break;
>>> + case 0x0b0021: fixed_rev = 0x46; break;
>>> + case 0x0b1010: fixed_rev = 0x46; break;
>>> + case 0x0b2040: fixed_rev = 0x31; break;
>>> + case 0x0b4040: fixed_rev = 0x31; break;
>>> + case 0x0b6000: fixed_rev = 0x31; break;
>>> + case 0x0b7000: fixed_rev = 0x31; break;
>> Without at least brief model related comments this looks extremely opaque.
>> Linux, as a minimal reference, at least has cpuid_to_ucode_rev() and the
>> accompanying union zen_patch_rev.
>
> We have other tables like this in Xen. Linux has even more.
The one in amd-patch-digests.c I'm aware of. Oh, and tsa_calculations().
But ...
> These case labels are family/model/steppings, but not in the same format
> as CPUID.1.EAX, and also not in the same format as patch->processor_id.
... none of them explains what these numbers really mean, which isn't helpful.
I didn't question them earlier because I assumed them to be all "magic".
Now that I learned how they're encoded, I thought it might be (have been)
nice if they weren't left as "entirely magic".
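
For illustration, a minimal sketch of how such a revision could be taken
apart. It assumes the bit layout of Linux's union zen_patch_rev as I
understand it (rev in bits 0-7, stepping in 8-11, model in 12-15, four
reserved bits, ext_model in 20-23, ext_fam in 24-27) and the usual
family = 0xf + ext_fam rule. The struct and function names are hypothetical
and not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the decoded fields. */
struct zen_rev_fields {
    unsigned int family, model, stepping, rev;
};

/* Split a full 32-bit microcode revision under the assumed layout. */
static struct zen_rev_fields decode_zen_rev(uint32_t ucode_rev)
{
    struct zen_rev_fields f;
    unsigned int ext_fam   = (ucode_rev >> 24) & 0xf;
    unsigned int ext_model = (ucode_rev >> 20) & 0xf;
    unsigned int model     = (ucode_rev >> 12) & 0xf;

    f.family   = 0xf + ext_fam;             /* assumption: base family is 0xf */
    f.model    = (ext_model << 4) | model;
    f.stepping = (ucode_rev >> 8) & 0xf;
    f.rev      = ucode_rev & 0xff;          /* the byte compared in the patch */

    return f;
}

/* Decode the "case 0x0a0011: fixed_rev = 0xda" row from the table. */
static void decode_example(void)
{
    struct zen_rev_fields f = decode_zen_rev(0x0a0011da);

    assert(f.family == 0x19);
    assert(f.model == 0x01);
    assert(f.stepping == 1);
    assert(f.rev == 0xda);
}
```

Under these assumptions the case labels are the upper 24 bits of the
revision, i.e. family, model and stepping packed together, and the
fixed_rev values are the low byte.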
>> Background of my remark is that I would
>> have expected there to be more models per Zen<N>, seeing in particular how
>> many different BKDGs / PPRs and RGs there are. Many RGs in particular say
>> they apply to a range of models, yet no similar ranges are covered here
>> (unless my deciphering attempts went wrong).
>
> PPRs/RGs are generally per block of 0x10 models and all steppings
> therewith. This is quite often one production CPU and a handful of
> preproduction steppings, but e.g. Milan and MilanX are two production
> CPUs sharing the same PPR/RG, as they differ only by stepping.
>
> Preproduction CPUs probably won't have a fix (other than the final two
> rows which are A0 stepping of something presumably trying to get out of
> the door when Entrysign was found.) The list does look to be the right
> order of magnitude for the production CPUs.
Sure, and my question wasn't towards steppings of individual models. My
question was towards models of individual families, where the docs
suggest far more exist than this table would cover. I guess that while
talking mainly of steppings, you really (also) meant to say that most of
the model numbers weren't used in practice (for production CPUs) either?
> The AMD bulletin only gives microcode versions for server. Clients only
> state AgesaPI versions, so I'm entirely reliant on Linux for the
> microcode versions.
I did understand that, yes, as you have a code comment saying so.
Jan