Re: [PATCH 5/5] x86/ucode: Relax digest check when Entrysign is fixed in firmware


  • To: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Thu, 23 Oct 2025 09:05:35 +0200
  • Cc: Roger Pau Monné <roger.pau@xxxxxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Thu, 23 Oct 2025 07:05:47 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 22.10.2025 23:19, Andrew Cooper wrote:
> On 21/10/2025 10:47 am, Jan Beulich wrote:
>> On 20.10.2025 15:19, Andrew Cooper wrote:
>>> +void __init amd_check_entrysign(void)
>>> +{
>>> +    unsigned int curr_rev;
>>> +    uint8_t fixed_rev;
>>> +
>>> +    if ( boot_cpu_data.vendor != X86_VENDOR_AMD ||
>>> +         boot_cpu_data.family < 0x17 ||
>>> +         boot_cpu_data.family > 0x1a )
>>> +        return;
>>> +
>>> +    /*
>>> +     * Table taken from Linux, which is the only known source of information
>>> +     * about client revisions.
>>> +     */
>>> +    curr_rev = this_cpu(cpu_sig).rev;
>>> +    switch ( curr_rev >> 8 )
>>> +    {
>>> +    case 0x080012: fixed_rev = 0x6f; break;
>>> +    case 0x080082: fixed_rev = 0x0f; break;
>> In your reply you mentioned a "general off-by-1" when comparing with Linux,
>> but I have trouble understanding how both can be correct. Leaving aside the
>> 1st line (for which you sent a Linux patch anyway), how can our
>> "(uint8_t)curr_rev >= fixed_rev" (i.e. "(uint8_t)curr_rev >= 0x0f") further
>> below be correct at the same time as Linux's "return cur_rev <= 0x800820f"
>> (indicating to the caller whether a SHA check is needed) is also correct?
>> We say 0x0f is okay, while they demand a SHA check for that revision.
>>
>> In any event, whatever (legitimate) off-by-1 it is that I'm failing to spot,
>> I think this would want explaining in the comment above.
> 
> What you've spotted is the off-by-one error.
> 
> Linux is written as "curr <= last-vuln-rev" in order to do the digest check.
> 
> Xen wants "cur >= first-fixed-rev"; I renamed the variable and forgot to
> adjust the table to compensate.  I've already fixed it in v2, so this
> line now reads fixed_rev = 0x0a.

Now I'm even more confused. If Linux uses 0x0f for last-vuln-rev, how would
0x0a be first-fixed-rev?
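
A minimal sketch of the relation in question, under the assumption that both
tables are meant to describe the same boundary. The helper names are
hypothetical and appear in neither Xen nor Linux; only the two comparison
forms ("cur <= last-vuln-rev" and "cur >= first-fixed-rev") are taken from
this thread. For the two predicates to be exact complements over every
possible low byte, first-fixed-rev has to be last-vuln-rev + 1:

```c
#include <assert.h>   /* only needed for the checks in the usage note */
#include <stdbool.h>
#include <stdint.h>

/* Linux-style form: a digest check is needed up to and including the
 * last vulnerable revision (hypothetical helper name). */
static bool needs_digest_check(uint8_t cur, uint8_t last_vuln_rev)
{
    return cur <= last_vuln_rev;
}

/* Xen-style form: the check may be relaxed from the first fixed
 * revision onwards (hypothetical helper name). */
static bool can_relax_check(uint8_t cur, uint8_t first_fixed_rev)
{
    return cur >= first_fixed_rev;
}

/* True iff the two forms are exact complements for all 256 possible
 * low-byte values, i.e. they describe the same boundary. */
bool tables_agree(uint8_t last_vuln_rev, uint8_t first_fixed_rev)
{
    for ( unsigned int cur = 0; cur <= 0xff; cur++ )
        if ( needs_digest_check(cur, last_vuln_rev) ==
             can_relax_check(cur, first_fixed_rev) )
            return false;

    return true;
}
```

With a last vulnerable low byte of 0x0f, tables_agree(0x0f, 0x10) holds,
while tables_agree(0x0f, 0x0f) and tables_agree(0x0f, 0x0a) do not. This
only illustrates the arithmetic; it makes no statement about which
revisions are actually fixed.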

>>> +    case 0x083010: fixed_rev = 0x7c; break;
>>> +    case 0x086001: fixed_rev = 0x0e; break;
>>> +    case 0x086081: fixed_rev = 0x08; break;
>>> +    case 0x087010: fixed_rev = 0x34; break;
>>> +    case 0x08a000: fixed_rev = 0x0a; break;
>>> +    case 0x0a0010: fixed_rev = 0x7a; break;
>>> +    case 0x0a0011: fixed_rev = 0xda; break;
>>> +    case 0x0a0012: fixed_rev = 0x43; break;
>>> +    case 0x0a0082: fixed_rev = 0x0e; break;
>>> +    case 0x0a1011: fixed_rev = 0x53; break;
>>> +    case 0x0a1012: fixed_rev = 0x4e; break;
>>> +    case 0x0a1081: fixed_rev = 0x09; break;
>>> +    case 0x0a2010: fixed_rev = 0x2f; break;
>>> +    case 0x0a2012: fixed_rev = 0x12; break;
>>> +    case 0x0a4041: fixed_rev = 0x09; break;
>>> +    case 0x0a5000: fixed_rev = 0x13; break;
>>> +    case 0x0a6012: fixed_rev = 0x0a; break;
>>> +    case 0x0a7041: fixed_rev = 0x09; break;
>>> +    case 0x0a7052: fixed_rev = 0x08; break;
>>> +    case 0x0a7080: fixed_rev = 0x09; break;
>>> +    case 0x0a70c0: fixed_rev = 0x09; break;
>>> +    case 0x0aa001: fixed_rev = 0x16; break;
>>> +    case 0x0aa002: fixed_rev = 0x18; break;
>>> +    case 0x0b0021: fixed_rev = 0x46; break;
>>> +    case 0x0b1010: fixed_rev = 0x46; break;
>>> +    case 0x0b2040: fixed_rev = 0x31; break;
>>> +    case 0x0b4040: fixed_rev = 0x31; break;
>>> +    case 0x0b6000: fixed_rev = 0x31; break;
>>> +    case 0x0b7000: fixed_rev = 0x31; break;
>> Without at least brief model related comments this looks extremely opaque.
>> Linux, as a minimal reference, at least has cpuid_to_ucode_rev() and the
>> accompanying union zen_patch_rev.
> 
> We have other tables like this in Xen.  Linux has even more.

The one in amd-patch-digests.c I'm aware of. Oh, and tsa_calculations().
But ...

> These case labels are family/model/steppings, but not in the same format
> as CPUID.1.EAX, and also not in the same format as patch->processor_id.

... none of them explaining what these numbers really mean isn't helpful.
I didn't question them earlier because I assumed them to be all "magic".
Now that I learned how they're encoded, I thought it might be (have been)
nice if they weren't left as "entirely magic".
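
As a sketch of what such an explanation could look like: the code below
splits a revision the way the patch does ("curr_rev >> 8" as the switch key,
"(uint8_t)curr_rev" as the low byte) and then decodes the key. The bit
layout is an assumption based on my reading of Linux's union zen_patch_rev
(extended family in the top byte, then extended model, four reserved bits,
base model, stepping), and the base family of 0xf is likewise assumed. The
struct and function names are hypothetical.

```c
#include <assert.h>   /* only needed for the checks in the usage note */
#include <stdint.h>

/* Decoded fields of a switch key (hypothetical type). */
struct zen_key_fields {
    unsigned int family, model, stepping;
};

/* Upper 24 bits of the revision: the value used as the case label. */
uint32_t rev_to_key(uint32_t rev)
{
    return rev >> 8;
}

/* Low byte of the revision: the value compared against fixed_rev. */
uint8_t rev_to_low(uint32_t rev)
{
    return (uint8_t)rev;
}

/*
 * Assumed key layout (not confirmed by the patch itself):
 *   bits 23:16  extended family
 *   bits 15:12  extended model
 *   bits 11:8   reserved
 *   bits  7:4   base model
 *   bits  3:0   stepping
 */
struct zen_key_fields decode_key(uint32_t key)
{
    unsigned int ext_fam    = (key >> 16) & 0xff;
    unsigned int ext_model  = (key >> 12) & 0xf;
    unsigned int base_model = (key >> 4) & 0xf;
    struct zen_key_fields f = {
        .family   = 0xf + ext_fam,   /* assumes base family 0xf */
        .model    = (ext_model << 4) | base_model,
        .stepping = key & 0xf,
    };

    return f;
}
```

Under these assumptions, 0x0800820f splits into key 0x080082 and low byte
0x0f, key 0x0a0011 decodes to family 0x19, model 0x01, stepping 1, and key
0x083010 to family 0x17, model 0x31, stepping 0. A one-line comment stating
the layout above the switch would already take most of the "magic" out of
the labels.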

>>  Background of my remark is that I would
>> have expected there to be more models per Zen<N>, seeing in particular how
>> many different BKDGs / PPRs and RGs there are. Many RGs in particular say
>> they apply to a range of models, yet no similar ranges are covered here
>> (unless my deciphering attempts went wrong).
> 
> PPRs/RGs are generally per block of 0x10 models and all steppings
> therewith.  This is quite often one production CPU and a handful of
> preproduction steppings, but e.g. Milan and MilanX are two production
> CPUs that share the same PPR/RG, as they differ only by stepping.
> 
> Preproduction CPUs probably won't have a fix (other than the final two
> rows which are A0 stepping of something presumably trying to get out of
> the door when Entrysign was found.)  The list does look to be the right
> order of magnitude for the production CPUs.

Sure, and my question wasn't towards steppings of individual models. My
question was towards models of individual families, where the docs
suggest far more exist than this table would cover. I guess that while
talking mainly of steppings, you really (also) meant to say that most of
the model numbers weren't used in practice (for production CPUs) either?

> The AMD bulletin only gives microcode versions for server.  Clients only
> state AgesaPI versions, so I'm entirely reliant on Linux for the
> microcode versions.

I did understand that, yes, as you have a code comment saying so.

Jan



 

