
Re: [PATCH 3/3] x86/amd: Fix race editing DE_CFG


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Wed, 26 Nov 2025 17:56:21 +0000
  • Cc: andrew.cooper3@xxxxxxxxxx, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Wed, 26 Nov 2025 17:56:36 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 26/11/2025 4:55 pm, Andrew Cooper wrote:
> On 26/11/2025 3:07 pm, Jan Beulich wrote:
>> On 26.11.2025 14:22, Andrew Cooper wrote:
>>> @@ -1075,6 +966,112 @@ static void cf_check fam17_disable_c6(void *arg)
>>>     wrmsrl(MSR_AMD_CSTATE_CFG, val & mask);
>>>  }
>>>  
>>> +static bool zenbleed_use_chickenbit(void)
>>> +{
>>> +    unsigned int curr_rev;
>>> +    uint8_t fixed_rev;
>>> +
>>> +    /*
>>> +     * If we're virtualised, we can't do family/model checks safely, and
>>> +     * we likely wouldn't have access to DE_CFG even if we could see a
>>> +     * microcode revision.
>>> +     *
>>> +     * A hypervisor may hide AVX as a stopgap mitigation.  We're not in a
>>> +     * position to care either way.  An admin doesn't want to be disabling
>>> +     * AVX as a mitigation on any build of Xen with this logic present.
>>> +     */
>>> +    if ( cpu_has_hypervisor || boot_cpu_data.family != 0x17 )
>>> +        return false;
>>> +
>>> +    curr_rev = this_cpu(cpu_sig).rev;
>>> +    switch ( curr_rev >> 8 )
>>> +    {
>>> +    case 0x083010: fixed_rev = 0x7a; break;
>>> +    case 0x086001: fixed_rev = 0x0b; break;
>>> +    case 0x086081: fixed_rev = 0x05; break;
>>> +    case 0x087010: fixed_rev = 0x32; break;
>>> +    case 0x08a000: fixed_rev = 0x08; break;
>>> +    default:
>>> +        /*
>>> +         * With the Fam17h check above, most parts getting here are Zen1.
>>> +         * They're not affected.  Assume Zen2 ones making it here are affected
>>> +         * regardless of microcode version.
>>> +         */
>>> +        return is_zen2_uarch();
>>> +    }
>>> +
>>> +    return (uint8_t)curr_rev >= fixed_rev;
>>> +}
>>> +
>>> +void amd_init_de_cfg(const struct cpuinfo_x86 *c)
>>> +{
>>> +    uint64_t val, new = 0;
>>> +
>>> +    /* The MSR doesn't exist on Fam 0xf/0x11. */
>>> +    if ( c->family != 0xf && c->family != 0x11 )
>>> +        return;
>> Comment and code don't match. Did you mean
>>
>>     if ( c->family == 0xf || c->family == 0x11 )
>>         return;
>>
>> (along the lines of what you have in amd_init_lfence_dispatch())?
> Oh - that was a last minute refactor which I didn't do quite correctly. 
> Yes, it should match amd_init_lfence_dispatch().
>
>>> +    /*
>>> +     * On Zen3 (Fam 0x19) and later CPUs, LFENCE is unconditionally dispatch
>>> +     * serialising, and is enumerated in CPUID.  Hypervisors may also
>>> +     * enumerate it when the setting is in place and MSR_AMD64_DE_CFG isn't
>>> +     * available.
>>> +     */
>>> +    if ( !test_bit(X86_FEATURE_LFENCE_DISPATCH, c->x86_capability) )
>>> +        new |= AMD64_DE_CFG_LFENCE_SERIALISE;
>>> +
>>> +    /*
>>> +     * If vulnerable to Zenbleed and not mitigated in microcode, use the
>>> +     * bigger hammer.
>>> +     */
>>> +    if ( zenbleed_use_chickenbit() )
>>> +        new |= (1 << 9);
>>> +
>>> +    if ( !new )
>>> +        return;
>>> +
>>> +    if ( rdmsr_safe(MSR_AMD64_DE_CFG, &val) ||
>>> +         (val & new) == new )
>>> +        return;
>>> +
>>> +    /*
>>> +     * DE_CFG is a Core-scoped MSR, and this write is racy.  However, both
>>> +     * threads calculate the new value from state which is expected to be
>>> +     * consistent across CPUs and unrelated to the old value, so the result
>>> +     * should be consistent.
>>> +     */
>>> +    wrmsr_safe(MSR_AMD64_DE_CFG, val | new);
>> Either of the bits may be the cause of #GP. In that case we wouldn't set the
>> other bit, even if it may be possible to set it.
> This MSR does not #GP on real hardware.
>
> Also, both of these bits come from instructions AMD have provided,
> saying "set $X in case $Y", which we have honoured as part of the
> conditions for setting up new, which I consider to be a reasonable
> guarantee that no #GP will ensue.
>
> This wrmsr_safe() is covering the virt case, because older Xen and
> Bhyve used to disallow writes to it, and OpenBSD would explode as a
> consequence.  Xen's fix was 4175fd3ccd17.
>
> I toyed with the idea of having a tristate de_cfg_writeable, but that
> got very ugly very quickly.
>
> The other option would be to ignore DE_CFG entirely under virt.  That's
> what we do for BP_CFG already, and no hypervisor is going to really let
> us have access to it, and it would downgrade to non-safe variants.

In fact, ignoring the virt case for DE_CFG makes this generally nicer.
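
As a rough illustration of the shape this could take (a standalone
sketch, not the actual follow-up patch): bail out early when running
under a hypervisor, so the remaining logic only ever runs on bare metal
and has no need for the _safe accessors.  The struct, the helper names
and the LFENCE bit position below are stand-ins invented for the
example; only the (1 << 9) chickenbit and the family checks are taken
from the patch quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit definitions; hypothetical stand-ins for Xen's own. */
#define DE_CFG_LFENCE_SERIALISE (UINT64_C(1) << 1)  /* assumed position */
#define DE_CFG_ZENBLEED_CHICKEN (UINT64_C(1) << 9)  /* as in the patch */

/* Hypothetical summary of the inputs; not Xen's struct cpuinfo_x86. */
struct cpu_state {
    bool hypervisor;           /* running virtualised? */
    unsigned int family;       /* CPU family, e.g. 0x17 */
    bool lfence_dispatch;      /* LFENCE already dispatch serialising? */
    bool zenbleed_chickenbit;  /* result of the Zenbleed check */
};

/* Work out which DE_CFG bits we would want to set, if any. */
uint64_t de_cfg_bits_to_set(const struct cpu_state *c)
{
    uint64_t new = 0;

    /* Ignore DE_CFG entirely under virt, as is done for BP_CFG. */
    if ( c->hypervisor )
        return 0;

    /* The MSR doesn't exist on Fam 0xf/0x11. */
    if ( c->family == 0xf || c->family == 0x11 )
        return 0;

    if ( !c->lfence_dispatch )
        new |= DE_CFG_LFENCE_SERIALISE;

    if ( c->zenbleed_chickenbit )
        new |= DE_CFG_ZENBLEED_CHICKEN;

    return new;
}

/* Model of the read-modify-write: only ever ORs bits into the old value. */
uint64_t de_cfg_update(uint64_t old, const struct cpu_state *c)
{
    return old | de_cfg_bits_to_set(c);
}
```

Because the update only ORs in bits derived from state that is the same
on both threads of a core, either thread winning the race yields the
same final value.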

~Andrew



 

