
Re: [RFC PATCH for-4.22 1/2] x86/platform: Expose DTS sensors MSR


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Wed, 29 Oct 2025 16:06:48 +0000
  • Cc: Roger Pau Monné <roger.pau@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx, Teddy Astie <teddy.astie@xxxxxxxxxx>
  • Delivery-date: Wed, 29 Oct 2025 16:07:01 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 28/10/2025 9:20 am, Jan Beulich wrote:
> On 27.10.2025 20:38, Andrew Cooper wrote:
>> On 27/10/2025 5:26 pm, Teddy Astie wrote:
>>> I'm not a fan of doing an inline CPUID check here, but I don't have a
>>> better approach in mind.
>> I'm not sure if there's enough information in leaf 6 to justify putting
>> it fully into the CPUID infrastructure.
>>
>> But, if you do something like this:
>>
>> diff --git a/xen/include/xen/lib/x86/cpu-policy.h b/xen/include/xen/lib/x86/cpu-policy.h
>> index f94f23e159d2..d02fe4d22151 100644
>> --- a/xen/include/xen/lib/x86/cpu-policy.h
>> +++ b/xen/include/xen/lib/x86/cpu-policy.h
>> @@ -121,7 +121,13 @@ struct cpu_policy
>>              uint64_t :64, :64; /* Leaf 0x3 - PSN. */
>>              uint64_t :64, :64; /* Leaf 0x4 - Structured Cache. */
>>              uint64_t :64, :64; /* Leaf 0x5 - MONITOR. */
>> -            uint64_t :64, :64; /* Leaf 0x6 - Therm/Perf. */
>> +
>> +            /* Leaf 0x6 - Thermal and Perf. */
>> +            struct {
>> +                bool /* a */ dts:1;
>> +                uint32_t /* b */:32, /* c */:32, /* d */:32;
>> +            };
>> +
>>              uint64_t :64, :64; /* Leaf 0x7 - Structured Features. */
>>              uint64_t :64, :64; /* Leaf 0x8 - rsvd */
>>              uint64_t :64, :64; /* Leaf 0x9 - DCA */
> Just to mention, below a patch I have pending as part of a series to
> e.g. replace the various CPUID6_* values we presently use. As you did
> indicate when we talked about this, a prereq to then use respective
> bits from host_policy is an adjustment to cpu-policy.c, which is also
> part of that series. If we weren't in freeze right now, I would have
> posted the series already.
>
> Jan
>
> x86/cpu-policy: define bits of leaf 6
>
> ... as far as we presently use them in the codebase.
>
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> ---
> Or should we make both parts proper featureset elements? At least
> APERFMPERF could likely be made visible to guests (in principle).
>
> --- a/xen/include/xen/lib/x86/cpu-policy.h
> +++ b/xen/include/xen/lib/x86/cpu-policy.h
> @@ -128,7 +128,31 @@ struct cpu_policy
>              uint64_t :64, :64; /* Leaf 0x3 - PSN. */
>              uint64_t :64, :64; /* Leaf 0x4 - Structured Cache. */
>              uint64_t :64, :64; /* Leaf 0x5 - MONITOR. */
> -            uint64_t :64, :64; /* Leaf 0x6 - Therm/Perf. */
> +
> +            /* Leaf 0x6 - Therm/Perf. */
> +            struct {
> +                uint32_t /* a */:1,
> +                    turbo:1,
> +                    arat:1,
> +                    :4,
> +                    hwp:1,
> +                    hwp_notification:1,
> +                    hwp_activity_window:1,
> +                    hwp_epp:1,
> +                    hwp_plr:1,
> +                    :1,
> +                    hdc:1,
> +                    :2,
> +                    hwp_peci:1,
> +                    :2,
> +                    hw_feedback:1,
> +                    :12;
> +                uint32_t /* b */:32;
> +                uint32_t /* c */ aperfmperf:1,
> +                    :31;
> +                uint32_t /* d */:32;
> +            } pm;

This works too, although no other leaf in this part of the union uses a
named sub-struct like 'pm'.
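
For readers following along, here is a rough sketch of which register
bits the fields in the quoted patch correspond to.  It is illustrative
only and not part of either patch: struct leaf6_pm, bit() and
decode_leaf6() are hypothetical names, and explicit shifts are used
instead of bitfields so the sketch does not depend on the compiler's
bitfield layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of the CPUID leaf 6 bits named in the patches above. */
struct leaf6_pm {
    bool dts, turbo, arat, hwp, hwp_notification, hwp_activity_window,
         hwp_epp, hwp_plr, hdc, hwp_peci, hw_feedback, aperfmperf;
};

/* Extract a single bit from a raw register value. */
static inline bool bit(uint32_t reg, unsigned int nr)
{
    return (reg >> nr) & 1;
}

/* Decode raw leaf 6 EAX/ECX values, following the bit positions in the patch. */
static struct leaf6_pm decode_leaf6(uint32_t eax, uint32_t ecx)
{
    struct leaf6_pm pm = {
        .dts                 = bit(eax, 0),
        .turbo               = bit(eax, 1),
        .arat                = bit(eax, 2),
        .hwp                 = bit(eax, 7),
        .hwp_notification    = bit(eax, 8),
        .hwp_activity_window = bit(eax, 9),
        .hwp_epp             = bit(eax, 10),
        .hwp_plr             = bit(eax, 11),
        .hdc                 = bit(eax, 13),
        .hwp_peci            = bit(eax, 16),
        .hw_feedback         = bit(eax, 19),
        .aperfmperf          = bit(ecx, 0),
    };

    return pm;
}
```

With this, decode_leaf6(0x85, 0x1) would report dts, arat, hwp and
aperfmperf as set and everything else as clear.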

APERF/MPERF is a disaster of an interface.  It can't safely be read even
in root mode, because an NMI/SMI breaks the algorithm in a way that
isn't easy to spot and retry.  On AMD, it's marginally better because
GIF can be used to exclude NMIs and non-fatal MCEs while sampling the
register pair.

In a VM, it's simply unusable.  Any VMExit, and even a vCPU reschedule,
breaks reading the pair.

Until the CPU vendors produce a way of reading the two counters together
(i.e. atomically, which has been asked for, repeatedly), there's no
point considering it for use in a VM.
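
To make the failure mode concrete, here is a toy model of the sampling
problem.  It is a sketch under simplifying assumptions, not real MSR
access: the counters are plain variables, and struct counters,
advance(), read_pair() and ratio_pct() are hypothetical names.  The
'gap' parameter stands in for whatever (NMI, SMI, VMExit, vCPU
reschedule) lands between the two reads of a sample.

```c
#include <stdint.h>

/* Toy stand-ins for the two counters; real code would read the MSRs. */
struct counters { uint64_t aperf, mperf; };
struct sample   { uint64_t aperf, mperf; };

/*
 * Let 'ref_ticks' of reference time pass while the core runs at 'pct'
 * percent of the reference frequency.
 */
static void advance(struct counters *c, uint64_t ref_ticks, unsigned int pct)
{
    c->mperf += ref_ticks;
    c->aperf += ref_ticks * pct / 100;
}

/*
 * Non-atomic read of the pair.  'gap' reference ticks elapse between the
 * APERF read and the MPERF read, modelling an interruption.
 */
static struct sample read_pair(struct counters *c, uint64_t gap,
                               unsigned int pct)
{
    struct sample s;

    s.aperf = c->aperf;
    advance(c, gap, pct);
    s.mperf = c->mperf;

    return s;
}

/* Effective frequency between two samples, as a percentage of reference. */
static unsigned int ratio_pct(struct sample a, struct sample b)
{
    return (b.aperf - a.aperf) * 100 / (b.mperf - a.mperf);
}
```

In this model, a core running steadily at 50% measures 50 when both
samples are taken with gap = 0, but measures 25 if 1000 ticks elapse
inside the second sample of a 1000-tick window.  Nothing in the sampled
values tells the reader that the second result is wrong, which is why
it cannot easily be spotted and retried.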

~Andrew



 

