[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

RE: [PATCH v4 09/15] xen/x86: introduce a new amd cppc driver for cpufreq scaling


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: "Penny, Zheng" <penny.zheng@xxxxxxx>
  • Date: Wed, 7 May 2025 09:19:13 +0000
  • Accept-language: en-US
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=amd.com; dmarc=pass action=none header.from=amd.com; dkim=pass header.d=amd.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=XelD2B4pXLO/pscd3ozgXouGCQgqJpN16QfhWbtxGSs=; b=rOQRr6d1HibGWh1sFGK1TTkWe3J6Rw2GLvQcnxjRW/XVcWSvsndMRNpq2nsIvP5tjA+fnSiJbpic2K1RKGQpTbnQ6luEmSWTdpHxsD1up0hsqxcCdi7tCvVZRLjxD89t47MHt05IAINJsfObhhAYzENiDtZ2IaqaZgxT04CTiWQvgBpU3LyUB9THFUygZhQuXy3Bepqmr1FGDgajs35KqRmBfyb1BxGzfMmkqMAvcn3I0//tn/EdheV6JbgwlpsaYk8ckcNa+nGcWUjUzvZj9Ecs90p9tDz4fgFkhedoubtWtLzosAR3oN+JvjH78LFU3YMWPVuB64qY0dYE5KI2kA==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=eZuCAr0ZYvR9InAzCG8m/B35zjGHIb5p//tDEY4pBo90CCwtgNIbevVQKwru+ZtTrAYvsL1s+7iWyhJdEOtG8ipQDQjH3dzHc4p28laUdv+aBpA0vEG45TbVN14s44XVopc1Z/Uz9xTAAeX45BlLzx9Iw3eh3P6CKQUrQJU4gY61VUT+F6JuO95NJ0tNxnPCDSvkHm4gFO1XZfFEqSjYA9AeY4Agvzn37F3Rj1lkLH2Ks3UAly8DSYAbu6W7PhHHTLVYPZsAE5yG2RhtjFdXpm63N6K6tWsdMQFYq62WjbAN1hjs/F9NAljLGbcWfkzXztgBsrFTiGqbrE4eQTc1ug==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=amd.com;
  • Cc: "Huang, Ray" <Ray.Huang@xxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Wed, 07 May 2025 09:19:35 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Msip_labels: MSIP_Label_f265efc6-e181-49d6-80f4-fae95cf838a0_ActionId=2e500a0c-1fb0-47c4-85e0-2cb5a0521676;MSIP_Label_f265efc6-e181-49d6-80f4-fae95cf838a0_ContentBits=0;MSIP_Label_f265efc6-e181-49d6-80f4-fae95cf838a0_Enabled=true;MSIP_Label_f265efc6-e181-49d6-80f4-fae95cf838a0_Method=Privileged;MSIP_Label_f265efc6-e181-49d6-80f4-fae95cf838a0_Name=Open Source;MSIP_Label_f265efc6-e181-49d6-80f4-fae95cf838a0_SetDate=2025-05-07T09:19:04Z;MSIP_Label_f265efc6-e181-49d6-80f4-fae95cf838a0_SiteId=3dd8961f-e488-4e60-8e11-a82d994e183d;MSIP_Label_f265efc6-e181-49d6-80f4-fae95cf838a0_Tag=10, 0, 1, 1;
  • Thread-index: AQHbrRCsgX0DHfsxBEGtMOCcDZZeSLO6zAoAgAwkvoA=
  • Thread-topic: [PATCH v4 09/15] xen/x86: introduce a new amd cppc driver for cpufreq scaling

[Public]

Hi,

> -----Original Message-----
> From: Jan Beulich <jbeulich@xxxxxxxx>
> Sent: Tuesday, April 29, 2025 10:29 PM
> To: Penny, Zheng <penny.zheng@xxxxxxx>
> Cc: Huang, Ray <Ray.Huang@xxxxxxx>; Andrew Cooper
> <andrew.cooper3@xxxxxxxxxx>; Roger Pau Monné <roger.pau@xxxxxxxxxx>; xen-
> devel@xxxxxxxxxxxxxxxxxxxx
> Subject: Re: [PATCH v4 09/15] xen/x86: introduce a new amd cppc driver for
> cpufreq scaling
>
> On 14.04.2025 09:40, Penny Zheng wrote:
> > --- a/xen/arch/x86/acpi/cpufreq/amd-cppc.c
> > +++ b/xen/arch/x86/acpi/cpufreq/amd-cppc.c
> > +/*
> > + * If CPPC lowest_freq and nominal_freq registers are exposed then we
> > +can
> > + * use them to convert perf to freq and vice versa. The conversion is
> > + * extrapolated as an linear function passing by the 2 points:
> > + *  - (Low perf, Low freq)
> > + *  - (Nominal perf, Nominal freq)
> > + */
> > +static int amd_cppc_khz_to_perf(const struct amd_cppc_drv_data *data,
> > +                                unsigned int freq, uint8_t *perf) {
> > +    const struct xen_processor_cppc *cppc_data = data->cppc_data;
> > +    uint64_t mul, div;
> > +    int offset = 0, res;
> > +
> > +    if ( freq == (cppc_data->cpc.nominal_mhz * 1000) )
>
> I'm pretty sure I commented on this before: The expression here _suggests_ 
> that
> "freq" is in kHz, but that's not being made explicit anywhere.
>

Sorry, I may overlook, and I'll be more careful.
I have clarified it in the function title, and maybe it's not enough. I'll 
change the parameter
name from "freq" to "freq_khz" to be more explicit.

> > +    {
> > +        *perf = data->caps.nominal_perf;
> > +        return 0;
> > +    }
> > +
> > +    if ( freq == (cppc_data->cpc.lowest_mhz * 1000) )
> > +    {
> > +        *perf = data->caps.lowest_perf;
> > +        return 0;
> > +    }
>
> How likely is it that these two early return paths are taken, when the 
> incoming
> "freq" is 25 or 5 MHz granular? IOW - is it relevant to shortcut these two 
> cases?
>

Sorry, I may not understand what you mean here.
Answering " How likely is it that these two early return paths are taken "
It's rare ig.... maybe *ondemand* governor will frequently give frequency 
around nominal frequency,
but the exact value is rare ig.
I'm confused about " when the incoming  "freq" is 25 or 5 MHz granular ".
Are we talking about is it worthy to have these two early return paths 
considering it is rarely taken

> > +    if ( cppc_data->cpc.lowest_mhz && cppc_data->cpc.nominal_mhz &&
> > +         cppc_data->cpc.lowest_mhz < cppc_data->cpc.nominal_mhz )
>
> Along the lines of a comment on an earlier patch, the middle part of the 
> condition
> here is redundant with the 3rd one. Also, don't you check this relation 
> already
> during init? IOW isn't it the 3rd part which can be dropped?
>

Yes, you're right. I've checked it in set_cppc_pminfo() already and only gave 
warnings there.
I shall delete the check here, and besides giving warning message during init, 
if values are
invalid, instead of storing invalid values, we shall set 
cppc_data->cpc.lowest_mhz / cppc_data->cpc.nominal_mhz them
zero... Then wherever we are trying to use them, like here, non-zero values 
ensures valid values.

> > +static int amd_get_max_freq(const struct amd_cppc_drv_data *data,
> > +                            unsigned int *max_freq) {
> > +    unsigned int nom_freq = 0, boost_ratio;
> > +    int res;
> > +
> > +    res = amd_get_lowest_or_nominal_freq(data,
> > +                                         data->cppc_data->cpc.nominal_mhz,
> > +                                         data->caps.nominal_perf,
> > +                                         &nom_freq);
> > +    if ( res )
> > +        return res;
> > +
> > +    boost_ratio = (unsigned int)(data->caps.highest_perf /
> > +                                 data->caps.nominal_perf);
>
> I may have seen logic ensuring nominal_perf isn't 0, so that part may be 
> fine. What
> guarantees this division to yield a positive value, though?
> If it yields zero (say 0xff / 0x80), ...
>

I think maybe you were saying 0x80/0xff to yield zero value. For that, we 
checked that highest_perf
must not be smaller than nominal_perf during init, see ...

> > +    *max_freq = nom_freq * boost_ratio;
>
> ... zero will be reported back here. I think you want to scale the 
> calculations here to
> avoid such.
>
> > +static void cf_check amd_cppc_init_msrs(void *info) {
> > +    struct cpufreq_policy *policy = info;
> > +    struct amd_cppc_drv_data *data = this_cpu(amd_cppc_drv_data);
> > +    uint64_t val;
> > +    unsigned int min_freq = 0, nominal_freq = 0, max_freq;
> > +
> > +    /* Package level MSR */
> > +    rdmsrl(MSR_AMD_CPPC_ENABLE, val);
> > +    /*
> > +     * Only when Enable bit is on, the hardware will calculate the 
> > processor’s
> > +     * performance capabilities and initialize the performance level 
> > fields in
> > +     * the CPPC capability registers.
> > +     */
> > +    if ( !(val & AMD_CPPC_ENABLE) )
> > +    {
> > +        val |= AMD_CPPC_ENABLE;
> > +        wrmsrl(MSR_AMD_CPPC_ENABLE, val);
> > +    }
> > +
> > +    rdmsrl(MSR_AMD_CPPC_CAP1, data->caps.raw);
> > +
> > +    if ( data->caps.highest_perf == 0 || data->caps.lowest_perf == 0 ||
> > +         data->caps.nominal_perf == 0 || data->caps.lowest_nonlinear_perf 
> > == 0 ||
> > +         data->caps.lowest_perf > data->caps.lowest_nonlinear_perf ||
>
> Same question as asked elsewhere - where is this relation specified?
>
> > +         data->caps.lowest_nonlinear_perf > data->caps.nominal_perf ||
> > +         data->caps.nominal_perf > data->caps.highest_perf )

here ...

>
> Jan

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.