[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] xen/cpufreq: Reset policy after enabling/disabling turbo status


  • To: Andrew Cooper <amc96@xxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Mon, 15 Nov 2021 17:21:34 +0100
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=zQaKEobEC1SD5fnmPFlVOyw6Li6YOZJ6uXH4bR60K1w=; b=Tp7dUR8Rgpu399Pmv5hE4Iq76k4OPhgSDLLMyhljhbX93dicb7Ww8km2qQUd+JSwA6dJy1FLXL3b1xhAegCbH9GmS6EyY7TYUzLV8xHLanJaMpMtlHR/Xemi/OeH+dOhjhfCjsad6zXNTGzr1BuN6eZcBiQof6hEWOJIa27RsQFU05dqO/q1Xtc5dOWJfy/2nNFRE+3r8hLG9E4MXqjgdweWBI9D0BIJE75nNItaGIukG+aQaaU+agLWOYK/G6n76lY2QCZ082Y1YxsBHULoDoOnGbPWL/tiydnmNku7t5sDRh4P6cpF2ssAb8CbrhbgKJSLf7+9VAwWT4B+Aq/rHA==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=hsNf5qBfRiIj4LlaFcTgGaBtkWGw/jjiX2OtZYysoZxF0vY6Y90amp1kSsZFVn5wTDQhYixMVFPT1HgNzg4GHwPhr+cmBuca0kR0OYftfcu0GWr3AEt9uYHAiNSPN/Z3/CK/W2agPia7wD08BZU0syRyRb9wUTtEu22xIhA96yPgHLyZsUrmMJygRphNpqt9VZFwf7UCKJBxVFKZdxxn+uriZvRvaLdjr7w4FSBufd8NsxModKi3PJjVBaSURjVsHR1WlK0HJ6t72nfe2Yz517/x1BjixnMP1GuRPscnrtUNtu2kyc2gSh5kS974LnNasjmilEC5uINOldO9CYlLIw==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=suse.com;
  • Cc: Ian Jackson <iwj@xxxxxxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Jane Malalane <jane.malalane@xxxxxxxxxx>
  • Delivery-date: Mon, 15 Nov 2021 16:21:49 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 15.11.2021 15:33, Andrew Cooper wrote:
> On 15/11/2021 10:53, Jan Beulich wrote:
>> On 12.11.2021 19:51, Andrew Cooper wrote:
>>> On 10/11/2021 09:19, Jane Malalane wrote:
>>>> Before, user would change turbo status but this had no effect: boolean
>>>> was set but policy wasn't reevaluated.  Policy must be reevaluated so
>>>> that CPU frequency is chosen according to the turbo status and the
>>>> current governor.
>>>>
>>>> Therefore, add __cpufreq_governor() in cpufreq_update_turbo().
>>>>
>>>> Reported-by: <edvin.torok@xxxxxxxxxx>
>>>> Signed-off-by: <jane.malalane@xxxxxxxxxx>
>>>> ---
>>>> CC: Jan Beulich <jbeulich@xxxxxxxx>
>>>> CC: Ian Jackson <iwj@xxxxxxxxxxxxxx>
>>>> ---
>>>>
>>>> Release rationale:
>>>> Not taking this patch means that turbo status is misleading.
>>>>
>>>> Taking this patch is low-risk as it only uses a function that already
>>>> exists and is already used for setting the chosen scaling governor.
>>>> Essentially, this change is equivalent to running 'xenpm
>>>> en/disable-turbo-mode' and, subsequently, running 'xenpm
>>>> set-scaling-governor <current governor>'.
>>>> ---
>>>>  xen/drivers/cpufreq/utility.c | 6 +++++-
>>>>  1 file changed, 5 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/xen/drivers/cpufreq/utility.c b/xen/drivers/cpufreq/utility.c
>>>> index b93895d4dd..5f200ff3ee 100644
>>>> --- a/xen/drivers/cpufreq/utility.c
>>>> +++ b/xen/drivers/cpufreq/utility.c
>>>> @@ -417,10 +417,14 @@ int cpufreq_update_turbo(int cpuid, int new_state)
>>>>      {
>>>>          ret = cpufreq_driver.update(cpuid, policy);
>>>>          if (ret)
>>>> +        {
>>>>              policy->turbo = curr_state;
>>>> +            return ret;
>>>> +        }
>>>>      }
>>>>  
>>>> -    return ret;
>>>> +    /* Reevaluate current CPU policy. */
>>>> +    return __cpufreq_governor(policy, CPUFREQ_GOV_LIMITS);
>>>>  }
>>> So, having looked through the manual, what the cpufreq_driver does for
>>> turbo on Intel is bogus according to the SDM.
>>>
>>> There is a non-architectrual dance with IA32_MISC_ENABLE bit 38 (per
>>> package) for firmware to configure turbo, but it manifests as another
>>> dynamic CPUID bit (which I think we handle correctly).  It has the same
>>> semantics as XD_DISABLE and CPUID_LIMIT so we might want to consider
>>> adding it to the set of bits we clear during boot.
>> This is quite confusing in the docs - at least one of the tables calls
>> the bit "IDA Disable", while other entries at least also refer to the
>> effect of disabling IDA. I'm afraid I have trouble connecting turbo
>> mode and IDA disabling, unless both are two different names of the
>> same thing. Maybe they really are, except that SDM vol 2 uses yet
>> another slightly different term for CPUID[6].EAX[1]: "Intel Turbo Boost
>> Technology".
> 
> SDM Vol3 14.3 uses IDA and turbo interchangeably in some cases.  It
> reads as if IDA is the general technology name, and turbo is a sub-mode
> for "doing it automatically without software involvement".
> 
> On CPUs which support IDA, the CPUID bit is !MISC_ENABLE[38], with
> further adds to the confusion of which is which.
> 
>>> However, the correct way to turn off turbo is to set
>>> IA32_PERF_CTL.TURBO_DISENGAGE bit, which is per logical processor.  As
>>> such, it *should* behave far more like the AMD CPB path.
>> I'm afraid public documentation has no mention of a bit of this name.
>> Considering the above I wonder whether you mean "IDA engage" (bit 32),
>> albeit this doesn't seem very likely when you're taking about a
>> "disengage" bit.
> 
> It is that bit.  Despite the name given in Vol4 Table 2-12, it is "set
> to disable".
> 
> Vol3 Figure 14-2 explicitly calls it the "IDA/Turbo disengage" bit and
> the surrounding text makes it clear that it is disable bit, not an
> enable bit.  Also, that's how the Linux intel_pstate driver uses it.

Okay, then I agree with the proposal you've made.

Jan

>>  If it is, we'd still need to cope with it being
>> unavailable: While as per the doc it exists from Merom onwards, i.e.
>> just far enough back for all 64-bit capable processors to be covered,
>> at least there it is attributed "Mobile only".
> 
> Honestly, the number of errors in those tables are alarming.  The MSR is
> has been around longer than turbo, and I'm told that the interface has
> never changed since its introduction.
> 
> Looking across Linux, there's a mess too.
> 
> acpi-cpufreq and energy_perf_policy use the MISC_ENABLE bit (which is
> package wide)
> intel_ips driver uses PERF_CTL (which is logical processor)
> intel_pstate uses MISC_ENABLE || max_pstate == turbo_pstate
> 
> I'm certain I've seen "one pstate being special" WRT turbo before, but I
> can't locate anything about this in the SDM, which possibly means it is
> model specific.
> 
> ~Andrew
> 




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.