[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 5/5] x86/mwait-idle: adjust the SKX C6 parameters if PC6 is disabled


  • To: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Tue, 18 Jan 2022 15:11:28 +0100
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=EoPkVXf6WvDD8510zSmkNj5376jmeCzr0LeqNmW1j0k=; b=fwYyvQ/m+XAYhpgzjOquowdPWBWQTdtzviOxtTiqkRa978VPh8fDiPK6rSorU1+yIs8YN9Mehd2/rPPu9gqXZsiv+uLIAaBF82r9XkkyrRpuvcGwb7pDkVCtqZChBqPdlRmvGyKeyBd2eBmTSGYpMeDYgcJ7S9XZGE3uAhZKKCGnaQJXEGod4OGBgF76JOilYbbnGOl/0Tj0bdzB6GmWzGIX74dBP2ch88yuEf0QhMlZyaXXQVZouqM+8WXAP8gcnn1dMSetnXV9J+3DXJQwfrhvrsro1xdwBbwaR71MobBL4GsefO+eGB4UPUZ+GoJN/Xglephr4fZKbkB99A87sA==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=IhOGbIhfaLecuRUY93coZAL8ylVAJ2GzAG4l0OLAu+mrRJsC20rEPC/dNVtHmGS0KtOLXD0ZU9NDkbwpPzOOtEQvKKQHAIcSdmedLMPhM5j+cN/1xHafgWEwhQ3S9cce/hMwaGx6CujHIDR8xU+LjlcqtlwFkQ0kOXepI+PXhAEv01Oeu/Hr6MJoCqb8K7eUM6UJ2i/ghzOnHYNlg2xf2H28lyJtyPD8mR9OG1hk4oXMjZgeaRIRE4suVanUjyGTzEXVOWPUmiz8XGRCOnnUrSM2DpJaqKT2VxMF1xQKUuHqJ2mD1ObdcU2+AddD9GAANmLqq+rJxI8EUAuSVmIn6Q==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=suse.com;
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>
  • Delivery-date: Tue, 18 Jan 2022 14:11:41 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 18.01.2022 11:48, Roger Pau Monné wrote:
> On Mon, Sep 06, 2021 at 03:02:12PM +0200, Jan Beulich wrote:
>> From: Chen Yu <yu.c.chen@xxxxxxxxx>
>>
>> Because cpuidle assumes worst-case C-state parameters, PC6 parameters
>> are used for describing C6, which is worst-case for requesting CC6.
>> When PC6 is enabled, this is appropriate. But if PC6 is disabled
>> in the BIOS, the exit latency and target residency should be adjusted
>> accordingly.
>>
>> Exit latency:
>> Previously the C6 exit latency was measured as the PC6 exit latency.
>> With PC6 disabled, the C6 exit latency should be the one of CC6.
>>
>> Target residency:
>> With PC6 disabled, the idle duration within [CC6, PC6) would make the
>> idle governor choose C1E over C6. This would cause low energy-efficiency.
>> We should lower the bar to request C6 when PC6 is disabled.
>>
>> To fill this gap, check if PC6 is disabled in the BIOS in the
>> MSR_PKG_CST_CONFIG_CONTROL(0xe2) register. If so, use the CC6 exit latency
>> for C6 and set target_residency to 3 times of the new exit latency. [This
>> is consistent with how intel_idle driver uses _CST to calculate the
>> target_residency.] As a result, the OS would be more likely to choose C6
>> over C1E when PC6 is disabled, which is reasonable, because if C6 is
>> enabled, it implies that the user cares about energy, so choosing C6 more
>> frequently makes sense.
>>
>> The new CC6 exit latency of 92us was measured with wult[1] on SKX via NIC
>> wakeup as the 99.99th percentile. Also CLX and CPX both have the same CPU
>> model number as SkX, but their CC6 exit latencies are similar to the SKX
>> one, 96us and 89us respectively, so reuse the SKX value for them.
>>
>> There is a concern that it might be better to use a more generic approach
>> instead of optimizing every platform. However, if the required code
>> complexity and different PC6 bit interpretation on different platforms
>> are taken into account, tuning the code per platform seems to be an
>> acceptable tradeoff.
>>
>> Link: 
>> https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fintel.github.io%2Fwult%2F&amp;data=04%7C01%7Croger.pau%40citrix.com%7Ccdf115e71eb14429868408d97136a902%7C335836de42ef43a2b145348c2ee9ca5b%7C0%7C0%7C637665301851513469%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=D9sIoe%2BFAvia3OuRsT19VAxkTqrlnPHnKqTSKiW6pRM%3D&amp;reserved=0
>>  # [1]
>> Suggested-by: Len Brown <len.brown@xxxxxxxxx>
>> Signed-off-by: Chen Yu <yu.c.chen@xxxxxxxxx>
>> Reviewed-by: Artem Bityutskiy <artem.bityutskiy@xxxxxxxxxxxxxxx>
>> [ rjw: Subject and changelog edits ]
>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
>> [Linux commit: 64233338499126c5c31e07165735ab5441c7e45a]
>>
>> Pull in Linux'es MSR_PKG_CST_CONFIG_CONTROL. Alongside the dropping of
>> "const" from skx_cstates[] add __read_mostly, and extend that to other
>> similar non-const tables.
>>
>> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> 
> Acked-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>

Thanks.

>> @@ -1161,6 +1161,36 @@ static void __init sklh_idle_state_table
>>  }
>>  
>>  /*
>> + * skx_idle_state_table_update - Adjust the Sky Lake/Cascade Lake
>> + * idle states table.
>> + */
>> +static void __init skx_idle_state_table_update(void)
>> +{
>> +    unsigned long long msr;
>> +
>> +    rdmsrl(MSR_PKG_CST_CONFIG_CONTROL, msr);
>> +
>> +    /*
>> +     * 000b: C0/C1 (no package C-state support)
>> +     * 001b: C2
>> +     * 010b: C6 (non-retention)
>> +     * 011b: C6 (retention)
>> +     * 111b: No Package C state limits.
>> +     */
>> +    if ((msr & 0x7) < 2) {
> 
> I wouldn't mind adding this mask field to msr-index.h.

This wouldn't buy us much since - as per the original Linux change - the
manifest constant then still wouldn't be used here.

Jan




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.