[PATCH 3/4] mwait-idle: add 'preferred_cstates' module argument


  • To: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Tue, 26 Apr 2022 12:05:28 +0200
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Delivery-date: Tue, 26 Apr 2022 10:05:35 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

From: Artem Bityutskiy <artem.bityutskiy@xxxxxxxxxxxxxxx>

On Sapphire Rapids Xeon (SPR) the C1 and C1E states are basically mutually
exclusive - only one of them can be enabled. By default, 'intel_idle' driver
enables C1 and disables C1E. However, some users prefer to use C1E instead of
C1, because it saves more energy.

This patch adds a new module parameter ('preferred_cstates') for enabling C1E
and disabling C1. Here is the idea behind it.

1. This option has effect only for "mutually exclusive" C-states like C1 and
   C1E on SPR.
2. It does not have any effect on independent C-states, which do not require
   other C-states to be disabled (most states on most platforms as of today).
3. For mutually exclusive C-states, the 'intel_idle' driver always has a
   reasonable default, such as enabling C1 on SPR by default. On other
   platforms, the default may be different.
4. Users can override the default using the 'preferred_cstates' parameter.
5. The parameter accepts the preferred C-states bit-mask, similarly to the
   existing 'states_off' parameter.
6. This parameter is not limited to C1/C1E, and leaves room for supporting
   other mutually exclusive C-states, if they come in the future.

Today 'intel_idle' can only be compiled-in, which means that on SPR, in order
to disable C1 and enable C1E, users should boot with the following kernel
argument: intel_idle.preferred_cstates=4
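
To illustrate how the bit-mask is interpreted on SPR, here is a minimal
standalone sketch of the decision made by spr_idle_state_table_update() in
the diff below: bit 1 stands for C1 and bit 2 for C1E, so a value of 4
selects C1E, while a value of 6 names both and therefore leaves the default
(C1) in place. The names CSTATE_BIT() and prefer_c1e() are hypothetical and
exist only for this illustration; they are not part of the patch.

```c
#include <stdbool.h>

/* Hypothetical helper for this sketch: bit n of the mask stands for C-state n. */
#define CSTATE_BIT(n) (1u << (n))

/*
 * Return true when the mask asks for C1E instead of C1.
 * Mirrors the checks in spr_idle_state_table_update(): C1E (bit 2) must be
 * set and C1 (bit 1) must be clear; if both are set the request is
 * contradictory and the default (C1) is kept.
 */
static bool prefer_c1e(unsigned int preferred_mask)
{
    if (!(preferred_mask & CSTATE_BIT(2)))
        return false;   /* C1E not requested: keep the default */

    if (preferred_mask & CSTATE_BIT(1))
        return false;   /* both requested: stick to the default */

    return true;        /* only C1E requested */
}
```

With this helper, prefer_c1e(4) is true, while prefer_c1e(0), prefer_c1e(2)
and prefer_c1e(6) are all false.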

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@xxxxxxxxxxxxxxx>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
Origin: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git da0e58c038e6

Enable C1E (if requested) not only on the BSP's socket / package.

Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>

--- unstable.orig/docs/misc/xen-command-line.pandoc     2022-04-25 17:59:42.123387258 +0200
+++ unstable/docs/misc/xen-command-line.pandoc  2022-04-25 17:36:00.000000000 +0200
@@ -1884,6 +1884,12 @@ paging controls access to usermode addre
 ### ple_window (Intel)
 > `= <integer>`
 
+### preferred-cstates (x86)
+> `= <integer>`
+
+This is a mask of C-states which are to be used preferably.  This option is
+applicable only on hardware where certain C-states are exclusive of one another.
+
 ### psr (Intel)
 > `= List of ( cmt:<boolean> | rmid_max:<integer> | cat:<boolean> | 
 > cos_max:<integer> | cdp:<boolean> )`
 
--- unstable.orig/xen/arch/x86/cpu/mwait-idle.c 2022-04-25 17:17:05.000000000 +0200
+++ unstable/xen/arch/x86/cpu/mwait-idle.c      2022-04-25 17:33:47.000000000 +0200
@@ -82,6 +82,18 @@ boolean_param("mwait-idle", opt_mwait_id
 
 static unsigned int mwait_substates;
 
+/*
+ * Some platforms come with mutually exclusive C-states, so that if one is
+ * enabled, the other C-states must not be used. Example: C1 and C1E on
+ * Sapphire Rapids platform. This parameter allows for selecting the
+ * preferred C-states among the groups of mutually exclusive C-states - the
+ * selected C-states will be registered, the other C-states from the mutually
+ * exclusive group won't be registered. If the platform has no mutually
+ * exclusive C-states, this parameter has no effect.
+ */
+static unsigned int __ro_after_init preferred_states_mask;
+integer_param("preferred-cstates", preferred_states_mask);
+
 #define LAPIC_TIMER_ALWAYS_RELIABLE 0xFFFFFFFF
 /* Reliable LAPIC Timer States, bit 1 for C1 etc. Default to only C1. */
 static unsigned int lapic_timer_reliable_states = (1 << 1);
@@ -96,6 +108,7 @@ struct idle_cpu {
        unsigned long auto_demotion_disable_flags;
        bool byt_auto_demotion_disable_flag;
        bool disable_promotion_to_c1e;
+       bool enable_promotion_to_c1e;
 };
 
 static const struct idle_cpu *icpu;
@@ -924,6 +937,15 @@ static void cf_check byt_auto_demotion_d
        wrmsrl(MSR_MC6_DEMOTION_POLICY_CONFIG, 0);
 }
 
+static void cf_check c1e_promotion_enable(void *dummy)
+{
+       uint64_t msr_bits;
+
+       rdmsrl(MSR_IA32_POWER_CTL, msr_bits);
+       msr_bits |= 0x2;
+       wrmsrl(MSR_IA32_POWER_CTL, msr_bits);
+}
+
 static void cf_check c1e_promotion_disable(void *dummy)
 {
        u64 msr_bits;
@@ -1241,6 +1263,26 @@ static void __init skx_idle_state_table_
 }
 
 /*
+ * spr_idle_state_table_update - Adjust Sapphire Rapids idle states table.
+ */
+static void __init spr_idle_state_table_update(void)
+{
+       /* Check if user prefers C1E over C1. */
+       if (preferred_states_mask & BIT(2, U)) {
+               if (preferred_states_mask & BIT(1, U))
+                       /* Both can't be enabled, stick to the defaults. */
+                       return;
+
+               spr_cstates[0].flags |= CPUIDLE_FLAG_DISABLED;
+               spr_cstates[1].flags &= ~CPUIDLE_FLAG_DISABLED;
+
+               /* Request enabling C1E using the "C1E promotion" bit. */
+               idle_cpu_spr.disable_promotion_to_c1e = false;
+               idle_cpu_spr.enable_promotion_to_c1e = true;
+       }
+}
+
+/*
  * mwait_idle_state_table_update()
  *
  * Update the default state_table for this CPU-id
@@ -1261,6 +1303,9 @@ static void __init mwait_idle_state_tabl
        case INTEL_FAM6_SKYLAKE_X:
                skx_idle_state_table_update();
                break;
+       case INTEL_FAM6_SAPPHIRERAPIDS_X:
+               spr_idle_state_table_update();
+               break;
        }
 }
 
@@ -1402,6 +1447,8 @@ static int cf_check mwait_idle_cpu_init(
 
        if (icpu->disable_promotion_to_c1e)
                on_selected_cpus(cpumask_of(cpu), c1e_promotion_disable, NULL, 1);
+       else if (icpu->enable_promotion_to_c1e)
+               on_selected_cpus(cpumask_of(cpu), c1e_promotion_enable, NULL, 1);
 
        return NOTIFY_DONE;
 }