[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v6 19/19] xen/cpufreq: Adapt SET/GET_CPUFREQ_CPPC xen_sysctl_pm_op for amd-cppc driver
On 11.07.2025 05:51, Penny Zheng wrote: > Introduce helper set_amd_cppc_para() and get_amd_cppc_para() to > SET/GET CPPC-related para for amd-cppc/amd-cppc-epp driver. > > In get_cpufreq_cppc()/set_cpufreq_cppc(), we include > "processor_pminfo[cpuid]->init & XEN_CPPC_INIT" condition check to deal with > cpufreq driver in amd-cppc. > > Also, a new field "policy" has also been added in "struct xen_get_cppc_para" > to describe performance policy in active mode. It gets printed with other > cppc paras. Move manifest constants "XEN_CPUFREQ_POLICY_xxx" to public header > to let it be used in user space tools. Also add a new anchor > "XEN_CPUFREQ_POLICY_xxx" for array overrun check. If only they indeed had XEN_ prefixes. > Signed-off-by: Penny Zheng <Penny.Zheng@xxxxxxx> > --- > v1 -> v2: > - Give the variable des_perf an initializer of 0 > - Use the strncmp()s directly in the if() > --- > v3 -> v4 > - refactor comments > - remove double blank lines > - replace amd_cppc_in_use flag with XEN_PROCESSOR_PM_CPPC > --- > v4 -> v5: > - add new field "policy" in "struct xen_cppc_para" > - add new performamce policy XEN_CPUFREQ_POLICY_BALANCE > - drop string comparisons with "processor_pminfo[cpuid]->init & XEN_CPPC_INIT" > and "cpufreq.setpolicy == NULL" > - Blank line ahead of the main "return" of a function > - refactor comments, commit message and title > --- > v5 -> v6: > - remove duplicated manifest constants, and just move it to public header > - use "else if" to avoid confusion that it looks as if both paths could be > taken > - add check for legitimate perf values > - use "unknown" instead of "none" > - introduce "CPUFREQ_POLICY_END" for array overrun check in user space tools Please don't; use ARRAY_SIZE() (and if necessary further checking) instead. In fact I think ... > --- a/tools/misc/xenpm.c > +++ b/tools/misc/xenpm.c > @@ -38,6 +38,13 @@ > static xc_interface *xc_handle; > static unsigned int max_cpu_nr; > > +static const char cpufreq_policy_str[][12] = { > + [CPUFREQ_POLICY_UNKNOWN] = "unknown", > + [CPUFREQ_POLICY_POWERSAVE] = "powersave", > + [CPUFREQ_POLICY_PERFORMANCE] = "performance", > + [CPUFREQ_POLICY_ONDEMAND] = "ondemand", > +}; > + > /* help message */ > void show_help(void) > { > @@ -825,6 +832,10 @@ static void print_cppc_para(unsigned int cpuid, > printf(" : desired [%"PRIu32"%s]\n", > cppc->desired, > cppc->desired ? "" : " hw autonomous"); > + > + if ( cppc->policy < CPUFREQ_POLICY_END ) > + printf(" performance policy : %s\n", > + cpufreq_policy_str[cppc->policy]); ... you would want to print "unknown" in all other cases as well. It's not clear to me though how the printing is avoided for passive mode. > --- a/xen/arch/x86/acpi/cpufreq/amd-cppc.c > +++ b/xen/arch/x86/acpi/cpufreq/amd-cppc.c > @@ -561,6 +561,186 @@ static int cf_check amd_cppc_epp_set_policy(struct > cpufreq_policy *policy) > return 0; > } > > +int get_amd_cppc_para(const struct cpufreq_policy *policy, > + struct xen_get_cppc_para *cppc_para) > +{ > + const struct amd_cppc_drv_data *data = per_cpu(amd_cppc_drv_data, > + policy->cpu); > + > + if ( data == NULL ) > + return -ENODATA; > + > + cppc_para->policy = policy->policy; > + cppc_para->lowest = data->caps.lowest_perf; > + cppc_para->lowest_nonlinear = data->caps.lowest_nonlinear_perf; > + cppc_para->nominal = data->caps.nominal_perf; > + cppc_para->highest = data->caps.highest_perf; > + cppc_para->minimum = data->req.min_perf; > + cppc_para->maximum = data->req.max_perf; > + cppc_para->desired = data->req.des_perf; > + cppc_para->energy_perf = data->req.epp; > + > + return 0; > +} > + > +int set_amd_cppc_para(struct cpufreq_policy *policy, > + const struct xen_set_cppc_para *set_cppc) > +{ > + unsigned int cpu = policy->cpu; > + struct amd_cppc_drv_data *data = per_cpu(amd_cppc_drv_data, cpu); > + uint8_t max_perf, min_perf, des_perf = 0, epp; > + bool active_mode; > + > + if ( data == NULL ) > + return -ENOENT; > + > + if ( cpufreq_is_governorless(cpu) ) > + active_mode = true; Without "else" the variable will be left uninitialized. I'm surprised the compiler allowed you to get away with this. Why is the function call not simply the variable's initializer? > + /* Only allow values if params bit is set. */ > + if ( (!(set_cppc->set_params & XEN_SYSCTL_CPPC_SET_DESIRED) && > + set_cppc->desired) || > + (!(set_cppc->set_params & XEN_SYSCTL_CPPC_SET_MINIMUM) && > + set_cppc->minimum) || > + (!(set_cppc->set_params & XEN_SYSCTL_CPPC_SET_MAXIMUM) && > + set_cppc->maximum) || > + (!(set_cppc->set_params & XEN_SYSCTL_CPPC_SET_ENERGY_PERF) && > + set_cppc->energy_perf) ) > + return -EINVAL; > + > + /* > + * Validate all parameters > + * Maximum performance may be set to any performance value in the range > + * [Nonlinear Lowest Performance, Highest Performance], inclusive. > + */ > + if ( set_cppc->set_params & XEN_SYSCTL_CPPC_SET_MAXIMUM && Nit: Missing parentheses again. Which is particularly odd since in the immediately preceding if() they're there. More of this further down. > + (set_cppc->maximum > data->caps.highest_perf || > + set_cppc->maximum < data->caps.lowest_nonlinear_perf) ) > + return -EINVAL; > + /* > + * Minimum performance may be set to any performance value in the range > + * [Nonlinear Lowest Performance, Highest Performance], inclusive but > must > + * be set to a value that is less than or equal to Maximum Performance. > + */ > + if ( set_cppc->set_params & XEN_SYSCTL_CPPC_SET_MINIMUM && > + (set_cppc->minimum < data->caps.lowest_nonlinear_perf || > + (set_cppc->set_params & XEN_SYSCTL_CPPC_SET_MAXIMUM && > + set_cppc->minimum > set_cppc->maximum) || > + (!(set_cppc->set_params & XEN_SYSCTL_CPPC_SET_MAXIMUM) && Hmm, I find this confusing to read, and was first thinking the ! was wrong here. Imo such is better expressed with the conditional operator: set_cppc->minimum > (set_cppc->set_params & XEN_SYSCTL_CPPC_SET_MAXIMUM ? set_cppc->maximum : data->req.max_perf) Which also makes it easier to spot that here you use data->req, when in the minimum check you use data->caps. Why this difference? > + set_cppc->minimum > data->req.max_perf)) ) > + return -EINVAL; > + /* > + * Desired performance may be set to any performance value in the range > + * [Minimum Performance, Maximum Performance], inclusive. > + */ > + if ( set_cppc->set_params & XEN_SYSCTL_CPPC_SET_DESIRED ) > + { > + if ( active_mode ) > + return -EOPNOTSUPP; > + > + if ( (set_cppc->set_params & XEN_SYSCTL_CPPC_SET_MAXIMUM && > + set_cppc->desired > set_cppc->maximum) || > + (set_cppc->set_params & XEN_SYSCTL_CPPC_SET_MINIMUM && > + set_cppc->desired < set_cppc->minimum) || > + (!(set_cppc->set_params & XEN_SYSCTL_CPPC_SET_MAXIMUM) && > + set_cppc->desired > data->req.max_perf) || > + (!(set_cppc->set_params & XEN_SYSCTL_CPPC_SET_MINIMUM) && > + set_cppc->desired < data->req.min_perf) ) > + return -EINVAL; All of the above applies here as well. > + } > + /* > + * Energy Performance Preference may be set with a range of values > + * from 0 to 0xFF > + */ > + if ( set_cppc->set_params & XEN_SYSCTL_CPPC_SET_ENERGY_PERF ) > + { > + if ( !active_mode ) > + return -EOPNOTSUPP; > + > + if ( set_cppc->energy_perf > UINT_MAX ) > + return -EINVAL; > + } > + > + /* Activity window not supported in MSR */ > + if ( set_cppc->set_params & XEN_SYSCTL_CPPC_SET_ACT_WINDOW ) > + return -EOPNOTSUPP; > + > + /* Return if there is nothing to do. */ > + if ( set_cppc->set_params == 0 ) > + return 0; Why did this move so far down (I think it was sitting earlier)? The earlier 5 if()s are all needlessly carried out in this case (unless the compiler can reason about moving this check ahead). > + epp = per_cpu(epp_init, cpu); > + /* > + * Apply presets: > + * XEN_SYSCTL_CPPC_SET_PRESET_POWERSAVE/PERFORMANCE/ONDEMAND are > + * only available when CPPC in active mode > + */ > + switch ( set_cppc->set_params & XEN_SYSCTL_CPPC_SET_PRESET_MASK ) > + { > + case XEN_SYSCTL_CPPC_SET_PRESET_POWERSAVE: > + if ( !active_mode ) > + return -EINVAL; > + policy->policy = CPUFREQ_POLICY_POWERSAVE; > + min_perf = data->caps.lowest_nonlinear_perf; > + /* > + * Lower max_perf to nonlinear_lowest to achieve > + * ultmost power saviongs > + */ > + max_perf = data->caps.lowest_nonlinear_perf; Why not use the shorter "min_perf" here? > + epp = CPPC_ENERGY_PERF_MAX_POWERSAVE; > + break; > + > + case XEN_SYSCTL_CPPC_SET_PRESET_PERFORMANCE: > + if ( !active_mode ) > + return -EINVAL; > + policy->policy = CPUFREQ_POLICY_PERFORMANCE; > + /* Increase min_perf to highest to achieve ultmost performance */ > + min_perf = data->caps.highest_perf; > + max_perf = data->caps.highest_perf; And "max_perf" here? Furthermore if you moved ... > + epp = CPPC_ENERGY_PERF_MAX_PERFORMANCE; > + break; > + > + case XEN_SYSCTL_CPPC_SET_PRESET_ONDEMAND: > + if ( !active_mode ) > + return -EINVAL; > + policy->policy = CPUFREQ_POLICY_ONDEMAND; > + min_perf = data->caps.lowest_nonlinear_perf; > + max_perf = data->caps.highest_perf; ... these two ahead of the switch(), you could further reduce the number of assignments (overrides) done, including ... > + /* > + * Take medium value to show no preference over > + * performance or powersave > + */ > + epp = CPPC_ENERGY_PERF_BALANCE; > + break; > + > + case XEN_SYSCTL_CPPC_SET_PRESET_NONE: > + policy->policy = CPUFREQ_POLICY_UNKNOWN; > + min_perf = data->caps.lowest_nonlinear_perf; > + max_perf = data->caps.highest_perf; ... the dropping of these two lines. > --- a/xen/drivers/acpi/pm-op.c > +++ b/xen/drivers/acpi/pm-op.c > @@ -84,6 +84,8 @@ static int get_cpufreq_cppc(unsigned int cpu, > > if ( hwp_active() ) > ret = get_hwp_para(cpu, cppc_para); > + else if ( processor_pminfo[cpu]->init & XEN_CPPC_INIT ) > + ret = get_amd_cppc_para(per_cpu(cpufreq_cpu_policy, cpu), cppc_para); > > return ret; > } > @@ -325,10 +327,12 @@ static int set_cpufreq_cppc(struct xen_sysctl_pm_op *op) > if ( !policy || !policy->governor ) > return -ENOENT; > > - if ( !hwp_active() ) > - return -EOPNOTSUPP; > + if ( hwp_active() ) > + return set_hwp_para(policy, &op->u.set_cppc); > + else if ( processor_pminfo[op->cpuid]->init & XEN_CPPC_INIT ) As before, please can you avoid the use of "else" in such cases. Strictly speaking that's "dead code" in Misra's nomeclature. > --- a/xen/include/acpi/cpufreq/cpufreq.h > +++ b/xen/include/acpi/cpufreq/cpufreq.h > @@ -133,21 +133,6 @@ extern int cpufreq_register_governor(struct > cpufreq_governor *governor); > extern struct cpufreq_governor *__find_governor(const char *governor); > #define CPUFREQ_DEFAULT_GOVERNOR &cpufreq_gov_dbs > > -/* > - * Performance Policy > - * If cpufreq_driver->target() exists, the ->governor decides what frequency > - * within the limits is used. If cpufreq_driver->setpolicy() exists, these > - * following policies are available: > - * CPUFREQ_POLICY_PERFORMANCE represents maximum performance > - * CPUFREQ_POLICY_POWERSAVE represents least power consumption > - * CPUFREQ_POLICY_ONDEMAND represents no preference over performance or > - * powersave > - */ > -#define CPUFREQ_POLICY_UNKNOWN 0 > -#define CPUFREQ_POLICY_POWERSAVE 1 > -#define CPUFREQ_POLICY_PERFORMANCE 2 > -#define CPUFREQ_POLICY_ONDEMAND 3 > - > unsigned int cpufreq_policy_from_governor(const struct cpufreq_governor > *gov); > > /* pass a target to the cpufreq driver */ > @@ -294,6 +279,10 @@ int acpi_cpufreq_register(void); > > int amd_cppc_cmdline_parse(const char *s, const char *e); > int amd_cppc_register_driver(void); > +int get_amd_cppc_para(const struct cpufreq_policy *policy, > + struct xen_get_cppc_para *cppc_para); > +int set_amd_cppc_para(struct cpufreq_policy *policy, > + const struct xen_set_cppc_para *set_cppc); > > bool cpufreq_in_cppc_passive_mode(unsigned int cpuid); > bool cpufreq_is_governorless(unsigned int cpuid); > --- a/xen/include/public/sysctl.h > +++ b/xen/include/public/sysctl.h > @@ -336,8 +336,25 @@ struct xen_ondemand { > uint32_t up_threshold; > }; > > +/* > + * Performance Policy > + * If cpufreq_driver->target() exists, the ->governor decides what frequency > + * within the limits is used. If cpufreq_driver->setpolicy() exists, these > + * following policies are available: > + * CPUFREQ_POLICY_PERFORMANCE represents maximum performance > + * CPUFREQ_POLICY_POWERSAVE represents least power consumption > + * CPUFREQ_POLICY_ONDEMAND represents no preference over performance or > + * powersave > + */ I appreciate that you want to retain the comment, but implementation details of the hypervisor imo don't belong in public headers. That internal part may want to move to e.g. the respective (internal) struct field. Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |