Re: [PATCH v6 12/19] xen/cpufreq: implement amd-cppc driver for CPPC in passive mode
On 11.07.2025 05:50, Penny Zheng wrote:
> --- a/xen/arch/x86/acpi/cpufreq/amd-cppc.c
> +++ b/xen/arch/x86/acpi/cpufreq/amd-cppc.c
> @@ -14,7 +14,95 @@
>  #include <xen/domain.h>
>  #include <xen/init.h>
>  #include <xen/param.h>
> +#include <xen/percpu.h>
> +#include <xen/xvmalloc.h>
>  #include <acpi/cpufreq/cpufreq.h>
> +#include <asm/amd.h>
> +#include <asm/msr-index.h>
> +
> +#define amd_cppc_err(cpu, fmt, args...) \
> +    printk(XENLOG_ERR "AMD-CPPC: CPU%u error: " fmt, cpu, ## args)
> +#define amd_cppc_warn(cpu, fmt, args...) \
> +    printk(XENLOG_WARNING "AMD-CPPC: CPU%u warning: " fmt, cpu, ## args)
> +#define amd_cppc_verbose(cpu, fmt, args...) \
> +({ \
> +    if ( cpufreq_verbose ) \
> +        printk(XENLOG_DEBUG "AMD-CPPC: CPU%u " fmt, cpu, ## args); \
> +})
> +
> +/*
> + * Field highest_perf, nominal_perf, lowest_nonlinear_perf, and lowest_perf
> + * contain the values read from CPPC capability MSR. They represent the limits
> + * of managed performance range as well as the dynamic capability, which may
> + * change during processor operation
> + * Field highest_perf represents highest performance, which is the absolute
> + * maximum performance an individual processor may reach, assuming ideal
> + * conditions. This performance level may not be sustainable for long
> + * durations and may only be achievable if other platform components
> + * are in a specific state; for example, it may require other processors be
> + * in an idle state. This would be equivalent to the highest frequencies
> + * supported by the processor.
> + * Field nominal_perf represents maximum sustained performance level of the
> + * processor, assuming ideal operating conditions. All cores/processors are
> + * expected to be able to sustain their nominal performance state\

Nit: Stray trailing backslash.

> + * simultaneously.
> + * Field lowest_nonlinear_perf represents Lowest Nonlinear Performance, which
> + * is the lowest performance level at which nonlinear power savings are
> + * achieved. Above this threshold, lower performance levels should be
> + * generally more energy efficient than higher performance levels. So in
> + * traditional terms, this represents the P-state range of performance levels.
> + * Field lowest_perf represents the absolute lowest performance level of the
> + * platform. Selecting it may cause an efficiency penalty but should reduce
> + * the instantaneous power consumption of the processor. So in traditional
> + * terms, this represents the T-state range of performance levels.
> + *
> + * Field max_perf, min_perf, des_perf store the values for CPPC request MSR.
> + * Software passes performance goals through these fields.
> + * Field max_perf conveys the maximum performance level at which the platform
> + * may run. And it may be set to any performance value in the range
> + * [lowest_perf, highest_perf], inclusive.
> + * Field min_perf conveys the minimum performance level at which the platform
> + * may run. And it may be set to any performance value in the range
> + * [lowest_perf, highest_perf], inclusive but must be less than or equal to
> + * max_perf.
> + * Field des_perf conveys performance level Xen governor is requesting. And it
> + * may be set to any performance value in the range [min_perf, max_perf],
> + * inclusive.
> + */
> +struct amd_cppc_drv_data
> +{
> +    const struct xen_processor_cppc *cppc_data;
> +    union {
> +        uint64_t raw;
> +        struct {
> +            unsigned int lowest_perf:8;
> +            unsigned int lowest_nonlinear_perf:8;
> +            unsigned int nominal_perf:8;
> +            unsigned int highest_perf:8;
> +            unsigned int :32;
> +        };
> +    } caps;
> +    union {
> +        uint64_t raw;
> +        struct {
> +            unsigned int max_perf:8;
> +            unsigned int min_perf:8;
> +            unsigned int des_perf:8;
> +            unsigned int epp:8;
> +            unsigned int :32;
> +        };
> +    } req;
> +
> +    int err;
> +};
> +
> +static DEFINE_PER_CPU_READ_MOSTLY(struct amd_cppc_drv_data *,
> +                                  amd_cppc_drv_data);
> +/*
> + * Core max frequency read from PstateDef as anchor point
> + * for freq-to-perf transition
> + */
> +static DEFINE_PER_CPU_READ_MOSTLY(unsigned int, pxfreq_mhz);
>
>  static bool __init amd_cppc_handle_option(const char *s, const char *end)
>  {
> @@ -50,10 +138,327 @@ int __init amd_cppc_cmdline_parse(const char *s, const char *e)
>      return 0;
>  }
>
> +/*
> + * If CPPC lowest_freq and nominal_freq registers are exposed then we can
> + * use them to convert perf to freq and vice versa. The conversion is
> + * extrapolated as an linear function passing by the 2 points:
> + * - (Low perf, Low freq)
> + * - (Nominal perf, Nominal freq)
> + * Parameter freq is always in kHz.
> + */
> +static int amd_cppc_khz_to_perf(const struct amd_cppc_drv_data *data,
> +                                unsigned int freq, uint8_t *perf)
> +{
> +    const struct xen_processor_cppc *cppc_data = data->cppc_data;
> +    unsigned int mul, div;
> +    int offset = 0, res;
> +
> +    if ( cppc_data->cpc.lowest_mhz && cppc_data->cpc.nominal_mhz &&
> +         data->caps.nominal_perf != data->caps.lowest_perf &&
> +         cppc_data->cpc.nominal_mhz != cppc_data->cpc.lowest_mhz )

While I understand that required relations are being checked elsewhere, if you
used > in place of != here, that would not only serve a doc aspect, but also
allow to drop one part:

    if ( cppc_data->cpc.lowest_mhz &&
         data->caps.nominal_perf > data->caps.lowest_perf &&
         cppc_data->cpc.nominal_mhz > cppc_data->cpc.lowest_mhz )

> +    {
> +        mul = data->caps.nominal_perf - data->caps.lowest_perf;
> +        div = cppc_data->cpc.nominal_mhz - cppc_data->cpc.lowest_mhz;
> +
> +        /*
> +         * We don't need to convert to kHz for computing offset and can
> +         * directly use nominal_mhz and lowest_mhz as the division
> +         * will remove the frequency unit.
> +         */
> +        offset = data->caps.nominal_perf -
> +                 (mul * cppc_data->cpc.nominal_mhz) / div;
> +    }
> +    else
> +    {
> +        /* Read Processor Max Speed(MHz) as anchor point */
> +        mul = data->caps.highest_perf;
> +        div = this_cpu(pxfreq_mhz);
> +        if ( !div )
> +            return -EOPNOTSUPP;
> +    }
> +
> +    res = offset + (mul * freq) / (div * 1000);
> +    if ( res > UINT8_MAX )

Why UINT8_MAX here but ...

> +    {
> +        printk_once(XENLOG_WARNING
> +                    "Perf value exceeds maximum value 255: %d\n", res);
> +        *perf = 0xff;

... 0xff here?

> +        return 0;
> +    }
> +    if ( res < 0 )
> +    {
> +        printk_once(XENLOG_WARNING
> +                    "Perf value smaller than minimum value 0: %d\n", res);
> +        *perf = 0;
> +        return 0;
> +    }
> +    *perf = res;

Considering that amd_cppc_init_msrs() rejects perf values of 0 as invalid, is
0 actually valid as an output here?
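
For illustration only (this is neither part of the patch nor of the review):
a minimal standalone sketch of how the kHz-to-perf conversion might look with
the > comparisons suggested above and UINT8_MAX used consistently for the upper
clamp. struct cppc_sketch, khz_to_perf_sketch() and the max_mhz field are
hypothetical names; the real code reads these values from data->caps,
cppc_data->cpc and this_cpu(pxfreq_mhz). The lower clamp to 0 is kept as in
the patch, so the question of whether 0 is a valid output remains open here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the values the quoted function reads. */
struct cppc_sketch {
    unsigned int lowest_mhz, nominal_mhz;  /* from _CPC; 0 if not exposed */
    uint8_t lowest_perf, nominal_perf, highest_perf;
    unsigned int max_mhz;                  /* fallback anchor; 0 if unknown */
};

/* Returns 0 on success, -1 when no anchor frequency is available. */
static int khz_to_perf_sketch(const struct cppc_sketch *d,
                              unsigned int khz, uint8_t *perf)
{
    /* Signed 64-bit intermediates avoid unsigned wrap-around and overflow. */
    long long mul, div, offset = 0, res;

    assert(d && perf);

    /* nominal_mhz > lowest_mhz > 0 already implies nominal_mhz != 0. */
    if ( d->lowest_mhz &&
         d->nominal_perf > d->lowest_perf &&
         d->nominal_mhz > d->lowest_mhz )
    {
        /* Line through (lowest_mhz, lowest_perf), (nominal_mhz, nominal_perf). */
        mul = d->nominal_perf - d->lowest_perf;
        div = (long long)d->nominal_mhz - d->lowest_mhz;
        offset = d->nominal_perf - (mul * d->nominal_mhz) / div;
    }
    else
    {
        /* Line through the origin and (max_mhz, highest_perf). */
        mul = d->highest_perf;
        div = d->max_mhz;
        if ( !div )
            return -1;
    }

    res = offset + (mul * khz) / (div * 1000);

    /* Clamp into the 8-bit range, using one spelling of the limit. */
    if ( res > UINT8_MAX )
        res = UINT8_MAX;
    else if ( res < 0 )
        res = 0;

    *perf = (uint8_t)res;
    return 0;
}
```

With lowest = (400 MHz, perf 20) and nominal = (2000 MHz, perf 100), the offset
works out to 0, so 2000000 kHz maps to perf 100 and 400000 kHz to perf 20; a
request of 6000000 kHz would compute 300 and be clamped to UINT8_MAX.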

> +/*
> + * _CPC may define nominal frequecy and lowest frequency, if not, use
> + * Processor Max Speed as anchor point to calculate.
> + * Output freq stores cpc frequency in kHz
> + */
> +static int amd_get_cpc_freq(const struct amd_cppc_drv_data *data,
> +                            uint32_t cpc_mhz, uint8_t perf, unsigned int *freq)

Once again no need for uint32_t when unsigned int will do.

Jan