[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Xen-devel] Re: [RFC] pvops domain0 Cx/Px info parser patch



Yu, Ke wrote:
Hi, Jeremy

Since now the pv_ops domain0 is approaching, we are considering to port the 
dom0 power management related code, or more specifically, the Cx/Px ACPI info 
parsing code, to pv_ops domain0 tree, so that people can utilize Xen power 
management capability (Cx/PX) under pv_ops domain0. This RFC is a draft version 
for comments.

Hi. I commented on the last patch, but saw no response. Did you get those comments?

Oh, perhaps they never got sent. I've attached them now. I have not looked at these patches yet. Do they differ much?

   J
--- Begin Message ---
Yu, Ke wrote:
Hi Jeremy,

Kevin and I originally write the domain0 host S3 code and Cx/Px state ACPI info parsing code in the linux-2.6.18-xen.hg tree. Since now the pv_ops domain0 is approaching, we are considering to port these code to pv_ops domain0 tree. It is very nice that you already port the host S3 code, so in this email we post the Cx/Px state ACPI info parsing code. This is a draft version intending for early comments. 
  

Thanks for doing this.  I looked at the changes in 2.6.18-xen and put them into the too-hard basket.

Some general comments about patch submission:
  • please send patches one-per-mail, with [PATCH N/M] in the subject line to indicate ordering
  • in each mail, include the full changelog description like you have below
  • each person signing off the mail should have a Signed-off-by: line
This makes it easier to do line-by-line commentary on the patches and also process it with the tools I have on hand.

I just committed these patches to my tree, rebased onto xen-tip/dom0/core as xen-tip/dom0/acpi-parser, but I haven't even compile-tested to see if the rebase worked.  But I did cut-and-paste the overview into the first patch comment so its recorded properly.

=== Overview ===

Requirement: Xen hypervisor need Cx/Px ACPI info to do the Cx/Px states power management. This info is provided by BIOS ACPI table. Since hypervisor has no ACPI parser, this info has to be parsed by domain0 kernel ACPI sub-system, and then passed to hypervisor by hypercall.
  

Couldn't this be done with a variant of the cpufreq_acpi driver?  It needs to parse the tables, but rather than acting on them directly it passes the information to Xen?

To make this happen, the key point is to add hook in the kernel ACPI sub-system. Fortunately, kernel already has good abstraction, and only several places need to add hook. To be more detail, there is an acpi_processor_driver (in drivers/acpi/processor_core.c) , which all the Cx/Px parsing event will go to. This driver will call its acpi processor event handler, e.g. add/remove, start/stop, notify to handle these events. These event handlers in turn will call some library functions (in drivers/acpi/processor_perflib.c), e.g. acpi_processor_ppc_has_changed, acpi_processor_ppc_has_changed, acpi_processor_cst_has_changed, to finish the acpi info parsing.

So the conclusion is: adding hooks in acpi_processor_driver and those related library functions will satisfy our requirement.

To make the added hook cleaner, we introduce an interface called external control operation (struct processor_extcntl_ops). All the hooks are encapsulated in this interface processor_extcntl_ops . Here the "external" means the acpi processor is controlled by external entity, e.g. VMM. Every kind of external entity can has its implementation of this interface. In this patch, the interface for Xen is implemented.
  

Is the issue that the hypervisor will change the CPUs states asynchronously under the kernel, and the kernel wants to be informed about those state changes, along with some kind of mapping from pcpu to vcpu?

What use does the kernel make of that information?

I guess I don't really understand what the larger problem domain is.

Do you have any plans for other hypervisor backends?  Does a kvm one make sense?

== Patch Set Description ==

This patch set is based the xen/dom0/hackery branch. It has three patches:

1.	acpi_parser_framework.patch
This patch introduces the interface of external control operation, and adds hooks to the acpi sub-system, including the acpi_processor_driver, and the related library functions.

2.	acpi_parser_xen_driver.patch
This patch implements the Xen version of the external control operation interface. 

3.	acpi_parser_platform_hypercall.patch
This patches implements the xen_platform_op hypercall, to pass the parsed ACPI info to hypervisor.

== Misc ==

To make this patch works correctly, there is a corner case need to consider: processor in domain0 is virtual CPU, while BIOS ACPI table provides the physical CPU info, since the vCPU and pCPU is not 1:1 mapped, this may caused unexpected result.  E.g. in UP domain0, Xen hypervisor may only get one physical processor info.  Current linux-2.6.18-xen.hg tree solve this issue by using acpi_id instead of processor_id. However, this code is a bit tricky and hacky. We are still trying to find a cleaner way to solve this issue. 
  

Is this a corner case?  I would think its the common case.

Anyway, this is draft version for your early comments, and after your reviewing, we will send it to wider scope, e.g. kernel ACPI sub-system developer in Intel.
  

OK, some more specific comments:

Cut down on the number of #ifdefs.  If CONFIG_PROCESSOR_EXTERNAL_CONTROL is not set, define the functions to appropriate no-op definitions.  For example, rather than:
+#ifdef CONFIG_PROCESSOR_EXTERNAL_CONTROL
+			if (!processor_cntl_external()) {
+				return -ENODEV;
+			}
+#else
 			return -ENODEV;
+#endif /* CONFIG_PROCESSOR_EXTERNAL_CONTROL */
just define processor_cntl_external() to always return 0.  (Oh, it looks like you already do this.  Why all the #ifdefs then?)

Also, rather than:
+		if (processor_cntl_external())
+			processor_notify_external(pr, PROCESSOR_HOTPLUG,
+							HOTPLUG_TYPE_REMOVE);
can't processor_notify_external() do the test itself?  (In addition to the #ifdef comments.)

This seems pointless:
-static int acpi_processor_get_performance_info(struct acpi_processor *pr)
+#ifndef CONFIG_PROCESSOR_EXTERNAL_CONTROL
+static
+#endif
+int acpi_processor_get_performance_info(struct acpi_processor *pr)


 config CPU_FREQ
 	bool "CPU Frequency scaling"
+	depends on !PROCESSOR_EXTERNAL_CONTROL

This isn't acceptible.  A single kernel image could be running on bare hardware, or be booted as a Xen dom0 kernel.  It needs to be able to handle both CPUFREQ and external processor control.  Unless external processor control somehow subsumes all the cpufreq driver functions?  Anyway, all decisions need to be made at runtime.

Put all the Xen-specific code into a Xen-specific header.  I've already added include/xen/acpi.h for the S3 changes, so perhaps that would be suitable.


+ifneq ($(CONFIG_PROCESSOR_EXTERNAL_CONTROL),)
+obj-$(CONFIG_XEN)		+= processor_extcntl_xen.o
+endif
No, do this in Kconfig, with something like:
config XEN_PROCESSOR_EXTERNAL_CONTROL
	def_bool y
	depends on XEN && PROCESSOR_EXTERNAL_CONTROL
And maybe XEN_ACPI and/or XEN_DOM0?  If this is useful for non-x86 (ia64), then put it in drivers/xen.

    J

--- End Message ---
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.