
RE: [Xen-devel] expose MWAIT to dom0



>>> On 26.08.11 at 04:18, "Tian, Kevin" <kevin.tian@xxxxxxxxx> wrote:
>>  From: Jan Beulich [mailto:JBeulich@xxxxxxxxxx] 
>> Sent: Thursday, August 25, 2011 8:37 PM
>> 
>> >>> On 21.08.11 at 07:26, "Tian, Kevin" <kevin.tian@xxxxxxxxx> wrote:
>> >>  From: Jan Beulich [mailto:JBeulich@xxxxxxxxxx] 
>> >> Sent: Friday, August 19, 2011 11:02 PM
>> >> > >> Yet another idea - why don't we simply pass the buffer passed to
>> >> > >> arch_acpi_set_pdc_bits() down to Xen, rather than fiddling with the
>> >> > >> bits in Dom0? That would at once allow to not set ACPI_PDC_T_FFH
>> >> > >> (which I don't think Xen really supports at present).
>> >> > >>
>> >> > >> Or really, depending on who controls what, the P, C, and T bits should
>> >> > >> be set by either Dom0 or Xen (so e.g. let Dom0 do what it currently
>> >> > >> does, and then let Xen override the bits it ought to control).
>> >> > >
>> >> > > _PDC is encoded in AML, and requires an ACPI parser, which
>> >> > > is one thing we avoid in Xen. If Xen wants to override those bits, then
>> >> > > the whole ACPI component needs to move down to Xen too.
>> >> >
>> >> > No, I'm not saying the evaluation should be happening there. Below is
>> >> > a draft hypervisor patch (only compile tested so far).
>> >>
>> >> Attached a patch that actually works (with a minimal Dom0 addition).
>> >>
>> >
>> > yes, this change looks more straightforward. :-)
>> 
>> With that in, we still have more deficiencies compared to native Linux.
> 
> definitely there'll be even more than what's revealed today, due to the
> way that the dom0 ACPI processor driver is tightly bound. There are lots of
> factors in dom0 itself which may impact the verification/filtering of
> Cx entries provided by the BIOS, some of which should be avoided from
> Xen's p.o.v., such as the 2nd example you just found. More severe is
> that working around those factors adds intrusive Xen awareness to the
> generic ACPI processor driver, e.g.
> 
> @@ -780,7 +780,7 @@ static int acpi_processor_get_power_info
>                         current_count));
>  
>       /* Validate number of power states discovered */
> -     if (current_count < 2)
> +     if (current_count < 1 + !processor_pm_external())
>               status = -EFAULT;
>  
>        end:
> 
> The more changes like the above are added, the less likely Xen PM
> changes are to be accepted upstream. Also, such specific changes
> made for one dom0 version may quickly become invalid in a newer one.
> The above change is one example which doesn't hold true in newer
> kernels.

Afaict, the code is unchanged up to at least 3.0, and requires
the same adjustment (at least for the non-pvops case; the pvops
one clearly can't be reasonably viewed from any post-2.6.32
perspective).
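
To spell out what the adjusted check in the hunk quoted above does, here
is a standalone sketch (not kernel code). `pm_external` is a hypothetical
stand-in for the result of processor_pm_external(), and
validate_cstate_count() is a name made up for this illustration:

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Mirrors "current_count < 1 + !processor_pm_external()":
 * when power management is handled externally (i.e. by the hypervisor),
 * a single C-state (C1 only) is accepted; otherwise at least two
 * C-states are required, as in the unmodified check.
 */
static int validate_cstate_count(unsigned int current_count, bool pm_external)
{
    unsigned int min_states = 1 + !pm_external; /* 1 if external, else 2 */

    if (current_count < min_states)
        return -EFAULT;

    return 0;
}
```

So a C1-only table is rejected natively, but is passed on when the
hypervisor is in charge of processor PM.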

> When working with Konrad on rebasing the Xen PM patches to the latest
> Linux 3.0.0, we tried hard to avoid intrusive changes in the generic
> ACPI processor driver, by invoking existing interfaces at as high a
> level as possible. The end result is that we skip handling
> corner cases like the above example for now, while at least making
> Xen PM work on the majority of boxes. Later, after Xen PM is accepted
> upstream and the Linux ACPI people have more Xen awareness, the handling
> of those corner cases may be improved gradually.
>  
> Another option Yang is currently working on is to port the native intel-idle
> driver to Xen, which should avoid the nasty dependency on dom0 ACPI
> bits and be immune to various BIOS bugs.

That's good to hear.

>> For one, we don't use mwait when ACPI doesn't tell us to, while Linux
>> does (in the intel_idle driver for deeper C-states, and for C1 also via
>> mwait_idle()). This is likely a bit more work, but it should be possible to
>> construct C-state information from CPUID leaf 5 (and, if valid, ignore
>> information passed down from Dom0), which would match intel_idle's
>> taking precedence over acpi_idle in Linux.
> 
> yes. This should be a desired feature in Xen, with some limitations:
>       - does not work with CPU hotplug
>       - does not work on old boxes (only on Nehalem and later)
>       - does not work with Px/Cx state changes (_PPC, _CST e.g. from Node Manager)
> 
> So this will be a supplementary option to the existing acpi_idle, and should
> work in most cases where the above 3 factors are not a concern.
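
As a rough illustration of the CPUID leaf 5 idea discussed above (a
sketch only, not taken from any posted patch): EDX of that leaf reports,
in one 4-bit field per MWAIT C-state, how many sub-states are supported,
and ECX bit 0 says whether the MWAIT extensions are enumerated at all.
The helper names and the sample EDX value used below are made up for the
example; the decoding works on a value already read from CPUID rather
than executing the instruction itself.

```c
#include <stdint.h>

#define MWAIT_SUBSTATE_MASK 0xfu /* each C-state gets a 4-bit field */
#define MWAIT_SUBSTATE_SIZE 4    /* width of that field in bits */
#define MWAIT_MAX_CSTATE    7    /* EDX holds fields for C0..C7 */

/* CPUID.05H:ECX bit 0: MONITOR/MWAIT extensions are enumerated. */
static int mwait_ext_supported(uint32_t ecx)
{
    return (ecx & 1u) != 0;
}

/*
 * Number of MWAIT sub-states for C-state index cx, taken from
 * CPUID.05H:EDX. Returns 0 for an out-of-range index, so a caller
 * can treat "no sub-states" and "no such field" alike.
 */
static unsigned int mwait_substates(uint32_t edx, unsigned int cx)
{
    if (cx > MWAIT_MAX_CSTATE)
        return 0;

    return (edx >> (cx * MWAIT_SUBSTATE_SIZE)) & MWAIT_SUBSTATE_MASK;
}
```

With a made-up EDX value of 0x00001120, mwait_substates(edx, 1) yields 2
and mwait_substates(edx, 4) yields 0, i.e. a state with no sub-states
would simply not be offered.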
> 
>> 
>> Second, if only C1 gets announced by ACPI, we end up not using it
>> because Dom0 simply neglects to let the hypervisor know. This is
>> because acpi_processor_get_power_info_cst() (back to at least
>> 2.6.16) returns -EFAULT if less than two C-states were found. Simply
>> prefixing the check with "!processor_pm_external() && " fixes this
>> (but I don't know whether something similar could be done in Jeremy's
>> tree).
> 
> this is a very temporary problem which disappears quickly in subsequent
> versions. But if looking just at 2.6.18-xen, it's the right fix.

Again - when did you see this disappear?

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

