
Re: [Xen-devel] [PATCH] RFC: Linux: disable APERF/MPERF feature in PV kernels

On 05/23/2012 03:21 PM, Andrew Cooper wrote:
On 23/05/12 13:18, Jan Beulich wrote:
On 23.05.12 at 13:11, Andrew Cooper<andrew.cooper3@xxxxxxxxxx>  wrote:
On 23/05/12 08:34, Jan Beulich wrote:
First of all I'm of the opinion that this indeed should not be
masked in the hypervisor - there's no reason to disallow the
guest to read these registers (but we should of course deny
writes as long as Xen is controlling P-states, which we do).
I am sorry but I am going to have to disagree with you on this point.

We should not be advertising this feature to any guest at all if we
can't provide an implementation which works as native expects.  Else we
are failing in our job of virtualisation.
That's perhaps a matter of the position you take - for HVM, I
would agree with yours, but there's many more aspects (not
the least related to accessing other MSRs) that we fail to
"properly" virtualize for PV guests - my position is that it is the
nature of PV that guest kernels have to be aware of being
virtualized (and hence stay away from doing certain things
unless [they think] they know what they're doing).

There is 'dom0_vcpus_pin'[1] which identity pins dom0 vcpus, and
prevents update of the affinity masks, and appears to conditionally
allow access to certain MSRs.  I think it would be fine to expose this
feature iff dom0's vcpus are pinned in this fashion.  That way, the
measurement should succeed, even if dom0 only has read access to the MSRs.
Restricting it to this case would be too restrictive - it really
makes sense at any time where the vCPU's affinity has exactly
one bit set (or to be precise, the intersection of it and the set
of online pCPU-s).
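
For illustration only, the condition described above can be sketched as below. This is not Xen code: the helper name is hypothetical, and the affinity and online masks are modelled as plain 64-bit words rather than Xen's cpumask type.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a Xen function): returns true when the vCPU
 * can run on exactly one online pCPU, i.e. the intersection of its
 * affinity mask and the online mask has exactly one bit set.
 * Masks are modelled as 64-bit words for the sake of the sketch.
 */
static bool vcpu_effectively_pinned(uint64_t affinity, uint64_t online)
{
    uint64_t usable = affinity & online;

    /* Non-zero and a power of two means exactly one bit is set. */
    return usable != 0 && (usable & (usable - 1)) == 0;
}
```

Note that an affinity of 0x6 still counts as pinned if only pCPU 1 is online, which is the "to be precise" case mentioned above.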


That is unfortunately too lax.  You also need to be able to guarantee
that the affinity mask is not updated (and vcpu rescheduled) while in
the middle of a measurement.  Xen can't sensibly work out if or when a
guest is taking a measurement, nor can dom0.  So the only safe solution
I can see is for Xen to prevent the affinity masks from ever being
updated.  With more thought, this would also preclude migration of a
guest to another host.

If we really care about this feature, we could as well emulate it:
On every VCPU migration we calculate the difference between the two pCPUs' values of APERF and MPERF. On the trap this difference is added to the current MSR value, similar to what is done with the TSC in HVM. We trap on every MSR access anyway, so the performance impact is only four HV rdmsrs on every VCPU migration.
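
A minimal sketch of that bookkeeping, assuming per-vCPU offsets that are adjusted on migration and added on each trapped read. All struct and function names here are made up for illustration and are not existing Xen interfaces; the raw counter values are passed in as plain numbers instead of being read with rdmsr.

```c
#include <stdint.h>

/* Hypothetical per-vCPU state: offsets added to the raw MSR values. */
struct vcpu_perf_offsets {
    uint64_t aperf_off;
    uint64_t mperf_off;
};

/* Raw counters of one pCPU, as two rdmsrs would return them. */
struct pcpu_counters {
    uint64_t aperf;
    uint64_t mperf;
};

/*
 * Called when the vCPU moves from pCPU 'from' to pCPU 'to'.
 * Folding (old - new) into the offset keeps the guest-visible value
 * continuous across the move. Unsigned arithmetic wraps modulo 2^64,
 * so a "negative" difference is handled without special cases.
 * This accounts for the four reads: two counters on two pCPUs.
 */
static void perf_on_migrate(struct vcpu_perf_offsets *v,
                            const struct pcpu_counters *from,
                            const struct pcpu_counters *to)
{
    v->aperf_off += from->aperf - to->aperf;
    v->mperf_off += from->mperf - to->mperf;
}

/* Called from the (assumed) MSR read trap: raw value plus offset. */
static uint64_t perf_read_aperf(const struct vcpu_perf_offsets *v,
                                const struct pcpu_counters *cur)
{
    return cur->aperf + v->aperf_off;
}

static uint64_t perf_read_mperf(const struct vcpu_perf_offsets *v,
                                const struct pcpu_counters *cur)
{
    return cur->mperf + v->mperf_off;
}
```

This only keeps the values continuous across a migration; it does not address what the counters should show for time during which the vCPU was not scheduled at all.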

Only I am not sure if this is really a problem we should solve, or if it wouldn't be easier for us and clearer to the user to just discourage those accesses.


Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany
