> However, for HVM we should keep its behavior the same as
> on a native machine. So if the hardware supports rdtscp, we will also
> support it in HVM; if not, we will not expose that bit in cpuid
> to the guest.
As I said, I think this is a very bad idea because there
is no way to ensure that an app/OS in a VM behaves
the same as it does on a physical machine.
So I think the cpuid rdtscp bit should always be off.
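
To make the concern concrete, here is a minimal sketch (a userspace
model, not Xen or guest kernel code) of the check an app typically
builds on rdtscp: it trusts a TSC delta only when both samples report
the same TSC_AUX. All struct and function names are hypothetical, and
the pcpu field models information the guest cannot see.

```c
#include <stdbool.h>
#include <stdint.h>

/* One rdtscp result: the timestamp plus whatever TSC_AUX held then. */
struct tsc_sample {
    uint64_t tsc;
    uint32_t aux;
};

/*
 * Application-side check: accept a TSC delta only if both samples carry
 * the same TSC_AUX value, i.e. the same CPU number as written by the OS.
 */
bool tsc_delta_same_cpu(struct tsc_sample a, struct tsc_sample b,
                        uint64_t *delta)
{
    if (a.aux != b.aux)
        return false;           /* CPU changed: delta is not meaningful */
    *delta = b.tsc - a.tsc;
    return true;
}

/*
 * Hypothetical model of a guest: TSC_AUX holds the *virtual* CPU number
 * the guest OS wrote, while the physical CPU underneath may differ.
 */
struct guest_sample {
    struct tsc_sample seen;     /* what rdtscp returns inside the guest */
    unsigned int pcpu;          /* physical CPU, invisible to the guest */
};

/* True when the app's check passes although the physical CPU changed. */
bool check_is_misleading(struct guest_sample a, struct guest_sample b)
{
    uint64_t delta;

    return tsc_delta_same_cpu(a.seen, b.seen, &delta) && a.pcpu != b.pcpu;
}
```

With two samples that both report aux 0 but were taken on physical
CPUs 2 and 5, tsc_delta_same_cpu() succeeds, so the app accepts a
delta it should have rejected.
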
> [...] (currently the domain incarnation value only
> increases at save/restore/migration). This makes HVM support a bit
> tricky because we need to save/restore the guest/host TSC_AUX at every
> VMEXIT/VMENTRY. If both PV and HVM could put the TSC_AUX write in
> context_switch(), then things would become easier for HVM support.
If you are doing a full faithful implementation of
rdtscp (as if cpuid rdtscp bit is on), I agree this
is a problem. If not, and the only use of TSC_AUX
is for the pvrdtscp algorithm, I think setting
TSC_AUX in __update_vcpu_system_time() is fine
because TSC_AUX is not part of a VM's context,
it is a communication of information from system
software (Xen) to applications.
I expect that Keir will not support putting TSC_AUX
in the context switch code unless it is absolutely
necessary, as it is certainly expensive to read and
write to TSC_AUX and this cost will add to every
context switch of every VM even though very few will
actually use rdtscp/TSC_AUX.
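
As a rough illustration of the cost argument, the following userspace
model counts MSR accesses for the two placements. Everything here is a
stand-in: fake_read_tsc_aux()/fake_write_tsc_aux() replace the real
rdmsr/wrmsr, struct domain_model and TSC_MODE_OTHER are hypothetical,
and only the condition in model_update_system_time() mirrors the check
quoted from arch/x86/time.c further down the thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* TSC_MODE_OTHER is a placeholder for any non-pvrdtscp mode. */
enum tsc_mode { TSC_MODE_OTHER, TSC_MODE_PVRDTSCP };

struct domain_model {
    enum tsc_mode tsc_mode;
    uint32_t incarnation;       /* value Xen publishes for pvrdtscp   */
    uint32_t guest_tsc_aux;     /* value a guest would own in case B  */
};

/* Stand-ins for the MSR itself and for the cost of touching it. */
uint32_t fake_tsc_aux;
unsigned long fake_msr_accesses;

uint32_t fake_read_tsc_aux(void)
{
    fake_msr_accesses++;
    return fake_tsc_aux;
}

void fake_write_tsc_aux(uint32_t val)
{
    fake_msr_accesses++;
    fake_tsc_aux = val;
}

/* Placement A: write from the time-update path, pvrdtscp domains only. */
void model_update_system_time(const struct domain_model *d, bool hw_rdtscp)
{
    if (d->tsc_mode == TSC_MODE_PVRDTSCP && hw_rdtscp)
        fake_write_tsc_aux(d->incarnation);
}

/* Placement B: save/restore on every context switch, for every domain. */
void model_context_switch(struct domain_model *prev,
                          const struct domain_model *next)
{
    prev->guest_tsc_aux = fake_read_tsc_aux();
    fake_write_tsc_aux(next->guest_tsc_aux);
}
```

In this model a domain that does not use pvrdtscp costs nothing under
placement A, while placement B pays one read and one write on every
switch regardless of which domains are involved.
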
So I think we need to decide first about approach (1),
the full faithful implementation of rdtscp.
> -----Original Message-----
> From: Xu, Dongxiao [mailto:dongxiao.xu@xxxxxxxxx]
> Sent: Thursday, December 10, 2009 6:23 PM
> To: Dan Magenheimer; Nakajima, Jun;
> xen-devel@xxxxxxxxxxxxxxxxxxx; Keir
> Fraser
> Subject: RE: [Xen-devel] RE: Saving/Restoring IA32_TSC_AUX MSR
>
>
> Dan,
> Thanks for reply, some comments below.
>
> Best Regards,
> -- Dongxiao
>
> Dan Magenheimer wrote:
> > Hi Dongxiao --
> >
> > There are two approaches to adding rdtscp support:
> >
> > 1) Faithful full implementation of rdtscp instruction
> > 2) Support pvrdtscp algorithm
> >
> > For (1), you would enable the rdtscp bit in cpuid. Then
> > on hardware that supports rdtscp, you would do context
> > switching of TSC_AUX. On hardware that doesn't support
> > rdtscp, you would intercept the illegal instruction trap
> > and emulate the instruction. (TSC_AUX emulation
> > could be handled "lazily", no need to do context
> > switch for that.)
> >
> > BUT if you look at how TSC_AUX is used by a native
> > OS**, the OS sets TSC_AUX to each physical CPU number
> > so an application can easily determine if successive
> > rdtscp instructions were not executed on the same
> > processor. (This was important on older processors
> > that did not have invariant TSC.) Unfortunately,
> > on Xen, this mechanism is worthless and misleading
> > because the OS believes it is setting TSC_AUX to
> > a physical CPU number but it is actually setting
> > it to a virtual CPU number, and the physical CPU
> > number may change at any time due to scheduling
> > or migration. So an app using rdtscp will get a
> > wrong answer.
>
> However, for HVM we should keep its behavior the same as
> on a native machine. So if the hardware supports rdtscp, we will also
> support it in HVM; if not, we will not expose that bit in cpuid
> to the guest.
>
> >
> > As a result, I do NOT recommend (1) and do recommend
> > that Xen should continue to return zero for the rdtscp
> > bit in cpuid.
> >
> > For (2), setting TSC_AUX in __update_vcpu_system_time()
> > is fine (I think). On hardware that supports rdtscp, for HVM
> > you would need to ensure that the rdtscp instruction
> > works natively (even though the rdtscp bit in cpuid
> > is not turned on for the guest). On hardware that
> > does not support rdtscp, you would intercept the illegal
> > instruction trap and call the existing code in
> > pv_soft_rdtsc().
>
> Putting the write of the TSC_AUX MSR in __update_vcpu_system_time()
> has a problem: the hypervisor will overwrite the value from time to time
> (for example, at do_softirq()->local_time_calibration()), even if the
> value didn't change (currently the domain incarnation value only
> increases at save/restore/migration). This makes HVM support a bit
> tricky because we need to save/restore the guest/host TSC_AUX at every
> VMEXIT/VMENTRY. If both PV and HVM could put the TSC_AUX write in
> context_switch(), then things would become easier for HVM support.
> Do you have any ideas about it? Thanks! :-)
>
> >
> > Does that make sense?
> >
> > Thanks,
> > Dan
> >
> > ** I've looked at RHEL5. Windows actually always
> > returns 0 for TSC_AUX.
> >
> >> -----Original Message-----
> >> From: Xu, Dongxiao [mailto:dongxiao.xu@xxxxxxxxx]
> >> Sent: Thursday, December 10, 2009 4:22 AM
> >> To: Dan Magenheimer; Nakajima, Jun;
> >> xen-devel@xxxxxxxxxxxxxxxxxxx; Keir
> >> Fraser
> >> Subject: RE: [Xen-devel] RE: Saving/Restoring IA32_TSC_AUX MSR
> >>
> >>
> >> Hi, Dan,
> >> I am now trying to add the rdtscp support for Xen HVM guest.
> >> I have some questions about your pvrdtscp patch. See below.
> >>
> >> Dan Magenheimer wrote:
> >>> Hi Jun --
> >>>
> >>>> But it's possible that multiple domains use the pvrdtscp
> >>>> algorithm, and the incarnation number is domain specific.
> >>>
> >>> OK, I see. The code for writing TSC_AUX is in
> >>> __update_vcpu_system_time() not in context switch.
> >>
> >> Will you modify the place where the hypervisor writes the TSC_AUX MSR?
> >> In the current pvrdtscp logic, I think this MSR should be written at
> >> vcpu context switch. Also, this would make HVM support much easier
> >> because that MSR would not be modified by the hypervisor from time
> >> to time.
> >>
> >>>
> >>>> We also have the issue when adding RDTSCP support for
> >>>> HVM guests.
> >>>
> >>> Only if you expose the rdtscp bit via cpuid. This could
> >>> certainly be done but, as I said, is probably pointless.
> >>> (The pvrdtscp algorithm uses the instruction whether or
> >>> not the rdtscp bit is set in cpuid, since Xen emulates
> >>> it -- for PV domains only now -- if the physical machine
> >>> doesn't support the instruction.)
> >>
> >> We are planning to add HVM support for RDTSCP, and the behavior of
> >> this instruction will follow the native way.
> >> This causes a problem: the RDTSCP instruction in an application
> >> behaves differently on PV and HVM domains. Do you have any comment
> >> about this? Thanks!
> >>
> >> Thanks!
> >> Dongxiao
> >>
> >>>
> >>> Dan
> >>>
> >>>> -----Original Message-----
> >>>> From: Nakajima, Jun [mailto:jun.nakajima@xxxxxxxxx]
> >>>> Sent: Wednesday, December 09, 2009 10:08 AM
> >>>> To: Dan Magenheimer; xen-devel@xxxxxxxxxxxxxxxxxxx
> >>>> Subject: RE: Saving/Restoring IA32_TSC_AUX MSR
> >>>>
> >>>>
> >>>> Dan Magenheimer wrote on Wed, 9 Dec 2009 at 08:59:59:
> >>>>
> >>>>> Hi Jun --
> >>>>>
> >>>>
> >>>> Dan,
> >>>>
> >>>>> Xen doesn't expose the TSC rdtscp bit so assumes that
> >>>>> no guests depend on it. So no save/restore of TSC_AUX
> >>>>> is necessary. Xen could provide support for the TSC
> >>>>
> >>>> But it's possible that multiple domains use the pvrdtscp
> >>>> algorithm, and the incarnation number is domain specific. We
> >>>> also have the issue when adding RDTSCP support for HVM guests.
> >>>>
> >>>>> rdtscp bit and allow a guest OS to manage TSC_AUX, but
> >>>>> the existing use of TSC_AUX by Linux would fail to
> >>>>> provide the desired result across migration, so there's
> >>>>> little point. Also the pvrdtscp algorithm now assumes
> >>>>> that Xen itself is responsible for updating TSC_AUX
> >>>>> whenever a migration (across physical machines) occurs.
> >>>>>
> >>>>> The #define for write_rdtscp_aux is from Linux source,
> >>>>> so I didn't change the code and define the constant.
> >>>>>
> >>>>> Dan
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: Nakajima, Jun [mailto:jun.nakajima@xxxxxxxxx]
> >>>>>> Sent: Wednesday, December 09, 2009 9:42 AM
> >>>>>> To: xen-devel@xxxxxxxxxxxxxxxxxxx
> >>>>>> Cc: Dan Magenheimer
> >>>>>> Subject: Saving/Restoring IA32_TSC_AUX MSR
> >>>>>>
> >>>>>>
> >>>>>> I see code like this (in arch/x86/time.c), and am wondering how
> >>>>>> the IA32_TSC_AUX MSR is saved/restored at domain switch time.
> >>>>>>
> >>>>>>     if ( (d->arch.tsc_mode == TSC_MODE_PVRDTSCP) &&
> >>>>>>          boot_cpu_has(X86_FEATURE_RDTSCP) )
> >>>>>>         write_rdtscp_aux(d->arch.incarnation);
> >>>>>>
> >>>>>> BTW,
> >>>>>>
> >>>>>> include/asm-x86/msr.h
> >>>>>> #define write_rdtscp_aux(val) wrmsr(0xc0000103, (val), 0)
> >>>>>>
> >>>>>> We should write like wrmsr(MSR_TSC_AUX, (val), 0) by adding
> >>>>>> +#define MSR_TSC_AUX 0xc0000103 /* Auxiliary TSC */
> >>>>>> in include/asm-x86/msr-index.h
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Jun
> >>>>>> ---
> >>>>>> Intel Open Source Technology Center
> >>>>>>
> >>>>>>
> >>>>
> >>>> Jun
> >>>> ___
> >>>> Intel Open Source Technology Center
> >>>>
> >>>>
> >>>>
> >>>>
> >>>
> >>> _______________________________________________
> >>> Xen-devel mailing list
> >>> Xen-devel@xxxxxxxxxxxxxxxxxxx
> >>> http://lists.xensource.com/xen-devel