
RE: [Xen-devel] RE: Saving/Restoring IA32_TSC_AUX MSR



> However, for HVM we should keep its behavior the same as
> on a native machine. So if the hardware supports rdtscp, we will
> also support it in HVM; if not, we will not expose that bit in
> cpuid to the guest.

As I said, I think this is a very bad idea because there
is no way to ensure that an app/OS in a VM gets the same
results as it would on a physical machine.
So I think the cpuid rdtscp bit should always be off.
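
To make the failure concrete, here is a rough sketch (not Xen
or Linux code; the struct and helper names are made up for
illustration) of how an app on a native OS typically uses the
aux value returned by rdtscp:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One rdtscp result: the TSC plus whatever the OS put in TSC_AUX.
 * Hypothetical type, for illustration only. */
struct tsc_sample {
    uint64_t tsc;
    uint32_t aux;  /* native: physical CPU number; guest: vCPU number */
};

/* The app-side check: treat two samples as comparable only if
 * aux says they were taken on the same CPU. */
static bool same_cpu(struct tsc_sample a, struct tsc_sample b)
{
    return a.aux == b.aux;
}

/* Elapsed ticks between two samples that passed the check. */
static uint64_t tsc_delta(struct tsc_sample a, struct tsc_sample b)
{
    assert(same_cpu(a, b));
    return b.tsc - a.tsc;
}
```

In a guest, aux holds the vCPU number, so same_cpu() keeps
returning true even after Xen has moved that vCPU to another
physical CPU (or another machine) between the two reads, and
the delta is accepted when it should not be.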

> [...] This makes HVM support a bit
> tricky because we need to save/restore guest/host TSC_AUX at every
> VMEXIT/VMENTRY. If both PV/HVM could put TSC_AUX writing in
> context_switch(), then things will become easier for HVM support.

If you are doing a full faithful implementation of
rdtscp (as if cpuid rdtscp bit is on), I agree this
is a problem.  If not, and the only use of TSC_AUX
is for the pvrdtscp algorithm, I think setting
TSC_AUX in __update_vcpu_system_time() is fine
because TSC_AUX is not part of a VM's context,
it is a communication of information from system
software (Xen) to applications.
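
As a rough illustration of that "communication" (a sketch
only; the struct, the function names and the cached tsc_khz
field are assumptions for illustration, not the actual
pvrdtscp code): the app compares the aux value returned by
rdtscp with the incarnation it last saw, and re-fetches its
time parameters only when the two differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* App-side cache of time parameters, tagged with the incarnation
 * they were fetched under. Hypothetical layout. */
struct pv_time_cache {
    bool     valid;
    uint32_t incarnation;
    uint64_t tsc_khz;
};

/* True if aux (as returned by rdtscp) no longer matches the cache,
 * e.g. the domain was saved/restored/migrated since the last fetch. */
static bool pv_cache_stale(const struct pv_time_cache *c, uint32_t aux)
{
    assert(c != NULL);
    return !c->valid || c->incarnation != aux;
}

/* Record freshly fetched parameters under the current incarnation. */
static void pv_cache_refresh(struct pv_time_cache *c, uint32_t aux,
                             uint64_t tsc_khz)
{
    assert(c != NULL);
    c->valid = true;
    c->incarnation = aux;
    c->tsc_khz = tsc_khz;
}
```

Note that the app only ever reads the value; it never writes
TSC_AUX itself.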

I expect that Keir will not support putting TSC_AUX
in the context switch code unless it is absolutely
necessary, as it is certainly expensive to read and
write to TSC_AUX and this cost will add to every
context switch of every VM even though very few will
actually use rdtscp/TSC_AUX.
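
If a write on that path did turn out to be necessary, one way
to limit the cost would be to remember the last value written
on each CPU and skip the wrmsr when nothing changed. A minimal
sketch, assuming Xen is the only writer of TSC_AUX (approach
(2)); the per-CPU shadow arrays, NR_CPUS_SKETCH and the counter
standing in for the real wrmsr are all invented for
illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS_SKETCH 4   /* hypothetical CPU count for the sketch */

static bool         aux_shadow_valid[NR_CPUS_SKETCH];
static uint32_t     aux_shadow[NR_CPUS_SKETCH]; /* last value per CPU */
static unsigned int wrmsr_calls;                /* counts fake MSR writes */

/* Stand-in for wrmsr(MSR_TSC_AUX, val, 0) on the given CPU. */
static void sketch_write_tsc_aux(unsigned int cpu, uint32_t val)
{
    aux_shadow[cpu] = val;
    aux_shadow_valid[cpu] = true;
    wrmsr_calls++;
}

/* Write the incarnation only if this CPU does not already hold it. */
static void set_tsc_aux_lazy(unsigned int cpu, uint32_t incarnation)
{
    assert(cpu < NR_CPUS_SKETCH);
    if (aux_shadow_valid[cpu] && aux_shadow[cpu] == incarnation)
        return;
    sketch_write_tsc_aux(cpu, incarnation);
}
```

This stops working as soon as a guest may write TSC_AUX itself
(approach (1)), since the shadow can then go out of date.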

So I think we need to decide first about approach (1),
the full faithful implementation of rdtscp.

> -----Original Message-----
> From: Xu, Dongxiao [mailto:dongxiao.xu@xxxxxxxxx]
> Sent: Thursday, December 10, 2009 6:23 PM
> To: Dan Magenheimer; Nakajima, Jun;
>     xen-devel@xxxxxxxxxxxxxxxxxxx; Keir Fraser
> Subject: RE: [Xen-devel] RE: Saving/Restoring IA32_TSC_AUX MSR
> 
> 
> Dan, 
>       Thanks for the reply; some comments below.
> 
> Best Regards,
> -- Dongxiao
> 
> Dan Magenheimer wrote:
> > Hi Dongxiao --
> > 
> > There are two approaches to adding rdtscp support:
> > 
> > 1) Faithful full implementation of rdtscp instruction
> > 2) Support pvrdtscp algorithm
> > 
> > For (1), you would enable the rdtscp bit in cpuid.  Then
> > on hardware that supports rdtscp, you would do context
> > switching of TSC_AUX.  On hardware that doesn't support
> > rdtscp, you would intercept the illegal instruction trap
> > and emulate the instruction.  (TSC_AUX emulation
> > could be handled "lazily", no need to do context
> > switch for that.)
> > 
> > BUT if you look at how TSC_AUX is used by a native
> > OS**, the OS sets TSC_AUX to each physical CPU number
> > so an application can easily determine if successive
> > rdtscp instructions were not executed on the same
> > processor.  (This was important on older processors
> > that did not have invariant TSC.)  Unfortunately,
> > on Xen, this mechanism is worthless and misleading
> > because the OS believes it is setting TSC_AUX to
> > a physical CPU number but it is actually setting
> > it to a virtual CPU number, and the physical CPU
> > number may change at any time due to scheduling
> > or migration.  So an app using rdtscp will get a
> > wrong answer.
> 
> However, for HVM we should keep its behavior the same as
> on a native machine. So if the hardware supports rdtscp, we will
> also support it in HVM; if not, we will not expose that bit in
> cpuid to the guest.
> 
> > 
> > As a result, I do NOT recommend (1) and do recommend
> > that Xen should continue to return zero for the rdtscp
> > bit in cpuid.
> > 
> > For (2), setting TSC_AUX in __update_vcpu_system_time()
> > is fine (I think).  On hardware that supports rdtscp, for HVM
> > you would need to ensure that the rdtscp instruction
> > works natively (even though the rdtscp bit in cpuid
> > is not turned on for the guest).  On hardware that
> > does not support rdtscp, you would intercept the illegal
> > instruction trap and call the existing code in
> > pv_soft_rdtsc().
> 
> Putting the write of the TSC_AUX MSR in __update_vcpu_system_time()
> has a problem: the hypervisor will overwrite the value from time
> to time (for example, at do_softirq()->local_time_calibration()),
> even if the value didn't change (currently the domain incarnation
> value only increases at save/restore/migration). This makes HVM
> support a bit tricky because we need to save/restore the guest/host
> TSC_AUX at every VMEXIT/VMENTRY. If both PV and HVM could put the
> TSC_AUX write in context_switch(), then things would become easier
> for HVM support. Do you have any ideas about it? Thanks!  :-)
> 
> > 
> > Does that make sense?
> > 
> > Thanks,
> > Dan
> > 
> > ** I've looked at RHEL5.  Windows actually always
> > returns 0 for TSC_AUX.
> > 
> >> -----Original Message-----
> >> From: Xu, Dongxiao [mailto:dongxiao.xu@xxxxxxxxx]
> >> Sent: Thursday, December 10, 2009 4:22 AM
> >> To: Dan Magenheimer; Nakajima, Jun;
> >>     xen-devel@xxxxxxxxxxxxxxxxxxx; Keir Fraser
> >> Subject: RE: [Xen-devel] RE: Saving/Restoring IA32_TSC_AUX MSR
> >> 
> >> 
> >> Hi, Dan,
> >>    I am now trying to add rdtscp support for Xen HVM guests.
> >>    I have some questions about your pvrdtscp patch. See below.
> >> 
> >> Dan Magenheimer wrote:
> >>> Hi Jun --
> >>> 
> >>>> But it's possible that multiple domains use the pvrdtscp
> >>>> algorithm, and the incarnation number is domain specific.
> >>> 
> >>> OK, I see.  The code for writing TSC_AUX is in
> >>> __update_vcpu_system_time() not in context switch.
> >> 
> >> Will you modify the place where the hypervisor writes the TSC_AUX
> >> MSR? In the current pvrdtscp logic, I think this MSR should be
> >> written at vcpu context switch. Also, this would make HVM support
> >> much easier because the MSR would not be modified by the
> >> hypervisor from time to time.
> >> 
> >>> 
> >>>> We also have the issue when adding RDTSCP support for
> >>>> HVM guests.
> >>> 
> >>> Only if you expose the rdtscp bit via cpuid.  This could
> >>> certainly be done but, as I said, is probably pointless.
> >>> (The pvrdtscp algorithm uses the instruction whether or
> >>> not the rdtscp bit is set in cpuid, since Xen emulates
> >>> it -- for PV domains only now -- if the physical machine
> >>> doesn't support the instruction.)
> >> 
> >> We are planning to add HVM support for RDTSCP, and the behavior
> >> of this instruction will follow the native way.
> >> This causes a problem: the RDTSCP instruction in an application
> >> behaves differently on PV and HVM domains. Do you have any
> >> comments about this? Thanks!
> >> 
> >> Thanks!
> >> Dongxiao
> >> 
> >>> 
> >>> Dan
> >>> 
> >>>> -----Original Message-----
> >>>> From: Nakajima, Jun [mailto:jun.nakajima@xxxxxxxxx]
> >>>> Sent: Wednesday, December 09, 2009 10:08 AM
> >>>> To: Dan Magenheimer; xen-devel@xxxxxxxxxxxxxxxxxxx
> >>>> Subject: RE: Saving/Restoring IA32_TSC_AUX MSR
> >>>> 
> >>>> 
> >>>> Dan Magenheimer wrote on Wed, 9 Dec 2009 at 08:59:59:
> >>>> 
> >>>>> Hi Jun --
> >>>>> 
> >>>> 
> >>>> Dan,
> >>>> 
> >>>>> Xen doesn't expose the TSC rdtscp bit so assumes that
> >>>>> no guests depend on it.  So no save/restore of TSC_AUX
> >>>>> is necessary.  Xen could provide support for the TSC
> >>>> 
> >>>> But it's possible that multiple domains use the pvrdtscp
> >>>> algorithm, and the incarnation number is domain specific. We
> >>>> also have the issue when adding RDTSCP support for HVM guests.
> >>>> 
> >>>>> rdtscp bit and allow a guest OS to manage TSC_AUX, but
> >>>>> the existing use of TSC_AUX by Linux would fail to
> >>>>> provide the desired result across migration, so there's
> >>>>> little point.  Also the pvrdtscp algorithm now assumes
> >>>>> that Xen itself is responsible for updating TSC_AUX
> >>>>> whenever a migration (across physical machines) occurs.
> >>>>> 
> >>>>> The #define for write_rdtscp_aux is from Linux source,
> >>>>> so I didn't change the code and define the constant.
> >>>>> 
> >>>>> Dan
> >>>>> 
> >>>>>> -----Original Message-----
> >>>>>> From: Nakajima, Jun [mailto:jun.nakajima@xxxxxxxxx]
> >>>>>> Sent: Wednesday, December 09, 2009 9:42 AM
> >>>>>> To: xen-devel@xxxxxxxxxxxxxxxxxxx
> >>>>>> Cc: Dan Magenheimer
> >>>>>> Subject: Saving/Restoring IA32_TSC_AUX MSR
> >>>>>> 
> >>>>>> 
> >>>>>> I see code like the following (in arch/x86/time.c), and am
> >>>>>> wondering how the IA32_TSC_AUX MSR is saved/restored at
> >>>>>> domain switch time.
> >>>>>> 
> >>>>>>     if ( (d->arch.tsc_mode == TSC_MODE_PVRDTSCP) &&
> >>>>>>          boot_cpu_has(X86_FEATURE_RDTSCP) )
> >>>>>>         write_rdtscp_aux(d->arch.incarnation);
> >>>>>> 
> >>>>>> BTW,
> >>>>>> 
> >>>>>> include/asm-x86/msr.h
> >>>>>> #define write_rdtscp_aux(val) wrmsr(0xc0000103, (val), 0)
> >>>>>> 
> >>>>>> We should write this as wrmsr(MSR_TSC_AUX, (val), 0) by adding
> >>>>>> +#define MSR_TSC_AUX           0xc0000103 /* Auxiliary TSC */
> >>>>>> in include/asm-x86/msr-index.h
> >>>>>> 
> >>>>>> Thanks,
> >>>>>> Jun
> >>>>>> ---
> >>>>>> Intel Open Source Technology Center
> >>>>>> 
> >>>>>> 
> >>>> 
> >>>> Jun
> >>>> ___
> >>>> Intel Open Source Technology Center
> >>>> 
> >>>> 
> >>>> 
> >>>> 
> >>> 
> >>> _______________________________________________
> >>> Xen-devel mailing list
> >>> Xen-devel@xxxxxxxxxxxxxxxxxxx
> >>> http://lists.xensource.com/xen-devel
