Dan,
Thanks for the reply. Some comments below.
Best Regards,
-- Dongxiao
Dan Magenheimer wrote:
> Hi Dongxiao --
>
> There are two approaches to adding rdtscp support:
>
> 1) Faithful full implementation of rdtscp instruction
> 2) Support pvrdtscp algorithm
>
> For (1), you would enable the rdtscp bit in cpuid. Then
> on hardware that supports rdtscp, you would do context
> switching of TSC_AUX. On hardware that doesn't support
> rdtscp, you would intercept the illegal instruction trap
> and emulate the instruction. (TSC_AUX emulation
> could be handled "lazily", no need to do context
> switch for that.)
>
> BUT if you look at how TSC_AUX is used by a native
> OS**, the OS sets TSC_AUX to each physical CPU number
> so an application can easily determine if successive
> rdtscp instructions were not executed on the same
> processor. (This was important on older processors
> that did not have invariant TSC.) Unfortunately,
> on Xen, this mechanism is worthless and misleading
> because the OS believes it is setting TSC_AUX to
> a physical CPU number but it is actually setting
> it to a virtual CPU number, and the physical CPU
> number may change at any time due to scheduling
> or migration. So an app using rdtscp will get a
> wrong answer.
However, for HVM we should keep the behavior the same as on a
native machine. So if the hardware supports rdtscp, we will also
support it in HVM; if not, we will not expose that bit in cpuid
to the guest.
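
To make the rule concrete, here is a rough sketch of the cpuid
filtering I have in mind. It is not actual Xen code: the function
name hvm_guest_ext_edx() and its parameters are made up for
illustration. The only hard fact it relies on is that RDTSCP is
reported in CPUID leaf 0x80000001, EDX bit 27.

```c
#include <assert.h>
#include <stdint.h>

/* CPUID leaf 0x80000001, EDX bit 27 reports RDTSCP support. */
#define CPUID_EXT_EDX_RDTSCP (UINT32_C(1) << 27)

/*
 * Hypothetical helper (not a Xen function): compute the EDX value an
 * HVM guest sees for leaf 0x80000001.
 *   host_edx   - EDX as reported by the physical CPU
 *   policy_edx - bits the toolstack/hypervisor is willing to expose
 * The RDTSCP bit is exposed only if the hardware really has it.
 */
uint32_t hvm_guest_ext_edx(uint32_t host_edx, uint32_t policy_edx)
{
    uint32_t guest_edx = policy_edx;

    if (!(host_edx & CPUID_EXT_EDX_RDTSCP))
        guest_edx &= ~CPUID_EXT_EDX_RDTSCP;

    /* The guest must never see a bit that the policy did not allow. */
    assert((guest_edx & ~policy_edx) == 0);
    return guest_edx;
}
```

With this, a guest on rdtscp-capable hardware sees the bit set, and
a guest on older hardware sees it clear, same as native.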
>
> As a result, I do NOT recommend (1) and do recommend
> that Xen should continue to return zero for the rdtscp
> bit in cpuid.
>
> For (2), setting TSC_AUX in __update_vcpu_system_time()
> is fine (I think). On hardware that supports rdtscp, for HVM
> you would need to ensure that the rdtscp instruction
> works natively (even though the rdtscp bit in cpuid
> is not turned on for the guest). On hardware that
> does not support rdtscp, you would intercept the illegal
> instruction trap and call the existing code in
> pv_soft_rdtsc().
Putting the write of the TSC_AUX MSR in __update_vcpu_system_time()
has a problem: the hypervisor will overwrite the value from time to
time (for example, at do_softirq()->local_time_calibration()), even if
the value didn't change (currently the domain incarnation value only
increases at save/restore/migration). This makes HVM support a bit
tricky because we need to save/restore the guest/host TSC_AUX at every
VMEXIT/VMENTRY. If both PV and HVM could put the TSC_AUX write in
context_switch(), then things would become easier for HVM support.
Do you have any thoughts on this? Thanks! :-)
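
A rough sketch of what I mean by writing TSC_AUX only at context
switch is below. It is a simulation, not hypervisor code: struct
fake_cpu, struct fake_vcpu and ctxt_switch_tsc_aux() are names I
made up, and the "MSR" is just a struct field where real code would
do a wrmsr to 0xc0000103. The point is that the value is loaded
once per switch, and not at all if it is already in place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for one physical CPU's IA32_TSC_AUX MSR (0xc0000103). */
struct fake_cpu {
    uint32_t tsc_aux;     /* value currently loaded in the "MSR" */
    unsigned int writes;  /* number of simulated wrmsr operations */
};

/* Hypothetical per-vcpu state: the TSC_AUX value this vcpu should see
 * (the domain incarnation for pvrdtscp, or the guest's own value for
 * HVM). */
struct fake_vcpu {
    uint32_t tsc_aux;
};

/*
 * Called on the context-switch path: load the incoming vcpu's value,
 * skipping the write when the same value is already loaded.
 */
void ctxt_switch_tsc_aux(struct fake_cpu *cpu, const struct fake_vcpu *next)
{
    assert(cpu != NULL && next != NULL);

    if (cpu->tsc_aux == next->tsc_aux)
        return;             /* nothing to do, avoid a redundant write */

    cpu->tsc_aux = next->tsc_aux;
    cpu->writes++;
}
```

Since nothing else touches the value between switches, the HVM
code would not need to save/restore it around every
VMEXIT/VMENTRY.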
>
> Does that make sense?
>
> Thanks,
> Dan
>
> ** I've looked at RHEL5. Windows actually always
> returns 0 for TSC_AUX.
>
>> -----Original Message-----
>> From: Xu, Dongxiao [mailto:dongxiao.xu@xxxxxxxxx]
>> Sent: Thursday, December 10, 2009 4:22 AM
>> To: Dan Magenheimer; Nakajima, Jun;
>> xen-devel@xxxxxxxxxxxxxxxxxxx; Keir
>> Fraser
>> Subject: RE: [Xen-devel] RE: Saving/Restoring IA32_TSC_AUX MSR
>>
>>
>> Hi, Dan,
>> I am now trying to add the rdtscp support for Xen HVM guest.
>> I have some questions about your pvrdtscp patch. See below.
>>
>> Dan Magenheimer wrote:
>>> Hi Jun --
>>>
>>>> But it's possible that multiple domains use the pvrdtscp
>>>> algorithm, and the incarnation number is domain specific.
>>>
>>> OK, I see. The code for writing TSC_AUX is in
>>> __update_vcpu_system_time() not in context switch.
>>
>> Will you modify the place where the hypervisor writes the TSC_AUX MSR?
>> In the current pvrdtscp logic, I think this MSR should be written
>> during vcpu context switch. Also, this will make HVM support much easier
>> because that MSR would not be modified by the hypervisor from time to time.
>>
>>>
>>>> We also have the issue when adding RDTSCP support for
>>>> HVM guests.
>>>
>>> Only if you expose the rdtscp bit via cpuid. This could
>>> certainly be done but, as I said, is probably pointless.
>>> (The pvrdtscp algorithm uses the instruction whether or
>>> not the rdtscp bit is set in cpuid, since Xen emulates
>>> it -- for PV domains only now -- if the physical machine
>>> doesn't support the instruction.)
>>
>> We are planning to add HVM support for RDTSCP, and the behavior of
>> this instruction will follow the native way.
>> This causes a problem: the RDTSCP instruction in an application
>> behaves differently on PV and HVM domains. Do you have any comment
>> about this? Thanks!
>>
>> Thanks!
>> Dongxiao
>>
>>>
>>> Dan
>>>
>>>> -----Original Message-----
>>>> From: Nakajima, Jun [mailto:jun.nakajima@xxxxxxxxx]
>>>> Sent: Wednesday, December 09, 2009 10:08 AM
>>>> To: Dan Magenheimer; xen-devel@xxxxxxxxxxxxxxxxxxx
>>>> Subject: RE: Saving/Restoring IA32_TSC_AUX MSR
>>>>
>>>>
>>>> Dan Magenheimer wrote on Wed, 9 Dec 2009 at 08:59:59:
>>>>
>>>>> Hi Jun --
>>>>>
>>>>
>>>> Dan,
>>>>
>>>>> Xen doesn't expose the TSC rdtscp bit so assumes that
>>>>> no guests depend on it. So no save/restore of TSC_AUX
>>>>> is necessary. Xen could provide support for the TSC
>>>>
>>>> But it's possible that multiple domains use the pvrdtscp
>>>> algorithm, and the incarnation number is domain specific. We
>>>> also have the issue when adding RDTSCP support for HVM guests.
>>>>
>>>>> rdtscp bit and allow a guest OS to manage TSC_AUX, but
>>>>> the existing use of TSC_AUX by Linux would fail to
>>>>> provide the desired result across migration, so there's
>>>>> little point. Also the pvrdtscp algorithm now assumes
>>>>> that Xen itself is responsible for updating TSC_AUX
>>>>> whenever a migration (across physical machines) occurs.
>>>>>
>>>>> The #define for write_rdtscp_aux is from Linux source,
>>>>> so I didn't change the code and define the constant.
>>>>>
>>>>> Dan
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Nakajima, Jun [mailto:jun.nakajima@xxxxxxxxx]
>>>>>> Sent: Wednesday, December 09, 2009 9:42 AM
>>>>>> To: xen-devel@xxxxxxxxxxxxxxxxxxx
>>>>>> Cc: Dan Magenheimer
>>>>>> Subject: Saving/Restoring IA32_TSC_AUX MSR
>>>>>>
>>>>>>
>>>>>> I see code like the following (in arch/x86/time.c), and am wondering
>>>>>> how the IA32_TSC_AUX MSR is saved/restored at domain switch time.
>>>>>>
>>>>>> if ( (d->arch.tsc_mode == TSC_MODE_PVRDTSCP) &&
>>>>>> boot_cpu_has(X86_FEATURE_RDTSCP) )
>>>>>> write_rdtscp_aux(d->arch.incarnation);
>>>>>>
>>>>>> BTW,
>>>>>>
>>>>>> include/asm-x86/msr.h
>>>>>> #define write_rdtscp_aux(val) wrmsr(0xc0000103, (val), 0)
>>>>>>
>>>>>> We should write like wrmsr(MSR_TSC_AUX, (val), 0) by adding
>>>>>> +#define MSR_TSC_AUX 0xc0000103 /* Auxiliary TSC */
>>>>>> in include/asm-x86/msr-index.h
>>>>>>
>>>>>> Thanks,
>>>>>> Jun
>>>>>> ---
>>>>>> Intel Open Source Technology Center
>>>>>>
>>>>>>
>>>>
>>>> Jun
>>>> ___
>>>> Intel Open Source Technology Center
>>>>
>>>>
>>>>
>>>>
>>>
>>> _______________________________________________
>>> Xen-devel mailing list
>>> Xen-devel@xxxxxxxxxxxxxxxxxxx
>>> http://lists.xensource.com/xen-devel