xen-devel

[Xen-devel] RE: TSC scaling and softtsc reprise, and PROPOSAL

To: Dan Magenheimer <dan.magenheimer@xxxxxxxxxx>, Keir Fraser <keir.fraser@xxxxxxxxxxxxx>, "Xen-Devel (E-mail)" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: [Xen-devel] RE: TSC scaling and softtsc reprise, and PROPOSAL
From: "Zhang, Xiantao" <xiantao.zhang@xxxxxxxxx>
Date: Tue, 28 Jul 2009 08:55:46 +0800
Cc: Ian Pratt <Ian.Pratt@xxxxxxxxxxxxx>, "Dong, Eddie" <eddie.dong@xxxxxxxxx>, John Levon <levon@xxxxxxxxxxxxxxxxx>

Hi Dan,
        Sorry for the late reply! See my comments below.
> 
> Thanks very much for the additional detail on the 10%
> performance loss.  What is this oltp benchmark?  Is
> it available for others to run?  Also is the rdtsc
> rate 120000/sec on EACH processor?

The OLTP benchmark is one of sysbench's test modes; you can get sysbench 
through the following link:
http://sysbench.sourceforge.net/

We configured only one virtual processor per VM (so the rdtsc rate above is 
for a single processor), and I don't know whether the OLTP test can use two 
virtual processors.

> 
> Assuming a 3GHz machine, your results seem to show that
> emulating a rdtsc with softtsc takes about 2500 cycles.
> This agrees with my approximation of about 1 usec.
> 
> Have you analyzed where this 2500 cycles is being used?
> My suggestion about performance optimization was not
> to try a different algorithm but to see if it is possible
> to code the existing algorithm much faster using a
> special trap path and assembly code. (We called this
> a "fast path" on Xen/ia64.)  Even if the 2500 cycles
> can be cut in half, that would be a big win.

There is no fast path for emulating rdtsc on the x86 side; the main cost 
should come from the hardware context switch (the VM exit and entry around 
each trap). Also, I ran this benchmark on an old machine, so the cost should 
be sharply reduced on the latest processors, though I haven't tested those.
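
By the way, if anyone wants to reproduce the per-rdtsc cost number on newer 
hardware, here is a minimal user-space sketch (my illustration, not the 
benchmark we used) that can be run inside the guest with and without softtsc:

#include <stdio.h>
#include <stdint.h>

/* Read the TSC; with rdtsc exiting enabled, each call pays the full
 * VM-exit/VM-entry round trip, which is the cost being discussed. */
static inline uint64_t rdtsc(void)
{
    uint32_t lo, hi;
    __asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi));
    return ((uint64_t)hi << 32) | lo;
}

int main(void)
{
    const int iters = 1000000;
    uint64_t start = rdtsc();
    for (int i = 0; i < iters; i++)
        (void)rdtsc();          /* volatile asm, so not optimized away */
    uint64_t delta = rdtsc() - start;
    /* Result is in guest TSC cycles; with softtsc these follow the
     * guest's (possibly emulated) frequency. */
    printf("~%llu cycles per rdtsc\n", (unsigned long long)(delta / iters));
    return 0;
}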

> Am I correct in reading that your patch is ONLY for
> HVM guests?  If so, since some (maybe most) workloads
> that rely on tsc for transaction timestamps will be
> PV, your patch doesn't solve the whole problem.

Yes, this patch is only for HVM guests, because only HVM guests can use the 
TSC-offset feature (one of the VT features), and I also don't think PV guests 
need it.
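
For reference, the idea behind the TSC-offset feature is simply that the 
hardware adds a per-VCPU constant to every guest RDTSC result, so no trap is 
needed. A rough illustration of the computation (my sketch, not Xen's actual 
code):

#include <stdint.h>

/* The hypervisor computes this once (e.g. at VCPU setup, or after
 * migration) and writes it into the VMCS TSC_OFFSET field; hardware then
 * returns host_tsc + offset for every guest RDTSC with no VM exit.
 * Note an offset can only shift the TSC, not scale it, which is why a
 * frequency mismatch after migration still needs rdtsc exiting. */
static uint64_t compute_tsc_offset(uint64_t guest_tsc, uint64_t host_tsc)
{
    return guest_tsc - host_tsc;  /* wraps mod 2^64, effectively signed */
}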

> Can someone at Intel confirm or deny that VMware ESX
> always traps rdtsc?  If so, it is probably not hard
> to write an application that works on VMware ESX (on
> certain hardware) but fails on Xen.
> 
>> -----Original Message-----
>> From: Zhang, Xiantao [mailto:xiantao.zhang@xxxxxxxxx]
>> Sent: Tuesday, July 21, 2009 11:05 PM
>> To: Keir Fraser; Dan Magenheimer; Xen-Devel (E-mail)
>> Cc: John Levon; Ian Pratt; Dong, Eddie
>> Subject: RE: TSC scaling and softtsc reprise, and PROPOSAL
>> 
>> 
>> Keir Fraser wrote:
>>> On 20/07/2009 21:02, "Dan Magenheimer" <dan.magenheimer@xxxxxxxxxx>
>>> wrote: 
>>> 
>>>> I agree that if the performance is *really bad*, the default
>>>> should not change.  But I think we are still flying on rumors
>>>> of data collected years ago in a very different world, and
>>>> the performance data should be re-collected to prove that
>>>> it is still *really bad*.  If the degradation is a fraction
>>>> of a percent even in worst case analysis, I think the default
>>>> should be changed so that correctness prevails.
>>>> 
>>>> Why now?  Because more and more real-world applications are
>>>> built on top of multi-core platforms where TSC is reliable
>>>> and (by far) the best timesource.  And I think(?) we all agree
>>>> now that softtsc is the only way to guarantee correctness
>>>> in a virtual environment.
>>> 
>>> So how bad is the non-softtsc default mode anyway? Our default
>>> timer_mode has guest TSCs track host TSC (plus a fixed per-vcpu
>>> offset that defaults to having all vcpus of a domain aligned to
>>> vcpu0 boot = zero tsc). 
>>> 
>>> Looking at the email thread you cited, all I see is someone from
>>> Intel saying something about how their code to improve TSC
>>> consistency across migration avoids RDTSC exiting where possible
>>> (which I do not see -- if the TSC rates across the hosts do not
>>> match closely then RDTSC exiting is enabled forever for that
>>> domain), and, most bizarrely, that their 'solution' may have a tsc
>>> drift >10^5 cycles. Where did this huge number come from? What
>>> solution is being talked about, and under what conditions might the
>>> claim hold? Who knows! 
>> 
>> We ran an experiment to measure the performance impact of
>> softtsc using the OLTP workload, and we saw ~10% performance
>> loss when the rdtsc rate is more than 120,000/second. We also
>> ran some other tests, whose results show that every 10,000
>> rdtsc instructions per second cost ~1% of performance. So if
>> the rdtsc rate is not that high (no more than ~10,000/second),
>> the performance impact can be ignored.
>> 
>> We also introduced some performance optimization solutions,
>> but as we noted before, they may introduce some TSC drift
>> (10^5~10^6 cycles) between virtual processors in SMP cases.
>> One solution is described below. For example, suppose the guest
>> is migrated from a machine with a low TSC frequency (low_freq)
>> to one with a high TSC frequency (high_freq). The low frequency
>> is the guest's expected frequency (exp_freq), and any
>> optimization solution should let the guest believe it is still
>> running on a machine with an exp_freq TSC, to avoid possible
>> issues caused by a faster TSC.
>> 
>> 1. In this solution, we only guarantee that the guest's TSC
>> increases monotonically and that its average frequency equals
>> the guest's expected frequency (exp_freq) over a fixed time
>> slot (e.g. ~1ms).
>> 2. To keep it simple, let the guest run on the high_freq TSC
>> (using the hardware TSC-offset feature, with no performance
>> loss) for 1ms, then enable rdtsc exiting and use the
>> trap-and-emulate method (which suffers the performance loss) to
>> let the guest run on a *VERY VERY* low-frequency TSC (e.g.
>> 0.2GHz) for some time; the specific duration follows from
>> requiring that the average TSC frequency equal exp_freq
>> (frequencies in GHz, time in ms):
>>              time = (high_freq - low_freq) / (low_freq - 0.2)
>> 
>> 3. If the guest migrates from a 2.4GHz machine to a 3.0GHz
>> machine, the guest only suffers the performance loss for
>> (3.0 - 2.4) / (2.4 - 0.2) == ~0.273ms out of the total
>> 1ms + 0.273ms slot; that is to say, for most of the time the
>> guest can leverage the hardware TSC-offset feature and avoid
>> the performance loss.
>> 
>> 4. Over the full 1.273ms, we can say the guest's TSC frequency
>> is emulated to its expected value through hardware/software
>> co-emulation, and the performance loss is very minor compared
>> with the pure softtsc solution.
>> 5. But at the same time, since each vcpu's TSC is emulated
>> independently for an SMP guest, a drift may build up between
>> vcpus, and the drift's range should be 10^5~10^6 cycles; we
>> don't know whether such drift between vcpus can bring other
>> side-effects. At least one side-effect we can identify: an
>> application running on one vcpu may see a backward TSC value
>> after it migrates to another vcpu. Not sure this is a real
>> problem, but it exists in theory.
>> 
>> Attached is the draft patch implementing this solution, based
>> on an old changeset (#Cset19591).
>> 
>> Xiantao
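
(For concreteness, the arithmetic in steps 2 and 3 of my quoted mail can be 
checked with the small sketch below; the numbers are the example values from 
above, with frequencies in GHz and durations in ms.)

#include <stdio.h>

int main(void)
{
    double high_freq = 3.0;  /* TSC frequency of the new host (GHz) */
    double exp_freq  = 2.4;  /* guest's expected frequency == low_freq (GHz) */
    double slow_freq = 0.2;  /* emulated rate while rdtsc exiting is on (GHz) */

    /* Run at high_freq for 1ms, then at slow_freq for t ms, choosing t
     * so the average frequency over the whole slot equals exp_freq:
     *     (high_freq * 1 + slow_freq * t) / (1 + t) == exp_freq
     * Solving for t gives the formula from step 2: */
    double t = (high_freq - exp_freq) / (exp_freq - slow_freq);

    printf("slow window: %.3fms out of each %.3fms slot\n", t, 1.0 + t);
    /* Prints ~0.273ms out of ~1.273ms, matching step 3. */
    return 0;
}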


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel