
[Xen-devel] Re: [PATCH] rendezvous-based local time calibration WOW!

I'll take a look and see if it can be worked out for 3.3.0. It'd be nicer
than clocksource=tsc.

 -- Keir

On 4/8/08 16:24, "Dan Magenheimer" <dan.magenheimer@xxxxxxxxxx> wrote:

> OK, how about this version.  The rendezvous only collects
> the key per-cpu time data then sets up a per-cpu 1ms timer
> to later update the timestamp record and vcpu system time,
> so neither should have racing issues.
> I've only run it for about an hour but still haven't seen
> any skew over 600nsec so apparently it is the collection of
> the key time data that must be closely synchronized (probably
> to ensure the slope is correct) while exact synchronization
> of setting the timestamp records is less important.
> Note that I'm not positive I got the clocksource=tsc part
> correct... but am interested in your opinion on whether
> clocksource=tsc can now be eliminated anyway (as the
> main reason I pushed for it was because of unacceptable
> skew which with this patch appears to be fixed).
> Signed-off-by: Dan Magenheimer <dan.magenheimer@xxxxxxxxxx>
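To illustrate the two-phase scheme described above, here is a minimal sketch in C. It is not code from the patch. It assumes each CPU records its own TSC at the rendezvous while the master's platform time is read once and shared by all CPUs. The names calib_sample, cpu_calib, calib_update and calib_system_time are hypothetical, and the plain integer division stands in for the fixed-point scaling a real implementation would likely use.

```c
#include <assert.h>
#include <stdint.h>

/* What one CPU captures at a rendezvous (hypothetical layout). */
struct calib_sample {
    uint64_t local_tsc;  /* this CPU's TSC, read at the rendezvous */
    uint64_t master_ns;  /* platform time, read once by the master */
};

/* Per-CPU record filled in later, e.g. from a per-CPU timer. */
struct cpu_calib {
    uint64_t tsc_stamp;  /* local TSC at the latest rendezvous */
    uint64_t ns_stamp;   /* master time at the latest rendezvous */
    uint64_t delta_tsc;  /* local ticks between the last two rendezvous */
    uint64_t delta_ns;   /* master ns between the last two rendezvous */
};

/* Phase 2: derive the stamp and the slope (delta_ns / delta_tsc)
 * from two consecutive rendezvous samples. */
static void calib_update(struct cpu_calib *c,
                         struct calib_sample prev,
                         struct calib_sample curr)
{
    c->tsc_stamp = curr.local_tsc;
    c->ns_stamp  = curr.master_ns;
    c->delta_tsc = curr.local_tsc - prev.local_tsc;
    c->delta_ns  = curr.master_ns - prev.master_ns;
}

/* Extrapolate system time from the stamp along the slope.
 * Assumes delta_tsc != 0 and that the product does not overflow
 * 64 bits; a real implementation would need to guard both. */
static uint64_t calib_system_time(const struct cpu_calib *c,
                                  uint64_t tsc_now)
{
    return c->ns_stamp +
           (tsc_now - c->tsc_stamp) * c->delta_ns / c->delta_tsc;
}
```

Because every CPU pairs its own TSC with the same master_ns value, two CPUs whose TSCs differ in rate and offset still extrapolate to the same system time, provided both samples were captured at the same instants. That is consistent with the observation that the collection step, not the later record update, is what needs tight synchronization.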
>> -----Original Message-----
>> From: Keir Fraser [mailto:keir.fraser@xxxxxxxxxxxxx]
>> Sent: Sunday, August 03, 2008 11:25 AM
>> To: dan.magenheimer@xxxxxxxxxx; Xen-Devel (E-mail)
>> Cc: Ian Pratt; Dave Winchell
>> Subject: Re: [PATCH] rendezvous-based local time calibration WOW!
>> It's not safe to poke a new timestamp record from an interrupt handler
>> (which is what the smp_call_function() callback functions are). Users of
>> the timestamp records (e.g., get_s_time) need local_irq_save/restore() or
>> an equivalent of the Linux seqlock. The latter is likely faster. I'm
>> dubious about update_vcpu_system_time() from an interrupt handler too. It
>> needs thought about how it might race with a context switch (change of
>> 'current') or if it interrupts an existing invocation of
>> update_vcpu_system_time().
>>  -- Keir
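For readers unfamiliar with the seqlock idea mentioned above, here is a minimal sketch using C11 atomics. It is neither the Linux seqlock API nor Xen code. The struct stamp_record and the stamp_* helpers are hypothetical names, and the sketch assumes a single writer per record. A reader that interrupts the writer on the same CPU could never succeed, so the read helper reports failure instead of spinning.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical timestamp record guarded by a sequence counter. */
struct stamp_record {
    atomic_uint seq;                 /* even: stable, odd: being updated */
    _Atomic uint64_t local_tsc;      /* TSC captured at calibration */
    _Atomic uint64_t system_time_ns; /* system time at that TSC */
};

/* Writer side: make seq odd before touching the data. */
static void stamp_write_begin(struct stamp_record *r)
{
    unsigned s = atomic_load_explicit(&r->seq, memory_order_relaxed);
    atomic_store_explicit(&r->seq, s + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
}

/* Writer side: make seq even again once the data is complete. */
static void stamp_write_end(struct stamp_record *r)
{
    unsigned s = atomic_load_explicit(&r->seq, memory_order_relaxed);
    atomic_store_explicit(&r->seq, s + 1, memory_order_release);
}

static void stamp_write(struct stamp_record *r, uint64_t tsc, uint64_t ns)
{
    stamp_write_begin(r);
    atomic_store_explicit(&r->local_tsc, tsc, memory_order_relaxed);
    atomic_store_explicit(&r->system_time_ns, ns, memory_order_relaxed);
    stamp_write_end(r);
}

/* Reader side: returns true only if the snapshot is consistent,
 * i.e. seq was even and did not change while the data was read. */
static bool stamp_try_read(struct stamp_record *r,
                           uint64_t *tsc, uint64_t *ns)
{
    unsigned s1 = atomic_load_explicit(&r->seq, memory_order_acquire);
    *tsc = atomic_load_explicit(&r->local_tsc, memory_order_relaxed);
    *ns  = atomic_load_explicit(&r->system_time_ns, memory_order_relaxed);
    atomic_thread_fence(memory_order_acquire);
    unsigned s2 = atomic_load_explicit(&r->seq, memory_order_relaxed);
    return s1 == s2 && (s1 & 1u) == 0;
}
```

A caller in the position of get_s_time would loop on stamp_try_read() until it returns true. Readers never write shared state and do not need to disable interrupts, which is the likely reason this approach is expected to be faster than local_irq_save/restore().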
>> On 3/8/08 17:50, "Dan Magenheimer" <dan.magenheimer@xxxxxxxxxx> wrote:
>>> The synchronization of local_time_calibration (l_t_c) via
>>> round-to-nearest-epoch provided some improvement, but I was
>>> still seeing skew up to 16usec and higher.  I measured the
>>> temporal distance between the rounded epoch and when l_t_c
>>> was actually running to ensure there wasn't some kind of
>>> bug and found that l_t_c was running up to 150us after the
>>> round-epoch and sometimes up to 50us before.  I guess this
>>> is the granularity of setting a Xen timer.  While it seemed
>>> that +/- 100us shouldn't cause that much skew, I finally
>>> decided to try synchronization-via-rendezvous, as suggested
>>> by Ian here:
>>> http://lists.xensource.com/archives/html/xen-devel/2008-07/msg01074.html
>>> http://lists.xensource.com/archives/html/xen-devel/2008-07/msg01080.html
>>> The result is phenomenal... using this approach (in attached
>>> patch), I have yet to see a skew exceed 1usec!!!  So this is
>>> about a 10-fold increase in accuracy vs the rounded-epoch
>>> method and about 20-fold over the one-epoch-from-NOW() method.
>>> The platform time is now read once for all processors rather
>>> than once per processor.  (Actually, it is read once again
>>> in platform_time_calibration()... by "inlining" that routine
>>> into master_local_time_calibration() that extra read can
>>> be -- and probably should be -- avoided too.)
>>> It may be too late to get this into 3.3.0 but, if so, please
>>> consider it asap for 3.3.1 rather than just xen-unstable/3.4.
>>> Dan
>>> ===================================
>>> Thanks... for the memory
>>> I really could use more / My throughput's on the floor
>>> The balloon is flat / My swap disk's fat / I've OOM's in store
>>> Overcommitted so much
>>> (with apologies to the late great Bob Hope)
