
[Xen-devel] RE: [PATCH] rendezvous-based local time calibration WOW!



After two hours of constant samples with c/s 18229, max
skew is at 251ns!  That's 70-150x better than I was
measuring just a couple of weeks ago.  YMMV of course.

If you are looking for another marketing-speak bullet for
the 4.0 release announcement, you can call this:

* Greatly improved precision for time-sensitive SMP VMs

or as I am subject to American hyperbole:

* Dramatically improved precision for time-sensitive SMP VMs

Thanks again!
Dan

> -----Original Message-----
> From: Dan Magenheimer [mailto:dan.magenheimer@xxxxxxxxxx]
> Sent: Monday, August 04, 2008 11:37 AM
> To: 'Keir Fraser'; 'Xen-Devel (E-mail)'
> Cc: 'Ian Pratt'; 'Dave Winchell'
> Subject: RE: [PATCH] rendezvous-based local time calibration WOW!
> 
> 
> Looks good to me (and much cleaner).  I've booted it and
> will leave it running for a few hours.
> 
> Thanks!
> Dan
> 
> > -----Original Message-----
> > From: Keir Fraser [mailto:keir.fraser@xxxxxxxxxxxxx]
> > Sent: Monday, August 04, 2008 11:10 AM
> > To: dan.magenheimer@xxxxxxxxxx; Xen-Devel (E-mail)
> > Cc: Ian Pratt; Dave Winchell
> > Subject: Re: [PATCH] rendezvous-based local time calibration WOW!
> > 
> > 
> > Applied as c/s 18229. I rewrote it quite a bit, although the principle
> > remains the same.
> > 
> >  -- Keir
> > 
> > On 4/8/08 16:24, "Dan Magenheimer" <dan.magenheimer@xxxxxxxxxx> wrote:
> > 
> > > OK, how about this version.  The rendezvous only collects
> > > the key per-cpu time data then sets up a per-cpu 1ms timer
> > > to later update the timestamp record and vcpu system time,
> > > so neither should have racing issues.
> > >
> > > I've only run it for about an hour but still haven't seen
> > > any skew over 600nsec so apparently it is the collection of
> > > the key time data that must be closely synchronized (probably
> > > to ensure the slope is correct) while exact synchronization
> > > of setting the timestamp records is less important.
> > >
> > > Note that I'm not positive I got the clocksource=tsc part
> > > correct... but am interested in your opinion on whether
> > > clocksource=tsc can now be eliminated anyway (as the
> > > main reason I pushed for it was because of unacceptable
> > > skew which with this patch appears to be fixed).
> > >
> > > Signed-off-by: Dan Magenheimer <dan.magenheimer@xxxxxxxxxx>
> > >
> > >> -----Original Message-----
> > >> From: Keir Fraser [mailto:keir.fraser@xxxxxxxxxxxxx]
> > >> Sent: Sunday, August 03, 2008 11:25 AM
> > >> To: dan.magenheimer@xxxxxxxxxx; Xen-Devel (E-mail)
> > >> Cc: Ian Pratt; Dave Winchell
> > >> Subject: Re: [PATCH] rendezvous-based local time calibration WOW!
> > >>
> > >>
> > >> It's not safe to poke a new timestamp record from an interrupt handler
> > >> (which is what the smp_call_function() callback functions are). Users
> > >> of the timestamp records (e.g., get_s_time) need
> > >> local_irq_save/restore() or an equivalent of the Linux seqlock. The
> > >> latter is likely faster. I'm dubious about update_vcpu_system_time()
> > >> from an interrupt handler too. It needs thought about how it might
> > >> race with a context switch (change of 'current') or if it interrupts
> > >> an existing invocation of update_vcpu_system_time().
> > >>
> > >>  -- Keir
> > >>
> > >> On 3/8/08 17:50, "Dan Magenheimer" <dan.magenheimer@xxxxxxxxxx> wrote:
> > >>
> > >>> The synchronization of local_time_calibration (l_t_c) via
> > >>> round-to-nearest-epoch provided some improvement, but I was
> > >>> still seeing skew up to 16usec and higher.  I measured the
> > >>> temporal distance between the rounded-epoch vs when l_t_c
> > >>> was actually running to ensure there wasn't some kind of
> > >>> bug and found that l_t_c was running up to 150us after the
> > >>> rounded-epoch and sometimes up to 50us before.  I guess this
> > >>> is the granularity of setting a Xen timer.  While it seemed
> > >>> that +/- 100us shouldn't cause that much skew, I finally
> > >>> decided to try synchronization-via-rendezvous, as suggested
> > >>> by Ian here:
> > >>>
> > >>>
> > >>> http://lists.xensource.com/archives/html/xen-devel/2008-07/msg01074.html
> > >>> http://lists.xensource.com/archives/html/xen-devel/2008-07/msg01080.html
> > >>>
> > >>> The result is phenomenal... using this approach (in attached
> > >>> patch), I have yet to see a skew exceed 1usec!!!  So this is
> > >>> about a 10-fold increase in accuracy vs the rounded-epoch
> > >>> method and about 20-fold over the one-epoch-from-NOW() method.
> > >>>
> > >>> The platform time is now read once for all processors rather
> > >>> than once per processor.  (Actually, it is read once again
> > >>> in platform_time_calibration()... by "inlining" that routine
> > >>> into master_local_time_calibration() that extra read can
> > >>> be -- and probably should be -- avoided too.)
> > >>>
> > >>> It may be too late to get this into 3.3.0 but, if so, please
> > >>> consider it asap for 3.3.1 rather than just xen-unstable/3.4.
> > >>>
> > >>> Dan
> > >>>
> > >>> ===================================
> > >>> Thanks... for the memory
> > >>> I really could use more / My throughput's on the floor
> > >>> The balloon is flat / My swap disk's fat / I've OOM's in store
> > >>> Overcommitted so much
> > >>> (with apologies to the late great Bob Hope)
>
>
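
For readers following the thread: Keir's remark that users of the
timestamp records need "an equivalent of the Linux seqlock" can be
illustrated with a minimal sketch like the one below. It is not code
from the patch or from Xen. The struct layout and the names
time_record, record_write and record_read are hypothetical, and C11
atomics stand in for whatever primitives the hypervisor would
actually use. The idea is only that the writer makes the sequence
counter odd while it updates the record, and a reader retries if it
saw an odd counter or the counter changed underneath it.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical per-CPU timestamp record; field names are illustrative
 * and are not taken from Xen. */
struct time_record {
    atomic_uint seq;                   /* even: stable, odd: write in progress */
    _Atomic uint64_t local_tsc_stamp;  /* TSC value at calibration */
    _Atomic uint64_t stime_stamp;      /* system time (ns) at calibration */
};

/* Writer: make seq odd, update the fields, then make seq even again. */
static void record_write(struct time_record *r, uint64_t tsc, uint64_t stime)
{
    unsigned s = atomic_load_explicit(&r->seq, memory_order_relaxed);

    atomic_store_explicit(&r->seq, s + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&r->local_tsc_stamp, tsc, memory_order_relaxed);
    atomic_store_explicit(&r->stime_stamp, stime, memory_order_relaxed);
    atomic_store_explicit(&r->seq, s + 2, memory_order_release);
}

/* Reader: retry until both fields were read under one even seq value. */
static void record_read(struct time_record *r, uint64_t *tsc, uint64_t *stime)
{
    unsigned s0, s1;

    do {
        s0 = atomic_load_explicit(&r->seq, memory_order_acquire);
        *tsc = atomic_load_explicit(&r->local_tsc_stamp, memory_order_relaxed);
        *stime = atomic_load_explicit(&r->stime_stamp, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        s1 = atomic_load_explicit(&r->seq, memory_order_relaxed);
    } while ((s0 & 1u) || s0 != s1);
}
```

With this shape, a reader that is interrupted by a writer on the same
CPU sees the counter change and retries, without having to disable
interrupts around every read.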

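The two-phase scheme Dan describes in his 4/8/08 message (collect the
key per-cpu time data in the rendezvous, then update the timestamp
record later from a per-cpu timer) can be sketched roughly as below.
This is an illustration only, not the patch or c/s 18229. NR_CPUS, the
struct layouts and the function names are assumptions made for the
example, and the TSC and platform time values are passed in as plain
arguments rather than read from hardware.

```c
#include <stdint.h>

#define NR_CPUS 4   /* illustrative CPU count, not a Xen constant */

/* Data captured during the rendezvous (hypothetical layout). */
struct calib_snapshot {
    uint64_t local_tsc;     /* this CPU's TSC at the rendezvous */
    uint64_t master_stime;  /* platform time, read once for all CPUs */
    int pending;            /* set until the deferred update has run */
};

/* Published record used to extrapolate system time (hypothetical layout). */
struct time_stamp {
    uint64_t tsc_stamp;
    uint64_t stime_stamp;
};

static struct calib_snapshot snap[NR_CPUS];
static struct time_stamp stamp[NR_CPUS];

/* Phase 1: runs on every CPU inside the rendezvous; it only records data
 * and does not touch the published timestamp record. */
static void rendezvous_collect(int cpu, uint64_t tsc_now, uint64_t platform_stime)
{
    snap[cpu].local_tsc = tsc_now;
    snap[cpu].master_stime = platform_stime;
    snap[cpu].pending = 1;
}

/* Phase 2: runs later from a per-CPU timer, outside the rendezvous
 * callback, and publishes the snapshot taken in phase 1. */
static void deferred_update(int cpu)
{
    if (!snap[cpu].pending)
        return;
    stamp[cpu].tsc_stamp = snap[cpu].local_tsc;
    stamp[cpu].stime_stamp = snap[cpu].master_stime;
    snap[cpu].pending = 0;
}
```

The point of the split is that every CPU pairs its own TSC reading
with the same platform time value, while the write to the record that
readers consult happens later, away from the interrupt-context
callback.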


