
RE: Xen system skew MUCH worse than tsc skew (was RE: [Xen-devel] RE: [PATCH] record max stime skew (was RE: [PATCH] strictly increasing hvm guest time))



> If you want to test this theory, you can easily get all the CPUs to
> recalibrate at the same instant, though it's a bit expensive:
> 
> Get one CPU to issue an smp_call_function on all CPUs (including
> itself). The called function should atomic_inc a variable and then
> spin waiting, reading the count, until all CPUs have reached this
> point. When this happens, turn interrupts off, atomic_dec the same
> counter, spin until it hits zero, then read the TSC, re-enable
> interrupts, finish. The TSC reads should all happen very close to
> each other.
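
For anyone who wants to play with the rendezvous quoted above outside
the hypervisor, here is a minimal userspace sketch. It is not the code
behind "xm debug-key t". Assumptions: pthreads stand in for CPUs,
clock_gettime(CLOCK_MONOTONIC) stands in for the TSC read, and
interrupts cannot be masked from userspace, so that step is only marked
by comments. All names (rendezvous_sample, arrived, armed, ...) are
made up for illustration. It also uses a second counter for the release
phase rather than decrementing the first one, because a thread that is
slow to notice the first counter reaching its target could otherwise
see it already decremented and spin forever.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <time.h>

#define MAX_CPUS 64

static atomic_int ncpus;    /* number of participating threads */
static atomic_int arrived;  /* phase 1: threads that have checked in */
static atomic_int armed;    /* phase 2: threads about to read the clock */
static uint64_t stamp_ns[MAX_CPUS];

/* Stand-in for a TSC read (assumption: monotonic clock in nanoseconds). */
static uint64_t read_clock_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

static void *rendezvous(void *arg)
{
    int id = (int)(intptr_t)arg;

    /* Phase 1: check in, then spin until every thread has arrived. */
    atomic_fetch_add(&arrived, 1);
    while (atomic_load(&arrived) < atomic_load(&ncpus))
        ;

    /* A hypervisor would turn interrupts off here. */

    /* Phase 2: second gather, so all reads start as close as possible. */
    atomic_fetch_add(&armed, 1);
    while (atomic_load(&armed) < atomic_load(&ncpus))
        ;

    stamp_ns[id] = read_clock_ns();

    /* ...and re-enable interrupts here. */
    return NULL;
}

/* Run one rendezvous with n threads; out[i] receives thread i's stamp.
 * Returns 0 on success, -1 on bad arguments or thread creation failure. */
int rendezvous_sample(int n, uint64_t *out)
{
    pthread_t tid[MAX_CPUS];
    int created = 0;

    if (n < 1 || n > MAX_CPUS)
        return -1;
    atomic_store(&arrived, 0);
    atomic_store(&armed, 0);
    atomic_store(&ncpus, n);

    while (created < n &&
           pthread_create(&tid[created], NULL, rendezvous,
                          (void *)(intptr_t)created) == 0)
        created++;
    if (created < n)
        atomic_store(&ncpus, created);  /* let the started threads finish */

    for (int i = 0; i < created; i++)
        pthread_join(tid[i], NULL);
    if (created < n)
        return -1;
    for (int i = 0; i < n; i++)
        out[i] = stamp_ns[i];
    return 0;
}
```

The spread (max minus min) of the returned stamps gives a rough upper
bound on how far apart the reads landed. In userspace that number also
includes scheduler noise, so it says little about real TSC skew.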

The code invoked by "xm debug-key t" does exactly that, and I've been
using it (as one way) to measure skew.  Any idea how expensive it is?
Is it too expensive to do once per second?  If it's no more expensive
than the (1Hz per processor) local_time_calibration(), perhaps we
should just use it to set the TSC on all processors once per second and
dispense with the existing platform-timer-interpolated-by-TSC approach
(beautiful, but it adds one more frequency to resonate with)?

On the other hand, I'll bet the bigger the system, the more difficult
it is to rendezvous all the CPUs... and the more natural skew there
will be between the sockets.
 
> The only thing that could mess this up would be NMIs or SMIs. You
> could at least detect that by reading the TSC after all CPUs have
> incremented the counter, and check that only a "reasonable" amount of
> time had elapsed. If not, set a flag to indicate that a recalibration
> is required (you'd need to add another gather loop to enable all CPUs
> to vote on whether they're happy).
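
The detect-and-vote step quoted above might look roughly like the
sketch below. This is only an illustration: struct calib_round,
cast_vote and round_accepted are hypothetical names, timestamps are
plain 64-bit counts in whatever unit the caller uses, and the choice of
a "reasonable" limit is left to the caller. Each CPU compares its own
clock read against the time it saw the gather complete, objects if the
gap is too large, and then waits in one more gather loop so that
everyone sees the final verdict.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* One calibration attempt shared by all CPUs (hypothetical layout). */
struct calib_round {
    atomic_int  votes_cast;  /* gather counter for the voting phase */
    atomic_bool redo;        /* set by any CPU that saw a suspicious gap */
};

static void round_init(struct calib_round *r)
{
    atomic_store(&r->votes_cast, 0);
    atomic_store(&r->redo, false);
}

/* Called by each CPU after its clock read.
 * t_gathered: time at which this CPU saw all CPUs check in.
 * t_read:     time of the calibration read itself.
 * limit:      largest gap still considered "reasonable". */
static void cast_vote(struct calib_round *r, uint64_t t_gathered,
                      uint64_t t_read, uint64_t limit)
{
    /* An NMI or SMI between the two reads shows up as a large gap. */
    if (t_read - t_gathered > limit)
        atomic_store(&r->redo, true);
    /* The objection is stored before the vote is counted, so any CPU
     * that sees all votes also sees every objection. */
    atomic_fetch_add(&r->votes_cast, 1);
}

/* Extra gather loop: wait for all votes, then report the verdict.
 * Returns true if no CPU asked for a recalibration. */
static bool round_accepted(struct calib_round *r, int n_cpus)
{
    while (atomic_load(&r->votes_cast) < n_cpus)
        ;
    return !atomic_load(&r->redo);
}
```

If round_accepted() returns false, every CPU would discard its sample
and the whole rendezvous would be repeated.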

I think I've seen this code in recent Linux.

But assuming we stay with the existing approach, I'm not sure
the processors need to be calibrated at "exactly" the same time,
just "close".  Something similar to "round jiffies" (see
http://lkml.org/lkml/2006/10/10/189) may be enough... though
I guess that depends on the character of the timesource jitter.
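
To make the "close, not exact" idea concrete, here is a tiny sketch in
the spirit of that rounding trick. It is not the Linux implementation.
next_aligned_deadline is a made-up helper, and times are assumed to be
nanoseconds on a clock all CPUs share. Each CPU rounds its next
calibration deadline up to the next whole period boundary, so
calibrations that would otherwise be scattered across the second all
land on the same boundary.

```c
#include <stdint.h>

/* Return the first multiple of period_ns that is strictly after now_ns.
 * Assumes period_ns > 0. */
static uint64_t next_aligned_deadline(uint64_t now_ns, uint64_t period_ns)
{
    return (now_ns / period_ns + 1) * period_ns;
}
```

For example, with a one-second period, a CPU asking at 1.2s and another
asking at 1.9s would both be given a deadline of 2.0s. How close the
actual calibrations end up then depends on timer delivery latency on
each CPU.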

Thanks,
Dan




 

