[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] BUG: NOW() seems to (sometimes) go backwards!



>>> On 08.06.16 at 22:36, <joao.m.martins@xxxxxxxxxx> wrote:
> On 06/08/2016 02:43 PM, Dario Faggioli wrote:
>> The code you're referring to should be this:
>> 
>>     /* If we have constant-rate TSCs then scale factor can be shared. */
>>     if ( boot_cpu_has(X86_FEATURE_CONSTANT_TSC) )
>>     {
>>         /* If TSCs are not marked as 'reliable', re-sync during rendezvous. 
>> */
>>         if ( !boot_cpu_has(X86_FEATURE_TSC_RELIABLE) )
>>             time_calibration_rendezvous_fn = time_calibration_tsc_rendezvous;
>>     }
>> 
>> And I have both X86_FEATURE_CONSTANT_TSC and X86_FEATURE_TSC_RELIABLE.
>> 
>> I've again instrumented the code in order to check whether it is
>> time_calibration_tsc_rendezvous() or time_calibration_std_rendezvous()
>> that is being used, and it's the latter:
>> 
>> (XEN) [    1.795916] TSC reliable. Yay!! Using ffff82d080198362 for 
>> rendevousez
>> 
>> [dario@Solace xen.git] $ objdump -D xen/xen-syms |grep ffff82d080198362
>> ffff82d080198362 <time_calibration_std_rendezvous>:
>> 
>> which looks correct to me.
> Probably one other codepath that you would be interested is on
> local_time_calibration() commenting the chunk on this conditional:
> 
> if ( boot_cpu_has(X86_FEATURE_CONSTANT_TSC )
> {
> ...
> }
> 
> And with that you rarely see time going backwards since the scaling would be
> adjusted to each cpu calibration error - but it would be a (very very close)
> approximation on software not exactly guaranteed as hardware TSC. But trying 
> this out
> would be just for experimentation - doesn't look to be a correct way of 
> addressing this.

Wait - I would think this should make the situation worse, not better.
If we let execution flow past this conditional, ->tsc_scale will get
updated, whereas when the if() path above is taken, all CPUs will
always use the same scale values. If TSC values are monotonic and
the scaling works identically on all CPUs, the scaled values _should_
be monotonic too. Yet they aren't, so some piece is still missing in this
picture.

>>> Yet when the scaling values get set only once at boot, monotonic
>>> (cross-CPU) TSC means monotonic (cross-CPU) returns from NOW().
>>>
>> Yep. And at this point, this is what needs to be verified, I guess...
> I think get_s_time_fixed doesn't guarantee monotonicity across CPUs being it
> different socket or (SMT) siblings. local_tsc_stamp is seeded differently on 
> each CPU
> i.e. rdtsc() right after doing the platform time read on the calibration 
> rendezvous.
> Plus stime_local_stamp is seeded with values taken from platform timer 
> (HPET, ACPI,
> PIT) on local_time_calibration which means that get_s_time isn't solely 
> based on TSC
> and that there will always be a gap between stime_local_stamp and 
> local_tsc_stamp.
> TSC is indeed monotonic on a TSC invariant box, but the delta that is 
> computed
> differs from cpu to cpu according to when time calibration happens on each 
> CPU - thus
> not guaranteeing the desired monotonicity property. Having stime_local_stamp 
> be based
> on the same timestamp that of the local_tsc_stamp plus having a single
> local_tsc_stamp as reference would address this behaviour - see also below.

The quality of get_s_time_fixed() output indeed heavily depends on
t->local_tsc_stamp and t->stime_local_stamp being a properly
matched pair. Yet in local_time_calibration()'s constant-TSC case,
they're a direct copy from the respective cpu_calibration fields. The
main issue I could see here is that on SMT siblings the hardware
switching between the two may introduce arbitrary delays between
them. And with CPU frequency changes, the latency between the
rdtsc() and the execution of get_s_time() finishing could also be
pretty variable. I wonder whether c->local_tsc_stamp wouldn't
better be written with the TSC value used by get_s_time() (or,
which should amount to the same effect, whether we shouldn't
simply call get_s_time_fixed() here with the just sampled TSC value).

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.