Re: [PATCH v2 3/3] x86/time: don't move TSC backwards in time_calibration_tsc_rendezvous()
On 08.02.2021 10:38, Roger Pau Monné wrote:
> On Mon, Feb 01, 2021 at 01:43:28PM +0100, Jan Beulich wrote:
>> ---
>> Since CPU0 reads its TSC last on the first iteration, if TSCs were
>> perfectly sync-ed there shouldn't ever be a need to update. However,
>> even on the TSC-reliable system I first tested this on (using
>> "tsc=skewed" to get this rendezvous function into use in the first
>> place) updates by up to several thousand clocks did happen. I wonder
>> whether this points at some problem with the approach that I'm not (yet)
>> seeing.
>
> I'm confused by this, so on a system that had reliable TSCs, which
> you forced to remove the reliable flag, and then you saw big
> differences when doing the rendezvous?
>
> That would seem to indicate that such system doesn't really have
> reliable TSCs?
I don't think so, no. This can easily be a timing effect from the
heavy cache line bouncing involved here.
What worries me about seeing these updates is that I might still
be moving TSCs backwards in ways observable to the rest of the
system (i.e. beyond the inherent property of the approach), with
this then getting corrected by a subsequent rendezvous. But as
said, I can't see what this could result from, and hence I'm
inclined to assume these are merely effects I've not found a
good explanation for so far.
>> Considering the sufficiently modern CPU it's using, I suspect the
>> reporter's system wouldn't even need to turn off TSC_RELIABLE, if only
>> there wasn't the boot time skew. Hence another approach might be to fix
>> this boot time skew. Of course to recognize whether the TSCs then still
>> aren't in sync we'd need to run tsc_check_reliability() sufficiently
>> long after that adjustment. Which is besides the need to have this
>> "fixing" be precise enough for the TSCs to not look skewed anymore
>> afterwards.
>
> Maybe it would make sense to do a TSC counter sync after APs are up
> and then disable the rendezvous if the next calibration rendezvous
> shows no skew?
Yes, that's what I was hinting at with the above. For the next
rendezvous to not observe any skew, our adjustment would need to
be far more precise than it is today, though.
> I also wonder, we test for skew just after the APs have been booted,
> and decide at that point whether we need a calibration rendezvous.
>
> Maybe we could do a TSC sync just after APs are up (to hopefully bring
> them in sync), and then do the tsc_check_reliability just before Xen
> ends booting (ie: before handing control to dom0?)
>
> What we do right now (ie: do the tsc_check_reliability so early) is
> also likely to miss small skews that will only show up after APs have
> been running for a while?
The APs' TSCs will have been running for about as long as the
BSP's, as INIT does not affect them (and in fact they ought to
be running for _exactly_ as long, or else tsc_check_reliability()
would end up turning off TSC_RELIABLE). So I expect skews to be
large enough at this point to be recognizable.
>> @@ -1712,6 +1720,16 @@ static void time_calibration_tsc_rendezv
>> while ( atomic_read(&r->semaphore) < total_cpus )
>> cpu_relax();
>>
>> + if ( tsc == 0 )
>> + {
>> + uint64_t cur;
>> +
>> + tsc = rdtsc_ordered();
>> + while ( tsc > (cur = r->max_tsc_stamp) )
>> + if ( cmpxchg(&r->max_tsc_stamp, cur, tsc) == cur )
>> + break;
>
> I think you could avoid reading cur explicitly for each loop and
> instead do?
>
> cur = ACCESS_ONCE(r->max_tsc_stamp);
> while ( tsc > cur )
> cur = cmpxchg(&r->max_tsc_stamp, cur, tsc);
Ah yes. I tried something similar, but not quite the same,
and it looked wrong, so I gave up re-arranging.
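For illustration only, the "raise a shared maximum" loop discussed above can be sketched with C11 atomics instead of Xen's cmpxchg(). This is a standalone sketch, not the patch's code: atomic_max_u64() is a made-up name, and unlike cmpxchg(), which returns the old value, atomic_compare_exchange_weak() writes the currently stored value back into "cur" when it fails, so no explicit re-read is needed either.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a Xen function): ensure *max ends up >= val.
 * Each failed compare-exchange refreshes "cur" with the value currently
 * stored, so the loop ends either when our store succeeds or when another
 * CPU has already published a stamp at least as large as ours.
 */
void atomic_max_u64(_Atomic uint64_t *max, uint64_t val)
{
    uint64_t cur;

    assert(max != NULL);

    cur = atomic_load_explicit(max, memory_order_relaxed);
    while (val > cur) {
        if (atomic_compare_exchange_weak(max, &cur, val))
            break;
    }
}
```

Each CPU would call this once with its own TSC reading; whatever the interleaving, the shared variable then holds the largest value any of them passed in.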
>> @@ -1719,9 +1737,12 @@ static void time_calibration_tsc_rendezv
>> while ( atomic_read(&r->semaphore) > total_cpus )
>> cpu_relax();
>> }
>> +
>> + /* Just in case a read above ended up reading zero. */
>> + tsc += !tsc;
>
> Won't that be worthy of an ASSERT_UNREACHABLE? I'm not sure I see how
> tsc could be 0 on a healthy system after the loop above.
It's not forbidden for the firmware to set the TSCs to some
huge negative value. Considering the effect TSC_ADJUST has on
the actual value read by RDTSC, I think I did actually observe
a system coming up this way, because of (not very helpful)
TSC_ADJUST setting by firmware. So no, no ASSERT_UNREACHABLE()
here.
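To make the zero case concrete, here is a simplified model, assuming the value seen by RDTSC behaves like a free-running count plus the TSC_ADJUST offset, modulo 2^64. Both function names are made up for this sketch and the model ignores everything else that can influence the TSC; it only shows why a real read can land on 0 and how "tsc += !tsc" keeps 0 usable as the "not read yet" marker.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model (assumption, not Xen code): the observed TSC is the
 * raw count plus a signed adjustment, with unsigned wrap-around.
 */
uint64_t observed_tsc(uint64_t raw, int64_t tsc_adjust)
{
    return raw + (uint64_t)tsc_adjust;
}

/*
 * Same trick as "tsc += !tsc" in the patch: !tsc is 1 only when tsc is 0,
 * so a stamp of 0 becomes 1 and every other value is left unchanged.
 */
uint64_t avoid_zero_stamp(uint64_t tsc)
{
    tsc += !tsc;
    assert(tsc != 0);
    return tsc;
}
```

With a large negative adjustment the observed value starts out close to 2^64 and wraps through 0 as the count advances, so hitting exactly 0 is unlikely but legitimate; being off by one clock in that single case is harmless compared to misreading the stamp as "not set".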
Jan