|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v2 RFC] x86/time: avoid early uses of NOW() to return zero
On 14.05.2026 17:56, Roger Pau Monné wrote:
> On Wed, May 13, 2026 at 08:44:46AM +0200, Jan Beulich wrote:
>> Waiting loops like the one in flush_command_buffer() will degenerate to
>> infinite ones when used early enough for NOW() to still return constant
>> zero. Make sure the returned value at least monotonically increases. When
>> available, use nominal frequency values as initial approximation.
>>
>> Do this only in get_s_time(), as producing a sane value in
>> get_s_time_fixed() for non-zero inputs won't be reasonably possible.
>> Put an assertion there.
>>
>> Reported-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
>> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
>> ---
>> RFC: This breaks at least the TSM_BOOT case printk_start_of_line(), which
>> checks for NOW() returning 0 (falling back to TSM_RAW in this case).
>> For now I have no idea how to avoid this; perhaps that's tolerable at
>> least in the case where we put in place an early estimate? Should we
>> maybe weaken the fallback condition to take effect for any value
>> below 1μs?
>
> Maybe it's fine to print cycles unconditionally until we reach
> SYS_STATE_smp_boot when we know the per-cpu scale is correctly set?
I remain of the opinion (as said in reply to your similar v1 comment) that
this isn't very desirable. Tying to SYS_STATE_smp_boot also would feel
pretty arbitrary. Other ports may have NOW() properly working much earlier.
If anything we may want to add a global indicator of NOW() properly working.
>> RFC: While generally the mentioned waiting loops will take longer to time
>> out, on a very fast CPU tight loops may time out too early.
>>
>> RFC: For the AMD/Hygon case, if the "nominal" value isn't available, we
>> could use the "high" one. That would cause NOW() to run too slowly
>> (until the scale is properly set), but maybe that's still better than
>> it returning 0? (As it stands, I can't really test the new code
>> there, as my Rome system only supplies the lo/hi pair of values.)
>
> Using the "high" frequency would seem fine to me.
Okay, will do then for v3.
Related aspect: With these family/model specific additions for AMD, we could
also separate out intel_log_freq()'s model specific part, to leverage from
here as well.
>> @@ -2623,6 +2640,21 @@ int __init init_xen_time(void)
>> return 0;
>> }
>>
>> +/* BSP-only function to pre-set an approximate TSC scale. */
>> +void __init preset_tsc_scale(unsigned long freq)
>> +{
>> + struct cpu_time *t = &this_cpu(cpu_time);
>> +
>> + /*
>> + * The incoming frequency is only approximate (nominal). Increase it by
>> + * 1% to make NOW() output rather a little too slow than too fast, thus
>> + * avoiding a possible backwards jump once the final scale is set.
>> + */
>> + freq += DIV_ROUND_UP(freq, 100);
>
> To avoid such possible jump backwards, won't it safer to also update
> the ->local_stime and ->local_tsc fields at the time the new scale is
> set? Updatign those ahead of setting the new scale should avoid any
> backward jumps.
->stamp.local_tsc does get updated; you merely dropped that line from reply
context. As to local_stime - how could we possibly set that, when we didn't
get through init_platform_timer() yet? Leaving it at 0 is the correct
match for setting local_tsc to boot_tsc_stamp.
Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |