|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v2 RFC] x86/time: avoid early uses of NOW() to return zero
On Fri, May 15, 2026 at 09:15:40AM +0200, Jan Beulich wrote:
> On 14.05.2026 17:56, Roger Pau Monné wrote:
> > On Wed, May 13, 2026 at 08:44:46AM +0200, Jan Beulich wrote:
> >> Waiting loops like the one in flush_command_buffer() will degenerate to
> >> infinite ones when used early enough for NOW() to still return constant
> >> zero. Make sure the returned value at least monotonically increases. When
> >> available, use nominal frequency values as initial approximation.
> >>
> >> Do this only in get_s_time(), as producing a sane value in
> >> get_s_time_fixed() for non-zero inputs won't be reasonably possible.
> >> Put an assertion there.
> >>
> >> Reported-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> >> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> >> ---
> >> RFC: This breaks at least the TSM_BOOT case printk_start_of_line(), which
> >> checks for NOW() returning 0 (falling back to TSM_RAW in this case).
> >> For now I have no idea how to avoid this; perhaps that's tolerable at
> >> least in the case where we put in place an early estimate? Should we
> >> maybe weaken the fallback condition to take effect for any value
> >> below 1μs?
> >
> > Maybe it's fine to print cycles unconditionally until we reach
> > SYS_STATE_smp_boot when we know the per-cpu scale is correctly set?
>
> I remain of the opinion (as said in reply to your similar v1 comment) that
> this isn't very desirable. Tying to SYS_STATE_smp_boot also would feel
> pretty arbitrary. Other ports may have NOW() properly working much earlier.
> If anything we may want to add a global indicator of NOW() properly working.
I would be fine with such indicator.
> >> RFC: While generally the mentioned waiting loops will take longer to time
> >> out, on a very fast CPU tight loops may time out too early.
> >>
> >> RFC: For the AMD/Hygon case, if the "nominal" value isn't available, we
> >> could use the "high" one. That would cause NOW() to run too slowly
> >> (until the scale is properly set), but maybe that's still better than
> >> it returning 0? (As it stands, I can't really test the new code
> >> there, as my Rome system only supplies the lo/hi pair of values.)
> >
> > Using the "high" frequency would seem fine to me.
>
> Okay, will do then for v3.
>
> Related aspect: With these family/model specific additions for AMD, we could
> also separate out intel_log_freq()'s model specific part, to leverage from
> here as well.
Hm, yes, that would reduce the duplication of the added logic.
> >> @@ -2623,6 +2640,21 @@ int __init init_xen_time(void)
> >> return 0;
> >> }
> >>
> >> +/* BSP-only function to pre-set an approximate TSC scale. */
> >> +void __init preset_tsc_scale(unsigned long freq)
> >> +{
> >> + struct cpu_time *t = &this_cpu(cpu_time);
> >> +
> >> + /*
> >> + * The incoming frequency is only approximate (nominal). Increase it
> >> by
> >> + * 1% to make NOW() output rather a little too slow than too fast,
> >> thus
> >> + * avoiding a possible backwards jump once the final scale is set.
> >> + */
> >> + freq += DIV_ROUND_UP(freq, 100);
> >
> > To avoid such possible jump backwards, won't it safer to also update
> > the ->local_stime and ->local_tsc fields at the time the new scale is
> > set? Updatign those ahead of setting the new scale should avoid any
> > backward jumps.
>
> ->stamp.local_tsc does get updated; you merely dropped that line from reply
> context. As to local_stime - how could we possibly set that, when we didn't
> get through init_platform_timer() yet? Leaving it at 0 is the correct
> match for setting local_tsc to boot_tsc_stamp.
Please bear with me, maybe I'm not understanding exactly to what the
code comment refers to as "possible backwards jump once the final
scale is set". I assume you refer to the setting of scale
early_time_init()? The ->stamp.local_tsc value also gets updated at
that point, so it's not possible for the timer going backwards?
This changed with the addition of the init_percpu_time() call in
early_time_init(), and makes the setting of "t->stamp.local_tsc =
boot_tsc_stamp" pointless, as it will get overwritten by the logic in
init_percpu_time() a couple of lines after?
Thanks, _Roger.
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |