[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 RFC] x86/time: avoid early uses of NOW() to return zero


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Fri, 15 May 2026 15:12:53 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=citrix.com; dmarc=pass action=none header.from=citrix.com; dkim=pass header.d=citrix.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=dbeEDFUBqb23n3nj9F6l/iKheCEK7+Rekln3WihDiF4=; b=ZVCAbbLA9km6K4lXvMH2BnkW9kO6UFKhhpF0rAdvTgBdY5gVaSKCn51QS9dkVaTs4FKbACnF+OxNhSQMOCOaKGcER8hQ9yvkWfKW2I8jiJWcm/8I7Z8EvY70bDoGm9y+wSLhhAnYhUgXuzRUN+jXWIqOY+XiBF34Cmv0cD0Xk9RxSICR/YoY22FyD13EP8qRLsWXYWdF6mAFODQbGXYxoljVftfC+j4gSAcXu+yc4T7AC7fZVwe5OsGmar/NLf7Py61mBi8eJfgUOvVpYI0mkSxZaoOvVRb7zCBbAN9VD1CJ54/xLAjhH/7BTbRBDB4MHn9eyUO71xDGV2iwO1Is9A==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=U18Y2Gqd11I+WoVgeT9i3f7umP6zSyCxkypjP0e1uEwSP5q3hG/LOhUZWN6NOCqRVBNHbVpC/2IxERpj7or8UaMqhLJbVNR8Ow9BWY9NWSzD4HcgLpsz5co8gK02KF/GKw6ZEsb0q7FRpCFr70tyi4iEUmQuWANmOA1hcNg5g8p9ynFX0EWK1IgcgZwhJisU/PN3WngU4tv2cv1mw9ZIxB7WyujlsblNbaUB3sdoSjk9yVmnHMFtcHlUmQzY3jhIA+gPFRJOV2x5uropPhXbeeqpCQDFS034+5xdRHtE566TPITYr/kDTYgvN0GZisjj9O25w+5EU5pZbkR71IzJeA==
  • Authentication-results: eu.smtp.expurgate.cloud; dkim=pass header.s=selector1 header.d=citrix.com header.i="@citrix.com" header.h="From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck"
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=citrix.com;
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Teddy Astie <teddy.astie@xxxxxxxxxx>
  • Delivery-date: Fri, 15 May 2026 13:13:10 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Fri, May 15, 2026 at 09:15:40AM +0200, Jan Beulich wrote:
> On 14.05.2026 17:56, Roger Pau Monné wrote:
> > On Wed, May 13, 2026 at 08:44:46AM +0200, Jan Beulich wrote:
> >> Waiting loops like the one in flush_command_buffer() will degenerate to
> >> infinite ones when used early enough for NOW() to still return constant
> >> zero. Make sure the returned value at least monotonically increases. When
> >> available, use nominal frequency values as initial approximation.
> >>
> >> Do this only in get_s_time(), as producing a sane value in
> >> get_s_time_fixed() for non-zero inputs won't be reasonably possible.
> >> Put an assertion there.
> >>
> >> Reported-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> >> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> >> ---
> >> RFC: This breaks at least the TSM_BOOT case printk_start_of_line(), which
> >>      checks for NOW() returning 0 (falling back to TSM_RAW in this case).
> >>      For now I have no idea how to avoid this; perhaps that's tolerable at
> >>      least in the case where we put in place an early estimate? Should we
> >>      maybe weaken the fallback condition to take effect for any value
> >>      below 1μs?
> > 
> > Maybe it's fine to print cycles unconditionally until we reach
> > SYS_STATE_smp_boot when we know the per-cpu scale is correctly set?
> 
> I remain of the opinion (as said in reply to your similar v1 comment) that
> this isn't very desirable. Tying to SYS_STATE_smp_boot also would feel
> pretty arbitrary. Other ports may have NOW() properly working much earlier.
> If anything we may want to add a global indicator of NOW() properly working.

I would be fine with such indicator.

> >> RFC: While generally the mentioned waiting loops will take longer to time
> >>      out, on a very fast CPU tight loops may time out too early.
> >>
> >> RFC: For the AMD/Hygon case, if the "nominal" value isn't available, we
> >>      could use the "high" one. That would cause NOW() to run too slowly
> >>      (until the scale is properly set), but maybe that's still better than
> >>      it returning 0? (As it stands, I can't really test the new code
> >>      there, as my Rome system only supplies the lo/hi pair of values.)
> > 
> > Using the "high" frequency would seem fine to me.
> 
> Okay, will do then for v3.
> 
> Related aspect: With these family/model specific additions for AMD, we could
> also separate out intel_log_freq()'s model specific part, to leverage from
> here as well.

Hm, yes, that would reduce the duplication of the added logic.

> >> @@ -2623,6 +2640,21 @@ int __init init_xen_time(void)
> >>      return 0;
> >>  }
> >>  
> >> +/* BSP-only function to pre-set an approximate TSC scale. */
> >> +void __init preset_tsc_scale(unsigned long freq)
> >> +{
> >> +    struct cpu_time *t = &this_cpu(cpu_time);
> >> +
> >> +    /*
> >> +     * The incoming frequency is only approximate (nominal).  Increase it 
> >> by
> >> +     * 1% to make NOW() output rather a little too slow than too fast, 
> >> thus
> >> +     * avoiding a possible backwards jump once the final scale is set.
> >> +     */
> >> +    freq += DIV_ROUND_UP(freq, 100);
> > 
> > To avoid such possible jump backwards, won't it safer to also update
> > the ->local_stime and ->local_tsc fields at the time the new scale is
> > set?  Updatign those ahead of setting the new scale should avoid any
> > backward jumps.
> 
> ->stamp.local_tsc does get updated; you merely dropped that line from reply
> context. As to local_stime - how could we possibly set that, when we didn't
> get through init_platform_timer() yet? Leaving it at 0 is the correct
> match for setting local_tsc to boot_tsc_stamp.

Please bear with me, maybe I'm not understanding exactly to what the
code comment refers to as "possible backwards jump once the final
scale is set".  I assume you refer to the setting of scale
early_time_init()?  The ->stamp.local_tsc value also gets updated at
that point, so it's not possible for the timer going backwards?

This changed with the addition of the init_percpu_time() call in
early_time_init(), and makes the setting of "t->stamp.local_tsc =
boot_tsc_stamp" pointless, as it will get overwritten by the logic in
init_percpu_time() a couple of lines after?

Thanks, _Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.