
RE: [Xen-devel] [RFC] Correct/fast timestamping in apps under Xen [1 of 4]: Reliable TSC

> On 08/10/2009 10:13, "Tim Deegan" <Tim.Deegan@xxxxxxxxxxxxx> wrote:
> >> So system designers
> >> (other than perhaps for the very largest superNUMA
> >> machines) would be silly to not use it.
> > 
> > Oh, that's reassuring.  System designers would never do 
> > something that silly.  :)

Tongue-in-cheek noted. ;-)  But seriously, what I'm proposing
is that now that this is architected by the processor, poorly
designed systems (or extremely large systems) should be the rare
exception, not the rule.  Specifically I'm proposing that
(at least for Intel... AMD TBD) if the architectural bit is
set Xen should trust it by default, but provide a boot-time
parameter (e.g. "tsc_broken") to override the default for
any rare poorly-designed or superNUMA systems.

> > If linux relies on it, that's a good sign, but surely we shouldn't get
> > rid of any existing correction mechanisms.

Unfortunately, Xen has no existing detection mechanism, so it
also has no existing correction mechanism.  Xen currently
blindly assumes the TSC is wrong and overwrites all TSCs at
boot time, after deep C-state, and at 1Hz if the boot-time
consistent_tscs option is set.

> I think at the very least this new 'reliable tsc' mode must be self
> contained, not impact the existing modes, and continue to be 
> switchable via a boot parameter.

OK, let me suggest the following taxonomy of tsc "safeness":

A) unsafe (neither constant nor power-invariant)
B) semi-safe (constant = P-,T-state invariant, C-state may stop)
C) safe (constant+non-stop = P-,T-,and C-state invariant)
D) false-positive safe (CPUs safe, system-wide is not)
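
To make the taxonomy concrete, here is a minimal sketch in C.  The
enum and function names are hypothetical (they are not taken from the
Xen tree), and the "system_broken" input is an assumption: case D
cannot be detected from the CPU alone, so something outside the CPU
(e.g. the administrator) has to supply it.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; not from the Xen source. */
enum tsc_safeness {
    TSC_UNSAFE,         /* A: neither constant nor power-invariant */
    TSC_SEMI_SAFE,      /* B: constant across P-/T-states, may stop in C-states */
    TSC_SAFE,           /* C: constant and non-stop */
    TSC_FALSE_POSITIVE  /* D: CPUs look safe, the system as a whole is not */
};

/*
 * Map the three properties onto the A-D classes.
 * A TSC that is not constant is treated as unsafe regardless of the
 * other inputs (an assumption of this sketch).
 */
static enum tsc_safeness classify_tsc(bool constant, bool nonstop,
                                      bool system_broken)
{
    if (!constant)
        return TSC_UNSAFE;
    if (!nonstop)
        return TSC_SEMI_SAFE;
    return system_broken ? TSC_FALSE_POSITIVE : TSC_SAFE;
}
```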

Xen currently assumes A.  This is sufficient for Xen's needs,
and for the pvclock algorithm, but insufficient for my
plans to expose "TSC reliability" to usermode.

B (constant) is now determined in Xen by checking family ids
but only used to override consistent_tscs if constant is
NOT set.

C is architecturally-defined by a cpuid bit but Xen doesn't
currently use it.  Intel guarantees TSC invariance across
P-, T-, and C-states when it is set (AMD TBD).
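
For reference, the bit I have in mind is, as far as I know, the
"Invariant TSC" flag: CPUID leaf 0x80000007, EDX bit 8.  A minimal
sketch of the decode follows.  The macro and function names are made
up for illustration, and the EDX value is passed in rather than read
from the CPU so that the snippet stays portable; on x86 with GCC or
Clang it could be obtained with __get_cpuid() from <cpuid.h>.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical macro names; the leaf and bit position are the
 * architectural "Invariant TSC" flag. */
#define CPUID_LEAF_ADV_POWER_MGMT 0x80000007u
#define CPUID_EDX_INVARIANT_TSC   (UINT32_C(1) << 8)

/* edx: the EDX register returned by CPUID for the leaf above. */
static bool edx_reports_invariant_tsc(uint32_t edx)
{
    return (edx & CPUID_EDX_INVARIANT_TSC) != 0;
}
```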

I'm proposing that:
1) for case C, Xen shall never overwrite TSC
2) for case D, a new "tsc_broken" boot option must be specified
   when Xen is booted on a broken machine
3) for case B, always use it when the hardware supports it
   (unless overridden by "tsc_broken")
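
Points 1 and 2 boil down to a single decision, roughly as sketched
below.  The struct and function are hypothetical, and "tsc_broken" is
only the option name proposed in this mail, not an existing Xen
parameter.  Case B is left out on purpose, since its exact handling
is not yet pinned down.

```c
#include <stdbool.h>

/* Hypothetical inputs for illustration; not from the Xen source. */
struct tsc_policy_input {
    bool invariant_bit_set; /* CPU advertises constant + non-stop TSC (case C) */
    bool opt_tsc_broken;    /* booted with the proposed "tsc_broken" option (case D) */
};

/*
 * Returns true when Xen should trust the hardware and never
 * overwrite the TSC; false means the override applies or the
 * hardware makes no such promise.
 */
static bool tsc_is_trusted(const struct tsc_policy_input *in)
{
    return in->invariant_bit_set && !in->opt_tsc_broken;
}
```

So a machine that sets the bit is trusted by default, and the
administrator of a broken or superNUMA box opts out explicitly.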

We are also investigating whether the write_tsc() in
the C-state recovery code obviates the need for the
write_tsc() in time_calibration_tsc_rendezvous().

