
RE: [Xen-devel] rdtscP and xen (and maybe the app-tsc answer I've been looking for)



> >> Keir, is there any reason that consistent_tscs shouldn't
> >> default to enabled?
>
> There is a question mark over this, since it's not really clear what the
> CONSTANT_TSC feature flag actually means. For example, it is set if
> CPUID:0x80000007:EDX:8 is set, and that flag merely means that this
> particular core's TSC rate is invariant across all Cx/Px/Tx power-saving
> states. It doesn't directly say anything about TSC consistency across cores
> or sockets unless we are prepared to assume a couple of things: primarily
> that all packages run their TSCs at the same rate, and that they are clocked
> from the same mainboard oscillator. Is that reasonable to assume? We at
> least know the latter is not likely to be true for big-iron NUMA systems,
> across NUMA nodes.
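
For reference, the flag test described in the quote above could be sketched
roughly as follows. This is only an illustration, not Xen's or Linux's actual
code: the macro and function names are made up for this example, and the
function takes an EDX value that the caller has already obtained from CPUID
leaf 0x80000007.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names, chosen for this sketch only. */
#define CPUID_APM_LEAF              0x80000007u          /* leaf named in the quote */
#define CPUID_APM_EDX_INVARIANT_TSC (UINT32_C(1) << 8)   /* EDX bit 8 */

/*
 * Returns true if the given EDX value (as returned by CPUID leaf
 * 0x80000007) has bit 8 set. As the quote points out, this only says
 * that this core's TSC rate is invariant across power-saving states;
 * it says nothing by itself about other cores or sockets.
 */
static inline bool edx_reports_invariant_tsc(uint32_t edx)
{
    return (edx & CPUID_APM_EDX_INVARIANT_TSC) != 0;
}
```

With GCC or Clang on x86, the EDX value could be fetched with __get_cpuid()
from <cpuid.h>; that step is left out here so the sketch stays portable.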

Both Intel and AMD have confirmed that constant_tsc means
that TSC is consistent across all cores and even across
multiple sockets; and at least one major system vendor (HP)
with multi-enclosure "big iron" AMD-based NUMA systems has
confirmed that TSC is consistent across all nodes.   So
by applying the Xen rendezvous-sync algorithm (that writes
tsc every second) on such machines, Xen has actually been
creating a tsc-sync problem, not alleviating one!

I've cc'ed key AMD/Intel/HP experts who can confirm or
correct/clarify any mistaken assumptions I might have.

I *think* the case "CPU reports tsc_is_constant but it's not
really constant across all sockets/enclosures/nodes" does
exist, but it may be limited to a few older exceptions such
as IBM Summit systems.  Upstream Linux now assumes that
constant_tsc applies across all CPUs unless the kernel
is compiled with CONFIG_X86_NUMAQ (note NOT CONFIG_X86_NUMA),
so Linux has now embraced constant_tsc.

So I'm thinking we should treat consistent_tscs as the
rule rather than the exception, and place the onus on
"broken" systems to disable consistent_tscs with the
boot option when necessary.  To be extremely safe,
we could also add some code in
time_calibration_std_rendezvous() to check
for "significant" tsc differences and report them (and
maybe even auto-disable consistent_tscs).
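
A rough sketch of what such a check might look like is below. It is not
taken from time_calibration_std_rendezvous(); the function names and the
idea of passing in an array of per-CPU TSC samples are assumptions made for
illustration, and the threshold is left to the caller because what counts
as "significant" is still an open question.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: given one TSC sample per CPU, all assumed to have
 * been read at (nearly) the same instant during a rendezvous, return the
 * spread between the largest and smallest sample. Returns 0 for an
 * empty sample set.
 */
uint64_t tsc_max_skew(const uint64_t *tsc, size_t n)
{
    if (n == 0)
        return 0;

    uint64_t lo = tsc[0], hi = tsc[0];
    for (size_t i = 1; i < n; i++) {
        if (tsc[i] < lo)
            lo = tsc[i];
        if (tsc[i] > hi)
            hi = tsc[i];
    }
    return hi - lo;
}

/*
 * Hypothetical policy check: nonzero if the spread exceeds a
 * caller-supplied threshold (in TSC ticks). A caller could log a
 * warning, or fall back to the rendezvous-sync path, when this fires.
 */
int tsc_skew_is_significant(const uint64_t *tsc, size_t n, uint64_t threshold)
{
    return tsc_max_skew(tsc, n) > threshold;
}
```

The threshold would need to allow for the fact that the samples are never
taken at exactly the same instant, so some small spread is expected even on
a machine whose TSCs are perfectly in sync.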

(One minor correction also:  constant_tsc does NOT
guarantee tsc continues to increment across deep-C-
states... that requires nonstop_tsc.  But Xen already
has the logic to correct deep-C-states in
cstate_restore_tsc().)

Dan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

