Re: [PATCH v2 2/3] x86/time: adjust time recording time_calibration_tsc_rendezvous()
On Mon, Feb 08, 2021 at 12:50:09PM +0100, Jan Beulich wrote:
> On 08.02.2021 12:05, Roger Pau Monné wrote:
> > On Mon, Feb 08, 2021 at 11:56:01AM +0100, Jan Beulich wrote:
> >> On 05.02.2021 17:15, Roger Pau Monné wrote:
> >>> I've been thinking this all seems doomed when Xen runs in a virtualized
> >>> environment, and should likely be disabled. There's no point on trying
> >>> to sync the TSC over multiple vCPUs as the scheduling delay between
> >>> them will likely skew any calculations.
> >>
> >> We may want to consider to force the equivalent of
> >> "clocksource=tsc" in that case. Otoh a well behaved hypervisor
> >> underneath shouldn't lead to us finding a need to clear
> >> TSC_RELIABLE, at which point this logic wouldn't get engaged
> >> in the first place.
> >
> > I got the impression that on a loaded system guests with a non-trivial
> > amount of vCPUs might be in trouble to be able to schedule them all
> > close enough for the rendezvous to not report a big skew, and thus
> > disable TSC_RELIABLE?
>
> No, check_tsc_warp() / tsc_check_reliability() don't have a
> problem there. Every CPU reads the shared "most advanced"
> stamp before reading its local one. So it doesn't matter how
> large the gaps are here. In fact the possible bad effect is
> the other way around here - if the scheduling effects are
> too heavy, we may mistakenly consider TSCs reliable when
> they aren't.
>
> A problem of the kind you describe exists in the actual
> rendezvous function. And actually any problem of this kind
> can, on a smaller scale, already be observed with SMT,
> because the individual hyperthreads of a core can't
> possibly all run at the same time.

Indeed, I confused tsc_check_reliability() with the actual rendezvous
function. So IMO the adjustments done by the rendezvous are likely
pointless when running virtualized, because all the vCPUs are unlikely
to be scheduled at once to execute the rendezvous.

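
To illustrate the point about check_tsc_warp(), here is a minimal single-threaded model of the "read the shared stamp, then the local one" check. It is a sketch under assumptions, not Xen's code: `struct warp_state`, `observe()` and `warp_seen()` are hypothetical names, the local TSC value is passed in rather than read from hardware, and the locking a real implementation needs around the shared stamp is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical shared state: the most advanced stamp seen so far. */
struct warp_state {
    uint64_t last_tsc;
    uint64_t max_warp;
};

/*
 * One observation by one CPU. The shared stamp is read first, then the
 * local counter (passed in here as local_tsc). A warp is recorded only
 * if the local value is behind a stamp that was published earlier.
 */
static void observe(struct warp_state *s, uint64_t local_tsc)
{
    assert(s != NULL);

    uint64_t prev = s->last_tsc;  /* shared "most advanced" stamp */
    uint64_t now = local_tsc;     /* this CPU's own counter */

    s->last_tsc = now;
    if (prev > now && prev - now > s->max_warp)
        s->max_warp = prev - now;
}

/*
 * CPU B's counter lags CPU A's by 'skew' ticks, and B makes its
 * observation 'delay' ticks of real time after A did.
 */
static uint64_t warp_seen(uint64_t skew, uint64_t delay)
{
    struct warp_state s = { 0, 0 };
    uint64_t t = 1000;              /* arbitrary starting time */

    observe(&s, t);                 /* CPU A */
    observe(&s, t + delay - skew);  /* CPU B, later, lagging counter */
    return s.max_warp;
}
```

In this model a large gap between the two observations never produces a false warp (`warp_seen(0, 1000000)` is 0). A gap larger than the skew does hide a real one: `warp_seen(100, 50)` reports 50, while `warp_seen(100, 500)` reports 0. That is the "mistakenly consider TSCs reliable" direction described in the quoted reply.
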
> As occurs to me only now, I think we can improve accuracy
> some (in particular on big systems) by making sure
> struct calibration_rendezvous's master_tsc_stamp is not
> sharing its cache line with semaphore and master_stime. The
> latter get written by (at least) the BSP, while
> master_tsc_stamp is stable after the 2nd loop iteration.
> Hence on the 3rd and 4th iterations we could even prefetch
> it to reduce the delay on the last one.

Seems like a possibility indeed.

Thanks, Roger.
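
The cache-line separation suggested in the quoted text could look roughly like the following. This is a sketch under assumptions, not the actual Xen definition: `struct rendezvous_sketch`, the 64-byte `CACHE_LINE` value, and both helper functions are hypothetical, the field types are stand-ins, and `__builtin_prefetch` is a GCC/Clang builtin rather than anything the patch is known to use.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64  /* assumed line size; real hardware may differ */

/* Hypothetical stand-in for struct calibration_rendezvous. */
struct rendezvous_sketch {
    /* Written repeatedly while the rendezvous is in progress. */
    alignas(CACHE_LINE) atomic_int semaphore;
    int64_t master_stime;

    /* Stable after the 2nd iteration, so kept on its own line. */
    alignas(CACHE_LINE) uint64_t master_tsc_stamp;
};

/* Index of the cache line (relative to the struct start) of an offset. */
static inline size_t cache_line_of(size_t offset)
{
    return offset / CACHE_LINE;
}

/* Compile-time check that the stable field has a line to itself. */
static_assert(offsetof(struct rendezvous_sketch, master_tsc_stamp) /
              CACHE_LINE !=
              offsetof(struct rendezvous_sketch, master_stime) /
              CACHE_LINE,
              "master_tsc_stamp shares a line with master_stime");

/* Could be called on the 3rd and 4th iterations: read-only prefetch. */
static inline void prefetch_master_stamp(const struct rendezvous_sketch *r)
{
    __builtin_prefetch(&r->master_tsc_stamp, 0 /* read */, 3 /* keep */);
}
```

With this layout, writes to `semaphore` and `master_stime` should no longer invalidate the line holding `master_tsc_stamp` on the other CPUs, at the cost of a larger structure.
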