Xen project Mailing List

Re: xen 4.14.3 incorrect (~3x) cpu frequency reported

Date: Tue, 11 Jan 2022 08:09:53 +0100

Cc: James Dingwall <james-xen@xxxxxxxxxxxxxx>, alexander.rossa@xxxxxxx, xen-devel@xxxxxxxxxxxxxxxxxxxx, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Andrew Cooper <amc96@xxxxxxxx>

Delivery-date: Tue, 11 Jan 2022 07:10:20 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 11.01.2022 06:32, Juergen Gross wrote:
> On 10.01.22 18:04, Jan Beulich wrote:
>> On 10.01.2022 16:43, Andrew Cooper wrote:
>>> On 10/01/2022 13:11, Jan Beulich wrote:
>>>> On 10.01.2022 13:37, Roger Pau Monné wrote:
>>>>> On Fri, Jan 07, 2022 at 12:39:04PM +0100, Jan Beulich wrote:
>>>>>> @@ -415,16 +416,35 @@ static int64_t __init init_hpet(struct p
>>>>>>
>>>>>>      pts->frequency = hpet_rate;
>>>>>>
>>>>>> +for(i = 0; i < 16; ++i) {//temp
>>>>>>      count = hpet_read32(HPET_COUNTER);
>>>>>>      start = rdtsc_ordered();
>>>>>>      target = count + CALIBRATE_VALUE(hpet_rate);
>>>>>>      if ( target < count )
>>>>>>          while ( hpet_read32(HPET_COUNTER) >= count )
>>>>>>              continue;
>>>>>> -    while ( hpet_read32(HPET_COUNTER) < target )
>>>>>> +    while ( (count = hpet_read32(HPET_COUNTER)) < target )
>>>>>>          continue;
>>>>>>
>>>>>> -    return (rdtsc_ordered() - start) * CALIBRATE_FRAC;
>>>>>> +    expired = rdtsc_ordered() - start;
>>>>>
>>>>> There's also a window between the HPET read and the TSC read where an
>>>>> SMI/NMI could cause a wrong frequency detection.
>>>>
>>>> Right, as said in an earlier reply I did notice this myself (actually
>>>> on the way home on Friday). As also said there, for now I can't see
>>>> any real (i.e. complete) solution to this and the similar instances
>>>> of the issue elsewhere.
>>>
>>> Calibration loops like this cannot be made robust. This is specifically
>>> why frequency information is being made available via architectural
>>> means, and is available via non-architectural means in other cases.
>>>
>>> There's a whole bunch of situations (#STOPCLK, TERM and PSMI off the top
>>> of my head, but I bet HDC will screw with things too) which will mess
>>> with any calibration loop, even if you could disable SMIs. While,
>>> mechanically, we can disable SMIs on AMD with CLGI, we absolutely
>>> shouldn't run a calibration loop like this with SMIs disabled.
>>>
>>>
>>> Much as I hate to suggest it, we should parse the frequency out of the
>>> long CPUID string, and compare it to the calibration time. (This
>>> technique is mandated in the Intel BWG during very early setup.)
>>
>> This, according to some initial checking, might work for Intel CPUs,
>> but apparently won't work for AMD ones (and I don't dare to think of
>> what might happen if we run under, say, qemu). Imo we'd need to deal
>> gracefully with the case that we can't parse the frequency out of
>> that string, with "gracefully" including that our calibration still
>> won't be too far off.
>>
>> Also I wonder if we wouldn't better prefer CPUID leaf 0x15 / 0x16
>> data over parsing extended leaf.
>>
>>> If it is different by a large margin, rerun the calibration, and if it
>>> is still different, complain loudly into the logs. This will fix a
>>> one-off-spurious event, whereas if e.g. the system is thermally
>>> throttling due to a bad heat sink, there is nothing software can do to
>>> recover the system.
>>
>> I was already considering to use reference data like that from CPUID
>> leaves 0x15 / 0x16, but I couldn't really settle on what "large
>> margin" would really want to be. Imo we should try to correct
>> not-just-as-large errors as well, if we can.
>>
>> For the moment I continue to consider my plan (outlined in another
>> reply on this thread) superior to what you suggest, but I'll be
>> looking forward to further replies from you or others.
>
> Didn't Andrew mention that SMIs can be detected to have happened via
> an SMI counter?

Yes, but that's again an Intel-only thing. Plus, as said elsewhere, I
don't think it can be considered sufficient to account for SMI and NMI
alone - there can be any number of other things causing delays.

Jan
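
For illustration only, the cross-check discussed in this thread (compare the calibrated frequency against reference data such as CPUID leaf 0x15, rerun once if they disagree, and flag the result if they still disagree) could be sketched roughly as below. This is not Xen code and not the patch under discussion. The function names, the `calibrate_fn` hook, the `replay` stand-in and the parts-per-million tolerance parameter are all hypothetical, and the sketch takes the leaf 0x15 register values as plain arguments instead of executing CPUID, so that it stays portable. The tolerance value is deliberately left to the caller, since the thread does not settle on what a "large margin" should be.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Nominal TSC frequency in Hz from raw CPUID leaf 0x15 register values:
 * EAX = ratio denominator, EBX = ratio numerator, ECX = crystal clock in Hz.
 * Returns 0 when the leaf does not enumerate enough to compute a frequency.
 */
static uint64_t tsc_hz_from_leaf15(uint32_t eax, uint32_t ebx, uint32_t ecx)
{
    if ( !eax || !ebx || !ecx )
        return 0;

    return (uint64_t)ecx * ebx / eax;
}

/* True if measured deviates from reference by at most tolerance_ppm. */
static bool within_tolerance(uint64_t measured, uint64_t reference,
                             uint32_t tolerance_ppm)
{
    uint64_t diff = measured > reference ? measured - reference
                                         : reference - measured;

    return diff * 1000000 <= reference * tolerance_ppm;
}

/* Hypothetical hook: one calibration pass, returning a frequency in Hz. */
typedef uint64_t (*calibrate_fn)(void *ctx);

/*
 * Run up to max_attempts calibration passes and return the first result
 * that agrees with reference_hz. *verified stays false when no reference
 * was available (reference_hz == 0) or when every attempt disagreed; in
 * the latter case the last measurement is returned and the caller is
 * expected to log a warning.
 */
static uint64_t calibrate_checked(calibrate_fn calibrate, void *ctx,
                                  uint64_t reference_hz,
                                  uint32_t tolerance_ppm,
                                  unsigned int max_attempts, bool *verified)
{
    uint64_t hz = 0;
    unsigned int i;

    assert(max_attempts > 0);
    *verified = false;

    for ( i = 0; i < max_attempts; i++ )
    {
        hz = calibrate(ctx);
        if ( !reference_hz )
            break;              /* nothing to compare against */
        if ( within_tolerance(hz, reference_hz, tolerance_ppm) )
        {
            *verified = true;
            break;
        }
    }

    return hz;
}

/* Stand-in for a real measurement: replays canned results in order. */
struct replay {
    const uint64_t *samples;
    unsigned int next;
};

static uint64_t replay_calibrate(void *ctx)
{
    struct replay *r = ctx;

    return r->samples[r->next++];
}
```

With a 24 MHz crystal and a 176/2 ratio the reference works out to 2112 MHz, so a first pass that reports roughly three times that value would be rejected and a sane second pass accepted. When leaf 0x15 is not usable (the AMD and qemu cases raised above), the sketch simply returns the single unverified measurement, which is why the quality of the calibration loop itself still matters.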
