| 
    
 [Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: xen 4.14.3 incorrect (~3x) cpu frequency reported
 On 10.01.2022 18:04, Jan Beulich wrote:
> On 10.01.2022 16:43, Andrew Cooper wrote:
>> On 10/01/2022 13:11, Jan Beulich wrote:
>>> On 10.01.2022 13:37, Roger Pau Monné wrote:
>>>> On Fri, Jan 07, 2022 at 12:39:04PM +0100, Jan Beulich wrote:
>>>>> @@ -415,16 +416,35 @@ static int64_t __init init_hpet(struct p
>>>>>  
>>>>>      pts->frequency = hpet_rate;
>>>>>  
>>>>> +for(i = 0; i < 16; ++i) {//temp
>>>>>      count = hpet_read32(HPET_COUNTER);
>>>>>      start = rdtsc_ordered();
>>>>>      target = count + CALIBRATE_VALUE(hpet_rate);
>>>>>      if ( target < count )
>>>>>          while ( hpet_read32(HPET_COUNTER) >= count )
>>>>>              continue;
>>>>> -    while ( hpet_read32(HPET_COUNTER) < target )
>>>>> +    while ( (count = hpet_read32(HPET_COUNTER)) < target )
>>>>>          continue;
>>>>>  
>>>>> -    return (rdtsc_ordered() - start) * CALIBRATE_FRAC;
>>>>> +    expired = rdtsc_ordered() - start;
>>>> There's also a window between the HPET read and the TSC read where an
>>>> SMI/NMI could cause a wrong frequency detection.
>>> Right, as said in an earlier reply I did notice this myself (actually
>>> on the way home on Friday). As also said there, for now I can't see
>>> any real (i.e. complete) solution to this and the similar instances
>>> of the issue elsewhere.
>>
>> Calibration loops like this cannot be made robust.  This is specifically
>> why frequency information is being made available via architectural
>> means, and is available via non-architectural means in other cases.
>>
>> There's a whole bunch of situations (#STOPCLK, TERM and PSMI off the top
>> of my head, but I bet HDC will screw with things too) which will mess
>> with any calibration loop, even if you could disable SMIs.  While,
>> mechanically, we can disable SMIs on AMD with CLGI, we absolutely
>> shouldn't run a calibration loop like this with SMIs disabled.
>>
>>
>> Much as I hate to suggest it, we should parse the frequency out of the
>> long CPUID string, and compare it to the calibration time.  (This
>> technique is mandated in the Intel BWG during very early setup.)
> 
> This, according to some initial checking, might work for Intel CPUs,
> but apparently won't work for AMD ones (and I don't dare to think of
> what might happen if we run under, say, qemu). Imo we'd need to deal
> gracefully with the case that we can't parse the frequency out of
> that string, with "gracefully" including that our calibration still
> won't be too far off.
May I further (re-)emphasize the fact that the issue exists not only
in the calibration loops. There we wouldn't have a result that could
(in principle) be compared against some parsed value. Whereas my
proposed model can be adopted for use there as well (I think; I
didn't actually try it out yet, as I was hoping I was overlooking
some other, fully reliable approach).
Jan
 
  | 
  
![]()  | 
            
         Lists.xenproject.org is hosted with RackSpace, monitoring our  |