
Re: xen 4.14.3 incorrect (~3x) cpu frequency reported



On 06/01/2022 16:00, Jan Beulich wrote:
> On 06.01.2022 16:08, James Dingwall wrote:
>>>> On Wed, Jul 21, 2021 at 12:59:11PM +0200, Jan Beulich wrote:
>>>>> On 21.07.2021 11:29, James Dingwall wrote:
>>>>>> We have a system which intermittently starts up and reports an incorrect
>>>>>> cpu frequency:
>> ...
>>>> I'm sorry to ask, but have you got around to actually doing that? Or
>>>> else is resolving this no longer of interest?
>> We have experienced an occurrence of this issue on 4.14.3 with 'loglvl=all'
>> present on the xen command line.  I have attached the 'xl dmesg' output for
>> the fast MHz boot, the diff from the normal case is small so I've not added
>> that log separately:
>>
>> --- normal-mhz/xl-dmesg.txt     2022-01-06 14:13:47.231465234 +0000
>> +++ funny-mhz/xl-dmesg.txt      2022-01-06 13:45:43.825148510 +0000
>> @@ -211,7 +211,7 @@
>>  (XEN)  cap enforcement granularity: 10ms
>>  (XEN) load tracking window length 1073741824 ns
>>  (XEN) Platform timer is 24.000MHz HPET
>> -(XEN) Detected 2294.639 MHz processor.
>> +(XEN) Detected 7623.412 MHz processor.
>>  (XEN) EFI memory map:
>>  (XEN)  0000000000000-0000000007fff type=3 attr=000000000000000f
>>  (XEN)  0000000008000-000000003cfff type=7 attr=000000000000000f
>> @@ -616,6 +616,7 @@
>>  (XEN) PCI add device 0000:b7:00.1
>>  (XEN) PCI add device 0000:b7:00.2
>>  (XEN) PCI add device 0000:b7:00.3
>> +(XEN) Platform timer appears to have unexpectedly wrapped 10 or more times.
>>  (XEN) [VT-D]d0:PCIe: unmap 0000:65:00.2
>>  (XEN) [VT-D]d32753:PCIe: map 0000:65:00.2
>>  (XEN) [VT-D]d0:PCIe: unmap 0000:65:00.1
> Thanks. In an earlier mail the reported value was 6895.384 MHz, but I
> guess that was on a different system (with a base freq of 2200 MHz).
> I wonder how stable the too high value is ...
>
> For the moment I have only one possible explanation: An SMI hitting in
> the middle of the tail of init_hpet() (or init_pmtimer()), taking long
> enough to cause the function to return way too large a number. With a
> 50ms calibration period that would be about 166ms. I vaguely recall
> having heard of SMI potentially taking this long.
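
For reference, the 166ms figure follows from the ratio of the two
frequencies in the diff above. A rough sketch of the arithmetic, assuming
the 50ms calibration window Jan mentions and assuming the TSC kept
counting at its true rate while the platform timer read was delayed (the
variable names are made up for illustration):

```python
# Figures taken from the two 'xl dmesg' logs quoted above.
NORMAL_MHZ = 2294.639      # "Detected ... MHz" on a normal boot
REPORTED_MHZ = 7623.412    # "Detected ... MHz" on the bad boot
CALIBRATION_MS = 50.0      # calibration period mentioned by Jan (assumed)

# The reported frequency is TSC ticks divided by the *assumed* window,
# so the inflation factor equals real elapsed time / intended window.
ratio = REPORTED_MHZ / NORMAL_MHZ
elapsed_ms = CALIBRATION_MS * ratio
stall_ms = elapsed_ms - CALIBRATION_MS

print(f"inflation factor: {ratio:.3f}")
print(f"real elapsed time: {elapsed_ms:.1f} ms")
print(f"unaccounted stall: {stall_ms:.1f} ms")
```

So the calibration would have had to be stalled for something over 100ms
beyond the intended 50ms to produce the reported value.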

SMIs are stupidly long.  To avoid leaking secrets via speculation, SMIs
have to rendezvous at least the sibling threads, and SMM entry/exit
undergoes every flushing action which has been slowing down software
since 2018.

You can confirm SMIs using MSR_SMI_COUNT (0x34).  It's
non-architectural, but is present in Nehalem and later.
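
A minimal sketch of reading the counter, assuming a Linux environment with
the msr driver loaded (/dev/cpu/N/msr) and root privileges.  The helper
name and the `dev` override are hypothetical, and whether a read from dom0
reaches the real MSR under Xen is not something this sketch establishes:

```python
import os
import struct

MSR_SMI_COUNT = 0x34  # non-architectural; Intel Nehalem and later


def read_msr(reg, cpu=0, dev=None):
    """Read one 64-bit MSR via the Linux msr character device.

    The msr driver uses the file offset as the MSR index and returns
    8 bytes per read.  'dev' overrides the device path (illustration only).
    """
    path = dev if dev is not None else f"/dev/cpu/{cpu}/msr"
    fd = os.open(path, os.O_RDONLY)
    try:
        raw = os.pread(fd, 8, reg)
    finally:
        os.close(fd)
    if len(raw) != 8:
        raise OSError(f"short read from {path}")
    return struct.unpack("<Q", raw)[0]


def smi_count(cpu=0, dev=None):
    # The count lives in the low 32 bits; the upper bits are reserved.
    return read_msr(MSR_SMI_COUNT, cpu, dev) & 0xFFFFFFFF


if __name__ == "__main__":
    print(f"SMI count on CPU 0: {smi_count(0)}")
```

Sampling the value twice some time apart shows whether SMIs are still
arriving at runtime.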

~Andrew



 

