
Re: [Xen-devel] [PATCH] x86/nmi: lower initial watchdog frequency to avoid boot hangs



On Thu, 8 Feb 2018 10:47:45 +0000
Igor Druzhinin <igor.druzhinin@xxxxxxxxxx> wrote:
>I've done this measurement before. So what we are seeing exactly is
>that the time we are spending in SMI is spiking (sometimes up to
>200ms) at the moment we go through INIT-SIPI-SIPI sequence. Looks like
>this is enough to push the system into a livelock spiral. So I agree
>with Jan to some point that the proposed workaround might not be
>working on some systems.

According to the Xen code, NMIs are expected for 2 primary purposes:
- watchdog NMI from LAPIC
- "system" NMIs (like due to SERR)

Most of the time we deal with watchdog NMIs, while all others should be
somewhat rare. The thing is, we actually need to read I/O port 61h on
system NMIs only. 

If the main problem is a stream of SMIs caused by reading port 61h on
every NMI watchdog tick -- why not avoid reading it?

There are at least 2 ways to check if the NMI was due to a watchdog
tick:
- LAPIC (SDM states that "When a performance monitoring counters
interrupt is generated, the mask bit for its associated LVT entry is
set")
- perf MSR overflow bit

So, if we detect early in the NMI handler, using one of these methods,
that the NMI was caused by the watchdog, we can avoid touching port 61h
and thus avoid triggering the SMI I/O trap on it.
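
A minimal sketch of that ordering, written as user-space C so it can be
compiled and run outside the hypervisor. This is not Xen's actual NMI
handler: the names nmi_snapshot, classify_nmi and fake_port61 are
hypothetical, the LAPIC LVT entry and the perf overflow status are
assumed to have been sampled by the caller, and the port read is passed
in as a callback standing in for inb(0x61). Only the LVT mask bit
(bit 16) and the port 61h SERR/IOCHK status bits (7 and 6) are taken
from the hardware definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LVT_MASKED    (1u << 16)  /* mask bit of a LAPIC LVT entry */
#define PORT61_SERR   0x80u       /* port 61h bit 7: SERR# NMI status */
#define PORT61_IOCHK  0x40u       /* port 61h bit 6: IOCHK# NMI status */

enum nmi_source { NMI_WATCHDOG, NMI_SERR, NMI_IOCHK, NMI_UNKNOWN };

/* Hypothetical snapshot of state sampled on entry to the NMI handler. */
struct nmi_snapshot {
    uint32_t lvtpc;            /* LAPIC LVT performance counter entry */
    uint64_t overflow_status;  /* perf counter overflow status bits */
    uint64_t watchdog_bit;     /* bit of the counter used by the watchdog */
};

/* Stand-in for inb(0x61); the real read is what triggers the SMI trap. */
typedef uint8_t (*port61_reader_t)(void);

static enum nmi_source classify_nmi(const struct nmi_snapshot *s,
                                    port61_reader_t read_port61)
{
    uint8_t reason;

    assert(s != NULL && read_port61 != NULL);

    /* Either indication is treated as "this was a watchdog tick". */
    if ((s->lvtpc & LVT_MASKED) || (s->overflow_status & s->watchdog_bit))
        return NMI_WATCHDOG;   /* port 61h is never touched on this path */

    /* Only NMIs not explained by the watchdog pay for the port read. */
    reason = read_port61();
    if (reason & PORT61_SERR)
        return NMI_SERR;
    if (reason & PORT61_IOCHK)
        return NMI_IOCHK;
    return NMI_UNKNOWN;
}

/* Fake port used to count reads when exercising the sketch. */
static unsigned int port61_reads;
static uint8_t fake_port61_value;

static uint8_t fake_port61(void)
{
    port61_reads++;
    return fake_port61_value;
}
```

Calling classify_nmi() with a snapshot whose LVT entry is masked returns
NMI_WATCHDOG and leaves port61_reads unchanged, which is the whole
point. One limitation of the sketch as written: a system NMI arriving
together with a watchdog tick is not examined in that pass.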

>> There might be a chance that perf counter frequency is calculated
>> wrong for some systems, resulting in a very high rate of NMI
>> watchdog ticks instead of long SMI handler execution time. >10ms
>> just looks... too extreme.
>>   
>
>We ruled that out.
>
>> Huawei Server 2488 V5 BIOS -- similar SMI I/O trap handler for the
>> port 61h found. Some differences with gigabyte H270 system though:
>> 
>> - no "allocated" I/O traps anymore, but one additional SMI I/O trap
>>   encountered: port 900h, dword size. Possibly related to PCIe PM
>>   facilities.
>> 
>> - port 61h SMI handler now has multiple calls to debug/assert stub
>>   functions -- there might be a chance that some of impacted systems
>>   had debug build on, resulting in those stubs expanded to some real
>>   debugging code with negative impact on SMI handling speed.
>> 
>> Few additional observations:
>> 
>> - port 61h I/O Trap SMI handler checks accessed I/O address/size to
>> be equal to 61h/1byte. There might be some difference when reading
>> port 61h via inw(0x60)/inl(0x60)/etc
>> 
>> - looks like there exist an alternative way to read NMI status
>> without triggering SMI -- via ports 63h/65h/67h, but this depends on
>>   undocumented bit in Generic Control and Status register
>>   


