
Re: [Xen-devel] [PATCH] x86/nmi: lower initial watchdog frequency to avoid boot hangs



On 06/02/18 18:17, Alexey G wrote:
> On Tue, 6 Feb 2018 17:21:19 +0000
> Igor Druzhinin <igor.druzhinin@xxxxxxxxxx> wrote:
>> On 06/02/18 17:08, Alexey G wrote:
>>> The major concern here is the possibility of SMI being triggered _not_
>>> by some specific I/O port access. Primarily, if it actually was a
>>> periodic SMI.
>>>
>>> If the actual SMI source is not related to some place in the NMI
>>> handler code but was e.g. due to some SMI timer, lowering the NMI watchdog
>>> frequency might not fix the issue completely, but only lower its
>>> reproducibility (perhaps to some very rare occurrences). So it's
>>> better to be sure what the real source of the SMI was.
>>>   
>>
>> This *is* related to this instruction - it was confirmed empirically.
>> Removing this instruction stops SMIs from occurring and effectively
>> removes the issue, even with the frequency left unchanged.
> 
> Hmm, it would be interesting to know for what evil purpose it needs
> to trap I/O port 61h.
> BTW, on which motherboard model was the issue reproduced?
> 

The issue has been reported on some Dell/Huawei Skylake platforms (one
of them a PowerEdge R740, to be precise), but I don't think other platforms
are unaffected (the issue supposedly originates from Intel's reference code)
- the default BIOS setup indeed matters.

>>> 2. According to the code, it looks like NMI status reading happens
>>> while NMIs are still blocked -- this means that SMI handler must
>>> exec IRET by itself to reset NMI blocking state -- again, this is
>>> possible (e.g. in unreal->protmode switching code), but not likely.
>>>   
>>
>> According to the SDM, one NMI can be latched as pending while the CPU is
>> in SMM (see ch. 34.8). The same is true if an NMI arrives while another
>> NMI is being serviced. So when we return from the SMI to the NMI handler
>> and finish it properly, the next NMI is delivered immediately.
> 
> If the SMI handler doesn't mess with the NMI blocking state, it
> means that the SMI handler takes longer to process each read of port 61h
> than one watchdog NMI period... which is quite long. The motherboard vendor
> did something very wrong with I/O trap handling in the SMI handler code
> if it takes that long.
> 

We're going to inform the vendors.
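
The timing argument discussed above can be illustrated with a toy model. This
is only a sketch, not Xen code: the function name, its parameters and all the
numbers are made up for illustration and are not measurements from the
affected platforms. It assumes that every NMI handler run triggers one SMI
(via the port 61h read), that NMIs stay blocked until the handler finishes,
and that at most one NMI is latched as pending in the meantime.

```python
def simulate(period_us, handler_us, smi_us, total_us):
    """Return the microseconds of normal (non-NMI) execution in [0, total_us).

    Hypothetical model: a watchdog NMI fires every period_us; servicing one
    NMI costs handler_us plus smi_us for the SMI triggered inside the handler.
    """
    service_us = handler_us + smi_us
    now = 0
    next_tick = period_us      # watchdog ticks at period, 2*period, ...
    pending = False            # at most one NMI is latched while blocked
    progress = 0
    while now < total_us:
        if pending:
            # The latched NMI is delivered right after the handler returns.
            pending = False
        else:
            # Normal code runs until the next watchdog tick (or the end).
            run = min(next_tick, total_us) - now
            progress += run
            now += run
            if now >= total_us:
                break
            next_tick += period_us
        # Service the NMI; ticks arriving meanwhile latch a single pending NMI.
        end = now + service_us
        while next_tick <= end:
            pending = True
            next_tick += period_us
        now = end
    return progress


if __name__ == "__main__":
    # SMI longer than the watchdog period: no progress after the first tick.
    print(simulate(100, 10, 200, 10_000))
    # Same SMI cost, ten times lower watchdog frequency: forward progress.
    print(simulate(1000, 10, 200, 10_000))
```

In this model, once handler time plus SMI time exceeds the watchdog period,
a new NMI is always pending when the handler returns, so normal code never
runs again. Lowering the frequency restores progress without removing the SMI
itself.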

> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxxx
> https://lists.xenproject.org/mailman/listinfo/xen-devel
> 
