
Re: [Xen-devel] [PATCH v3] x86/nmi: start NMI watchdog on CPU0 after SMP bootstrap



On 19/02/18 15:18, Jan Beulich wrote:
>>>> On 19.02.18 at 15:23, <igor.druzhinin@xxxxxxxxxx> wrote:
>> We're noticing a reproducible system boot hang on certain
>> post-Skylake platforms where the BIOS is configured in
>> legacy boot mode with x2APIC disabled. The system stalls
>> immediately after writing the first SMP initialization
>> sequence into APIC ICR.
>>
>> The cause of the problem is watchdog NMI handler execution -
>> somewhere near the end of NMI handling (after it's already
>> rescheduled the next NMI) it tries to access IO port 0x61
>> to get the actual NMI reason on CPU0. Unfortunately, this
>> port is emulated by the BIOS using SMIs, and this emulation for
>> some reason takes more time than we expect during the INIT-SIPI-SIPI
>> sequence. As a result, the system is constantly moving between
>> the NMI and SMI handlers and not making any progress.
>>
>> To avoid this, initialize the watchdog after SMP bootstrap on
>> CPU0 and, additionally, protect the NMI handler by moving
>> IO port access before NMI re-scheduling. The latter should help
>> in case of post-boot CPU onlining. Although we're running the
>> watchdog at a much lower frequency, it's nevertheless possible
>> that we trigger the issue anyway.
> 
> I'm afraid I can't connect "the latter" to anything earlier in the
> description.

It refers to the previous sentence. The patch does two things: it starts
the watchdog on CPU0 after SMP bootstrap, and it protects the NMI handler
by moving the IO port access before NMI re-scheduling. "The latter" is the
second of these.
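
For illustration only, the reordering can be sketched as below. This is not
the Xen code: every name here (rearm_watchdog, read_nmi_reason,
nmi_handler_old, nmi_handler_new, the event log) is hypothetical, and the
hardware accesses are replaced by entries in a log so that only the ordering
is shown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event log standing in for real hardware side effects. */
enum event { EV_REARM_WATCHDOG, EV_READ_PORT_61 };

#define MAX_EVENTS 8
static enum event event_log[MAX_EVENTS];
static size_t event_count;

static void record(enum event ev)
{
    assert(event_count < MAX_EVENTS);
    event_log[event_count++] = ev;
}

static void reset_log(void)
{
    event_count = 0;
}

/* Stand-in for re-arming the source of the next watchdog NMI. */
static void rearm_watchdog(void)
{
    record(EV_REARM_WATCHDOG);
}

/*
 * Stand-in for reading IO port 0x61 to get the NMI reason. On the
 * affected platforms the real read is emulated by the BIOS via SMI
 * and can be slow; here it only logs the access and returns 0.
 */
static unsigned char read_nmi_reason(void)
{
    record(EV_READ_PORT_61);
    return 0;
}

/* Old order: the next NMI is already scheduled while the port read runs. */
static void nmi_handler_old(void)
{
    rearm_watchdog();
    (void)read_nmi_reason();
}

/* New order: the port read completes before the next NMI is scheduled. */
static void nmi_handler_new(void)
{
    (void)read_nmi_reason();
    rearm_watchdog();
}
```

With the new order, a slow port read can no longer overlap with an NMI that
the handler itself has just scheduled, which is the point of the second
change.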

Igor

