
Re: [Xen-devel] [PATCH] x86/nmi: lower initial watchdog frequency to avoid boot hangs

On 08/02/18 12:32, Alexey G wrote:
> On Thu, 8 Feb 2018 10:47:45 +0000
> Igor Druzhinin <igor.druzhinin@xxxxxxxxxx> wrote:
>> I've done this measurement before. So what we are seeing exactly is
>> that the time we are spending in SMI is spiking (sometimes up to
>> 200ms) at the moment we go through INIT-SIPI-SIPI sequence. Looks like
>> this is enough to push the system into a livelock spiral. So I agree
>> with Jan to some point that the proposed workaround might not be
>> working on some systems.
> According to the Xen code, NMIs are expected for 2 primary purposes:
> - watchdog NMI from LAPIC
> - "system" NMIs (like due to SERR)

- Perf/Oprofile.  This is currently mutually exclusive with Xen using
the watchdog, but needn't be and hopefully won't be in the future.

> Most of the time we deal with watchdog NMIs, while all others should be
> somewhat rare. The thing is, we actually need to read I/O port 61h on
> system NMIs only. 
> If the main problem lies in a flow of SMIs due to reading port 61h on
> every NMI watchdog tick -- why not avoid reading it?
> There are at least 2 ways to check if the NMI was due to a watchdog
> tick:
> - LAPIC (SDM states that "When a performance monitoring counters
> interrupt is generated, the mask bit for its associated LVT entry is
> set")
> - perf MSR overflow bit
> So, if we detect it was an NMI due to a watchdog using these
> methods (early in the NMI handler), we can avoid touching port 61h
> and thus triggering the SMI I/O trap on it.
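
For reference, the proposal above can be sketched roughly as follows. This
is only an illustration, not Xen's actual handler: classify_nmi(),
fake_read_port61() and the enum are hypothetical names, the callback stands
in for inb(0x61), and only the LVT mask bit (bit 16) is used as the
"watchdog" test. Bits 7 and 6 of port 61h are the SERR and IOCHK status
bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LVT_MASKED    (1u << 16)  /* mask bit of a LAPIC LVT entry */
#define PORT61_SERR   0x80u       /* port 61h bit 7: SERR / parity error */
#define PORT61_IOCHK  0x40u       /* port 61h bit 6: I/O check error */

enum nmi_source { NMI_WATCHDOG, NMI_SERR, NMI_IOCHK, NMI_UNKNOWN };

/* Hypothetical stand-in for inb(0x61) that counts how often it is hit. */
static unsigned int fake_port61_reads;
static uint8_t fake_port61_value;

static uint8_t fake_read_port61(void)
{
    fake_port61_reads++;
    return fake_port61_value;
}

/*
 * Classify an NMI from the LVT performance counter entry first, and only
 * fall back to the (possibly SMI-trapped) port 61h read when the LVT
 * entry does not look like a watchdog tick.
 */
static enum nmi_source classify_nmi(uint32_t lvtpc,
                                    uint8_t (*read_port61)(void))
{
    assert(read_port61 != NULL);

    if (lvtpc & LVT_MASKED)
        return NMI_WATCHDOG;      /* port 61h is never touched here */

    uint8_t reason = read_port61();

    if (reason & PORT61_SERR)
        return NMI_SERR;
    if (reason & PORT61_IOCHK)
        return NMI_IOCHK;
    return NMI_UNKNOWN;
}
```

Note that the early return is exactly where the port read, and with it any
check for a simultaneous system NMI, gets skipped.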

The problem is multiple NMIs arriving at once.  As with all other
edge-triggered interrupts, extra arrivals get dropped.  By skipping the
0x61 read whenever we believe it was a watchdog NMI, we open a race
condition in which a system NMI arriving alongside a watchdog NMI is
missed completely.
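
To make the race concrete, here is a toy model. It is a sketch under
simplifying assumptions, not real hardware behaviour or Xen code: a single
boolean stands in for the CPU's edge-triggered NMI latch, two flags stand in
for the watchdog indication and for bit 7 of port 0x61, and all names
(cpu_model, raise_serr(), deliver_nmis(), ...) are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

struct cpu_model {
    bool nmi_latched;          /* single edge-triggered NMI latch */
    bool watchdog_flag;        /* stands in for the LVT mask/overflow bit */
    bool serr_flag;            /* stands in for bit 7 of port 0x61 */
    unsigned int port_reads;   /* how many times "port 0x61" was read */
    unsigned int watchdog_handled;
    unsigned int serr_handled;
};

/* Both sources raise the same latch; a second edge is simply absorbed. */
static void raise_watchdog(struct cpu_model *c)
{
    c->watchdog_flag = true;
    c->nmi_latched = true;
}

static void raise_serr(struct cpu_model *c)
{
    c->serr_flag = true;
    c->nmi_latched = true;
}

/* Run the handler once per latched NMI. */
static void deliver_nmis(struct cpu_model *c, bool skip_read_on_watchdog)
{
    while (c->nmi_latched) {
        c->nmi_latched = false;

        bool was_watchdog = c->watchdog_flag;
        if (was_watchdog) {
            c->watchdog_flag = false;
            c->watchdog_handled++;
            if (skip_read_on_watchdog)
                continue;      /* proposed shortcut: no port read */
        }

        c->port_reads++;
        if (c->serr_flag) {
            c->serr_flag = false;
            c->serr_handled++;
        }
    }
}

/* A watchdog tick and an SERR arrive before the handler gets to run. */
static struct cpu_model run_coincident_nmis(bool skip_read_on_watchdog)
{
    struct cpu_model c = {0};

    raise_watchdog(&c);
    raise_serr(&c);            /* latch already set: no second NMI */
    deliver_nmis(&c, skip_read_on_watchdog);
    assert(!c.nmi_latched);    /* nothing left to deliver */
    return c;
}
```

With skip_read_on_watchdog false, the one delivered NMI handles both the
watchdog tick and the SERR.  With it true, the handler returns after the
watchdog tick, serr_flag stays set, and no further NMI arrives to prompt
anyone to look at it.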

