Re: [Xen-devel] [PATCH] x86/nmi: lower initial watchdog frequency to avoid boot hangs
On 06/02/18 16:23, Jan Beulich wrote:
>>>> On 06.02.18 at 17:14, <igor.druzhinin@xxxxxxxxxx> wrote:
>> On 06/02/18 16:07, Jan Beulich wrote:
>>>>>> On 05.02.18 at 22:18, <igor.druzhinin@xxxxxxxxxx> wrote:
>>>> --- a/xen/arch/x86/nmi.c
>>>> +++ b/xen/arch/x86/nmi.c
>>>> @@ -34,7 +34,8 @@
>>>> #include <asm/apic.h>
>>>>
>>>> unsigned int nmi_watchdog = NMI_NONE;
>>>> -static unsigned int nmi_hz = HZ;
>>>> +/* initial watchdog frequency - shouldn't be too high to avoid boot hangs
>> */
>>>> +static unsigned int nmi_hz = HZ / 10;
>>>
>>> For one - the comment should explain what "too high" means.
>>> Further - what if on another system 10Hz is still too high? I also hope
>>> you realize that you slow down boot a little for everyone just
>>> because of this one machine model. Can the lower frequency perhaps
>>> be set via DMI quirk, or otherwise obtain a command line override
>>> (perhaps something like "watchdog=probe:10Hz")?
>>>
>>
>> I can improve the comment message.
>> Why does this change slow down anything while I'm lowering the frequency
>> - not making it higher?
>
> We wait for two occurrences of the NMI in wait_for_nmis().
>
>> The alternative approach would be to reshuffle
>> the code and take the reason before programming the next interrupt as
>> suggested before. In that case the actual frequency would be adjusted
>> naturally I think.
>
> Thinking about this, reading the reason early seems like a good idea
> to me irrespective of the issue here.
>

I ran a couple of experiments with different layouts in the NMI handler:
it doesn't really help, as merely having this instruction inside the
handler and running it at 100Hz breaks a number of timeouts in the SMP
bootstrap code and makes it unstable. So we are back to lowering the
frequency, as I'm now out of ideas.

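To put a rough number on the wait_for_nmis() point quoted above, here is a
minimal sketch of the arithmetic only. It assumes HZ is 100 (consistent with
the 100Hz and 10Hz figures in this thread) and that the wait simply lasts
until the given number of NMIs has arrived; nmi_wait_ms() is a hypothetical
helper for illustration, not a function in nmi.c.

```c
#include <assert.h>

/* Assumption: HZ is 100, so HZ / 10 corresponds to the proposed 10Hz. */
#define HZ 100u

/*
 * Hypothetical helper (not part of Xen): milliseconds needed to observe
 * `count` watchdog NMIs when the watchdog fires nmi_hz times per second.
 */
static unsigned int nmi_wait_ms(unsigned int nmi_hz, unsigned int count)
{
    assert(nmi_hz != 0); /* guard against division by zero */
    return count * 1000u / nmi_hz;
}
```

Under this model, nmi_wait_ms(HZ, 2) is 20 and nmi_wait_ms(HZ / 10, 2) is
200, i.e. each wait for two NMIs grows from about 20ms to about 200ms.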
The problem with a quirk or command line parameter is that the issue has
been reported on a wide variety of systems and appears to depend on the
default BIOS setup, which makes it hard to identify the affected machines.
We should obviously sort this out with Intel, but until then lowering the
initial frequency is our only option.

Igor

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel