[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] Xen/timer: Disable watchdog during dumping timer queues



On 9/19/2016 10:46 PM, Jan Beulich wrote:
Well, without a clear understanding of why the issue occurs (for
>> which I need to refer you back to the questionable stack dump)
>> I'm hesitant to agree to this step, yet ...
>
> After some researches, I found do_invalid_op() on the stack dump is
> caused by run_in_exception_handler(__ns16550_poll) in the ns16550_poll()
> rather than fatal event. The timeout issue still exists when run
> __ns16550_poll() directly in the ns16550_poll().
Well, I then still don't see why e.g. dump_domains() doesn't also need
it.

After testing, dump_domains() also has such issue after I create two VM
with 128 vcpus.

Earlier you did say:

  Keyhandler may run in the timer handler and the following log shows
  calltrace. The timer subsystem run all expired timers' handler
  before programing next timer event. If keyhandler runs longer than
  timeout, there will be no chance to configure timer before triggering
  watchdog and hypervisor rebooting.

The fact that using debug keys may adversely affect the rest of the
system is known. And the nesting of process_pending_softirqs()
inside do_softirq() should, from looking at them, work fine. So I
continue to have trouble seeing the specific reason for the problem
you say you observe.

The precondition of process_pending_softirq() working in the debug key
handler is that timer interrupt arrives on time and nmi_timer_fn() can
run to update nmi_timer_ticks before watchdog timeout.

When a timer interrupt arrives, timer_softirq_action() will run all
expired timer handlers before programing next timer interrupt via
reprogram_timer(). If a timer handler runs too long E,G >5s(Time for
watchdog timeout is default to be 5s.), this will cause no timer
interrupt arriving within 5s and nmi_timer_fn() also won't be called.
Does this make sense to you?


And as a separate note - dump_registers() is quite an exception
among the key handlers, and that's for a good reason (as the
comment there says). So I continue to be hesitant to see this
spread to other key handlers.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.