
Re: [Xen-devel] [PATCH] x86/watchdog: Use real timestamps for watchdog timeout

>>> On 24.05.13 at 14:41, Tim Deegan <tim@xxxxxxx> wrote:
> At 12:36 +0100 on 24 May (1369398982), Jan Beulich wrote:
>> > 1) Along with the local_irq_disable()/enable() pairs in
>> > local_time_calibration, having an atomic_t indicating "time data update
>> > in progress", allowing the NMI handler to decide to bail early.
>> > 
>> > 2) Modify local_time_calibration() to fill in a shadow cpu_time set, and
>> > a different atomic_t to indicate which one is consistent.  This would
>> > allow the NMI handler to always use one consistent set of timing
>> > information.
> Of those two, I prefer (1), just because it doesn't add any cost to the
> normal users of NOW().

The reason I dislike (1) is that there is a probability, however
small, of many or all NMI instances happening just while the time
data is being updated, so that all of them bail early and the
watchdog never fires.
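
For contrast, a minimal sketch of what (2) could look like, written as
plain C11 rather than Xen code. The struct layout, field names and the
helpers shadowed_time_init(), publish_time() and read_time() are all
hypothetical. The writer only ever touches the inactive set and then
flips the index, so a reader interrupting it at any point still sees a
complete set and never has to bail.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-CPU time data; names are assumptions. */
struct time_snapshot {
    uint64_t tsc_stamp;    /* TSC value at the last calibration */
    uint64_t stime_stamp;  /* system time (ns) at the last calibration */
    uint32_t mul_frac;     /* TSC -> ns scale factor */
    int      shift;        /* pre-scale shift */
};

struct shadowed_time {
    struct time_snapshot set[2];
    atomic_uint valid;     /* index (0 or 1) of the consistent set */
};

/* Start with both sets identical and set 0 marked as the consistent one. */
static void shadowed_time_init(struct shadowed_time *t,
                               const struct time_snapshot *initial)
{
    assert(t && initial);
    t->set[0] = *initial;
    t->set[1] = *initial;
    atomic_init(&t->valid, 0u);
}

/* Writer (calibration path): fill the inactive set, then publish it. */
static void publish_time(struct shadowed_time *t,
                         const struct time_snapshot *fresh)
{
    unsigned int next =
        atomic_load_explicit(&t->valid, memory_order_relaxed) ^ 1u;

    t->set[next] = *fresh;
    /* Release: the set contents are visible before the index flips. */
    atomic_store_explicit(&t->valid, next, memory_order_release);
}

/* Reader (e.g. an NMI handler): copy whichever set is marked consistent. */
static struct time_snapshot read_time(struct shadowed_time *t)
{
    unsigned int cur =
        atomic_load_explicit(&t->valid, memory_order_acquire);

    return t->set[cur];
}
```

The cost Tim mentions shows up in read_time(): every reader, not just
the NMI handler, now goes through the extra index load.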

> Using TSC to gate the actual watchdog crash might get a bit messy,
> especially if it ends up adding code to the users of write_tsc().

The only problematic write_tsc() user is the one that recovers from
a stopped counter in deep C states. This is not a meaningful problem
because, as just said in another reply, NMI storms and long periods
spent in deep C states are mutually exclusive.

> And nmi_watchdog_tick() can just check regs->eip as a
> hint not to trust the scale factors. :)

By doing a range check looking for it to point into some function?
How would that cope with LTO?
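
To make the question concrete, a range check of that kind might look
roughly like the sketch below. The struct text_range type and
ip_in_range() helper are hypothetical, not existing Xen interfaces.
The check is only as good as the assumption that the code of interest
lives in one contiguous range: once the function gets inlined into its
callers (as LTO may do), the interrupted instruction pointer can sit
outside [start, end) while still executing that code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical descriptor for the address range occupied by one function. */
struct text_range {
    uintptr_t start;  /* address of the first byte of the function */
    uintptr_t end;    /* address one past the last byte */
};

/* True if the interrupted instruction pointer lies inside the range. */
static bool ip_in_range(uintptr_t ip, const struct text_range *r)
{
    assert(r && r->start <= r->end);
    return ip >= r->start && ip < r->end;
}
```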

