[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [xen-unstable test] 17780: tolerable FAIL - PUSHED

>>> On 22.04.13 at 21:41, xen.org <ian.jackson@xxxxxxxxxxxxx> wrote:
> flight 17780 xen-unstable real [real]
> http://www.chiark.greenend.org.uk/~xensrcts/logs/17780/ 
> Failures :-/ but no regressions.
> Regressions which are regarded as allowable (not blocking):
>  test-amd64-amd64-xl-sedf      7 debian-install            fail REGR. vs. 
> 17778


how do the timestamps in the serial log get generated? Particularly,
are these in any way dependent on the time of the host that the
logs originate from? I'm asking because this sequence

Apr 22 15:30:59.512179 (XEN) UPD03: t=0030ad29f3b4 l=000c5e44bd4d m=000c5e45242a
Apr 22 15:30:59.524107 (XEN) UPD07: t=0030ad29f385 l=000c5e44bcf2 m=000c5e45242a
Apr 22 15:30:59.524165 (XEN) UPD06: t=0030ad29f391 l=000c5e44bcf2 m=000c5e45242a
Apr 22 15:30:59.532103 (XEN) UPD04: t=0030ad29f3cc l=000c5e44bd3c m=000c5e45242a
Apr 22 15:30:59.532157 (XEN) UPD00: t=0030ad29f32e l=000c5e44bd2c m=000c5e45242a
Apr 22 15:30:59.544108 (XEN) UPD02: t=0030ad29f3bc l=000c5e44bd50 m=000c5e45242a
Apr 22 15:30:59.544164 (XEN) UPD01: t=0030ad29f3d2 l=000c5e44bd22 m=000c5e45242a
Apr 22 15:30:59.547711 (XEN) PLT: s=000c5e467b19 c=00008e48a66e
Apr 22 15:30:59.547759 (XEN) UPD02: t=00a0375364b2 l=00698c938de5 m=0023b5d300d8
Apr 22 15:37:39.736264 (XEN) UPD01: t=00a0375364de l=00698c938dfb m=0023b5d300d8
Apr 22 15:37:39.748665 (XEN) UPD04: t=00a03753648a l=00698c938dba m=0023b5d300d8
Apr 22 15:37:39.748749 (XEN) UPD06: t=00a0375364be l=00698c938e13 m=0023b5d300d8
Apr 22 15:37:39.756571 (XEN) Platform timer appears to have unexpectedly 
wrapped 1 times.
Apr 22 15:37:39.756631 (XEN) UPD07: t=00a0375364c6 l=00698c938e2b m=0023b5d300d8
Apr 22 15:37:39.768560 (XEN) UPD03: t=00a0375364ba l=00698c938df6 m=0023b5d300d8
Apr 22 15:37:39.768615 (XEN) UPD05: t=00a037536482 l=00698c938dba m=0023b5d300d8
Apr 22 15:37:39.776557 (XEN) PLT: n=00698c93df6f pn=00698d31b1b1 
pm=0000ffffffff pw=00af64900db1
Apr 22 15:37:39.776618 (XEN) PLT: c=0000e3d7c0e5 d=823444f2 s=0001e3d7c0e5
Apr 22 15:37:39.788591 (XEN) UPD00: t=00a037536426 l=00698c938de7 m=0023b5d300d8
Apr 22 15:37:39.788647 (XEN) PLT: s=00698d33fccb c=0001e3d7c94d

appears to be inconsistent, in that the second UPD02 instance has a
log time stamp that is almost 7 minutes too early, while the immediately
preceding PLT: line still looks consistent. If this can't be explained by
how the logs get time stamped, this could be a hint at the problem
we're having with the timer code (yet I can't seem to make sense of
that "hint").


the almost 7 minute gap hints at C-state management being broken
on these machines, yet it already assumes the LAPIC timer to be
reliable only in C1, so theoretically the HPET broadcast mechanism
ought to keep the CPUs waking up from deeper C states.

What's puzzling too is that these issues only ever occur on SEDF,
and with SEDF having been broken entirely until relatively recently,
this information can't really be used to point at some specific time
range during which an eventual regression slipped in. Of course,
with the issue (as far as I'm aware) never having occurred on 4.2,
the new MWAIT idle driver might be part of the problem (despite it
leveraging the same HPET/PIT broadcast mechanism as the ACPI
one). And wrt SEDF, I don't think the scheduler itself is at fault
here - it seems more likely to me that it merely uses timers in
ways sufficiently different from the credit one to (once in a while)
have an effect on wakeup behavior.

If anyone has any other ideas on how to explain the observed
behavior, please let me know!


Xen-devel mailing list



Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.