Re: [Wg-test-framework] baroque1 hardware problem
On Wed, 20 May 2015 11:54:51 -0400
Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx> wrote:
> Ian Jackson writes ("Minutes All-Net synchronisation call, 20th May"):
> > ACTION: Ian to send technical details, so that All-net can raise
> > with supplier (Intel).
>
> So, the problem is as follows:
>
>
> Summary:
> --------
>
> Sometimes, when powered on, baroque1 does not come up.
>
> Symptoms are that the serial control lines do change (visible in
> sympathy log), but no text is printed on the serial console. Waiting
> a long time (up to at least ten minutes) has no effect. Sending
> "return" on the serial console elicits no response.
>
> After the problem has occurred, often more than one further attempt to
> power cycle the machine is required to get it to work again. After
> that it works normally until the fault recurs.
>
>
> Repro method:
> -------------
>
> * Write a pxeboot file which refers to a stock Wheezy amd64
> debian-installer netboot image, and specifies a preseed file.
>
> * Power off (via the PDU).
>
> * Wait 30s.
>
> * Power on (via the PDU).
>
> * Monitor the preseed file http server access log waiting for the
> preseed file to be downloaded.
>
> * When the preseed file has been fetched, declare "success".
>
> Then run round for the next repetition. (Generally, this means
> that the server is powered off in the middle of one of
> debian-installer's software-fetching steps.)
>
> Alternatively, after 350s, declare "failure" and stop.
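For reference, the "preseed file has been fetched" check could be sketched roughly as below. This assumes the http server writes a common/combined-format access log in which the request appears as a quoted "GET <path> HTTP/x.y" string; the function name, the sample path and the sample log lines are all made up for illustration and are not taken from the actual setup.

```python
def preseed_fetched(log_lines, preseed_path):
    """Return True if any access-log line records a GET for preseed_path.

    Assumes the request is logged as "GET <path> HTTP/x.y" (common log format).
    """
    needle = '"GET ' + preseed_path + ' '
    return any(needle in line for line in log_lines)


# Hypothetical log lines, only to show the matching behaviour.
sample_log = [
    '10.0.0.5 - - [20/May/2015:11:00:01 +0000] "GET /other.cfg HTTP/1.1" 200 512',
    '10.0.0.5 - - [20/May/2015:11:00:09 +0000] "GET /preseed.cfg HTTP/1.1" 200 2048',
]
print(preseed_fetched(sample_log, "/preseed.cfg"))
```

A real harness would re-read the log periodically and give up after the 350s limit mentioned above.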
>
>
> Statistical information:
> ------------------------
>
> * My records show failures after the following number of repetitions:
> 96 (not quite sure about this - data collection was affected by an
> unrelated network problem on my workstation)
> 29, 25, 26.
>
> * My records show the following number of attempts needed to get the
> machine to work at all, again:
> 1, 3, 2, 3
>
> * I have run the same test on baroque0. It has managed (at least) 400
> consecutive power cycle restarts without problem.
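As a rough sanity check on those numbers, here is a small sketch of the arithmetic. It assumes (my assumption, not stated in the records) that each count is the number of power cycles up to and including the failing one, and that cycles fail independently with a fixed probability. Under those assumptions the chance of baroque0 surviving 400 cycles by luck alone, if it shared baroque1's failure rate, comes out on the order of 1 in 10,000, and smaller still if the doubtful 96 run is left out.

```python
# Illustrative arithmetic only; the reading of the counts is an assumption.
runs_until_failure = [96, 29, 25, 26]  # baroque1 (the 96 is the doubtful one)
baroque0_clean_cycles = 400            # baroque0, no failures observed


def per_cycle_failure_rate(runs):
    """Estimate the failure probability per power cycle as failures / cycles."""
    return len(runs) / sum(runs)


def chance_of_clean_streak(p_fail, cycles):
    """Probability of `cycles` consecutive successes at failure rate p_fail."""
    return (1.0 - p_fail) ** cycles


p_all = per_cycle_failure_rate(runs_until_failure)
p_without_96 = per_cycle_failure_rate(runs_until_failure[1:])

for label, p in (("all runs", p_all), ("without the 96", p_without_96)):
    streak = chance_of_clean_streak(p, baroque0_clean_cycles)
    print(f"{label}: p_fail ~ {p:.4f}, chance of 400 clean cycles ~ {streak:.2e}")
```

So the difference between the two machines is very unlikely to be chance, whichever way the 96 is counted.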
>
>
> Handover:
> ---------
>
> I hereby hand both baroque0 and baroque1 over to you. (It seems most
> sensible to give you the working machine too, for comparison.)
Noted.
> The current setup in the colo is the PXE configuration as described
> above.
>
> So I think it should be possible to reproduce the problem as follows:
>
> - power baroque1 off
> - wait 30s
> - power baroque1 on
>
> - wait for it to show life on the serial console
>
> - wait for it to show entry into debian-installer
> (eg wait for "Setting up the clock" to be printed on the
> serial console), then declare success and go round again
I should be able to modify the oseleta test script to do this.
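Roughly along these lines, as a sketch only (this is not the oseleta script). The three callables are hypothetical placeholders for whatever drives the PDU and watches the serial console; the 30s wait is from the steps above and the 350s timeout is borrowed from the original repro method.

```python
import time


def run_power_cycle_test(power_off, power_on, saw_boot_marker,
                         max_cycles=400, off_wait=30, timeout=350,
                         sleep=time.sleep):
    """Power-cycle repeatedly and return the number of clean cycles.

    power_off / power_on: hypothetical callables that switch the PDU outlet.
    saw_boot_marker(timeout): hypothetical callable that returns True if the
        expected text (eg "Setting up the clock") appeared on the serial
        console within `timeout` seconds.
    Returns the count of successful cycles before the first failure, or
    max_cycles if no cycle failed.
    """
    for cycle in range(1, max_cycles + 1):
        power_off()
        sleep(off_wait)            # wait 30s with the power removed
        power_on()
        if not saw_boot_marker(timeout):
            return cycle - 1       # this cycle failed; stop for inspection
    return max_cycles
```

Stopping at the first failure leaves the machine in the broken state so it can be looked at.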
> I have disconnected the serial consoles of both machines from
> sympathy, so you should be able to connect to them with picocom or
> expect on /dev/ttyRP5 and /dev/ttyRP6.
Thanks.
> NB that I am now going to be away until next Wednesday morning.
NB'ed.
-d
> Ian.
>
_______________________________________________
Wg-test-framework mailing list
Wg-test-framework@xxxxxxxxxxxxxxxxxxxx
http://lists.xenproject.org/cgi-bin/mailman/listinfo/wg-test-framework