
Re: [xen-unstable-smoke test] 162597: regressions - FAIL

  • To: Stefano Stabellini <sstabellini@xxxxxxxxxx>, Bertrand Marquis <Bertrand.Marquis@xxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Fri, 11 Jun 2021 08:58:58 +0200
  • Cc: Julien Grall <julien@xxxxxxx>, osstest service owner <osstest-admin@xxxxxxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Fri, 11 Jun 2021 06:59:10 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 11.06.2021 03:49, Stefano Stabellini wrote:
> On Thu, 10 Jun 2021, Bertrand Marquis wrote:
>>> On 10 Jun 2021, at 12:32, Jan Beulich <jbeulich@xxxxxxxx> wrote:
>>> On 10.06.2021 12:50, osstest service owner wrote:
>>>> flight 162597 xen-unstable-smoke real [real]
>>>> flight 162602 xen-unstable-smoke real-retest [real]
>>>> http://logs.test-lab.xenproject.org/osstest/logs/162597/
>>>> http://logs.test-lab.xenproject.org/osstest/logs/162602/
>>>> Regressions :-(
>>>> Tests which did not succeed and are blocking,
>>>> including tests which could not be run:
>>>> test-armhf-armhf-xl         18 guest-start/debian.repeat fail REGR. vs. 162574
>>> This now being the 3rd failure in a row, I guess there's a fair chance
>>> of there actually being something wrong with ...
>>>> commit dfcffb128be46a3e413eaa941744536fe53c94b6
>>>> Author: Stefano Stabellini <sstabellini@xxxxxxxxxx>
>>>> Date:   Wed Jun 9 10:37:59 2021 -0700
>>>>    xen/arm32: SPSR_hyp/SPSR
>>>>    SPSR_hyp is not meant to be accessed from Hyp mode (EL2); accesses
>>>>    trigger UNPREDICTABLE behaviour. Xen should read/write SPSR instead.
>>>>    See: ARM DDI 0487D.b page G8-5993.
>>>>    This fixes booting Xen/arm32 on QEMU.
>>>>    Signed-off-by: Stefano Stabellini <stefano.stabellini@xxxxxxxxxx>
>>>>    Reviewed-by: Julien Grall <jgrall@xxxxxxxxxx>
>>>>    Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xxxxxxxxxx>
>>>>    Tested-by: Edgar E. Iglesias <edgar.iglesias@xxxxxxxxxx>
>>> ... this. My Arm-untrained eye couldn't spot anything in the logs.
>> I am not sure I am reading the log correctly: do I see it right that
>> dom0 started and it then failed to start a guest?

Well, in this particular flight it succeeded in creating Dom1 (for
guest-start) and then also managed to create Dom2, but failed to
get the expected "sign of life". The repeated attempt at which the
failure occurs varies (in one of the flights it occurred right at
guest-start), but the failure rate is high enough that so far none
of the flights has completed successfully. And with a failure rate
this high, a run succeeding by accident and thus leading to a push
would probably do us more harm than good.

> Thanks Jan for bringing it to my attention. 
> I am not an expert in reading OSSTest logs. From the following:
> http://logs.test-lab.xenproject.org/osstest/logs/162597/test-armhf-armhf-xl/info.html
> I understand that Xen booted and a DomU was started. However,
> "migrate-support-check" and "saverestore-support-check" failed. Is that
> correct?

Yes, but these two steps aren't the problem - afaict they always fail,
and hence wouldn't prevent a push.

It's guest-start/debian.repeat which is the problem in this flight.

> If so, it would be really strange for SPSR_hyp/SPSR to cause the problem
> because I would expect Xen to hang at boot before Dom0 is started
> instead.
> I don't have any ARMv7 hardware to try to repro this issue, and ARMv7 is
> most certainly required (ARMv8/aarch32 won't repro.)
> Could someone more at ease with OSSTest than me arrange for a run with
> this commit reverted to verify that it is the issue?
> In any case, I tried to figure it out. I guessed it could be a compiler
> error. I followed the white rabbit down the ARM ARM hole. I disassembled
> the Xen binary [1] from the failed job. "msr SPSR, r11" is 0x0026a38c.
> The encoding should be at B9.3.12 of the ARMv7-A DDI 0406C and F5.1.121
> of ARMv8 DDI 0487D.b. Unfortunately it doesn't seem to match either one
> of them and I don't understand why.
> The "mrs r11, SPSR" is generated as 0x00262ecc. That should be described
> at F5.1.117 for ARMv8 and B9.3.9 for ARMv7. Also doesn't seem to match.

Indeed I was wondering whether perhaps the tool chain has an issue here.
Otoh I'd expect a tool chain issue to yield consistent failures rather
than ones with just a fair probability. Unless, of course, unspecified
behavior is hit, and the hardware indeed behaves randomly in this case.
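
To make the encoding comparison above easier to repeat, here is a
minimal sketch that tests a 32-bit word against the A32 patterns for
MRS and MSR (register). The masks are my reading of the "A1" encodings
and are an assumption, not copied from the manuals: should-be-zero and
should-be-one bits are treated as fixed, and the banked-register forms
are deliberately not matched. The function name decode_status_move is
made up for this illustration.

```python
# Sketch only. Assumed A32 layouts (bits 31..0):
#   MRS Rd, CPSR/SPSR:    cond 00010 R 00 1111 Rd   0000 0000 0000
#   MSR CPSR/SPSR_x, Rn:  cond 00010 R 10 mask 1111 0000 0000 Rn
# R (bit 22) selects SPSR when set, CPSR when clear.

MRS_MASK, MRS_BITS = 0x0FBF0FFF, 0x010F0000
MSR_MASK, MSR_BITS = 0x0FB0FFF0, 0x0120F000


def decode_status_move(word):
    """Return (mnemonic, gp_register, status_register, cond) or None."""
    cond = (word >> 28) & 0xF
    status = "SPSR" if word & (1 << 22) else "CPSR"
    if (word & MRS_MASK) == MRS_BITS:
        return ("mrs", (word >> 12) & 0xF, status, cond)
    if (word & MSR_MASK) == MSR_BITS:
        return ("msr", word & 0xF, status, cond)
    return None


if __name__ == "__main__":
    # The two values quoted from the disassembly, plus two words that
    # fit the assumed patterns with cond = AL and register r11.
    for w in (0x0026A38C, 0x00262ECC, 0xE14FB000, 0xE169F00B):
        print(hex(w), decode_status_move(w))
```

Under these assumptions neither 0x0026a38c nor 0x00262ecc matches
either pattern, which agrees with the observation in the quoted text.
The two 0xe1... words are only examples of values that do fit the
assumed layout; what a given assembler actually emits for "msr SPSR,
r11" depends on which field mask it picks for a bare SPSR.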



