
Re: [xen-4.15-testing test] 168970: regressions - FAIL


  • To: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Wed, 30 Mar 2022 09:32:15 +0200
  • Cc: osstest service owner <osstest-admin@xxxxxxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Wed, 30 Mar 2022 07:32:41 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 29.03.2022 20:06, osstest service owner wrote:
> flight 168970 xen-4.15-testing real [real]
> flight 168989 xen-4.15-testing real-retest [real]
> http://logs.test-lab.xenproject.org/osstest/logs/168970/
> http://logs.test-lab.xenproject.org/osstest/logs/168989/
> 
> Regressions :-(
> 
> Tests which did not succeed and are blocking,
> including tests which could not be run:
>  test-amd64-i386-livepatch    13 livepatch-run            fail REGR. vs. 168502
>  test-amd64-amd64-livepatch   13 livepatch-run            fail REGR. vs. 168502

Looks like it's more than just the one commit you put on top of
the original batch. The log has:

Mar 29 08:02:17.743419 (XEN) livepatch.c:1578: livepatch: xen_nop: timeout is 30000000ns
Mar 29 08:02:17.743442 (XEN) livepatch.c:1690: livepatch: xen_nop: CPU44 - IPIing the other 55 CPUs
Mar 29 08:02:17.755416 (XEN) livepatch: xen_nop: Applying 1 functions
Mar 29 08:02:17.755436 (XEN) livepatch: xen_nop finished APPLY with rc=0
Mar 29 08:02:17.767371 (XEN) *** DOUBLE FAULT ***
Mar 29 08:02:18.031400 (XEN) *** DOUBLE FAULT ***
Mar 29 08:02:18.031417 (XEN) *** DOUBLE FAULT ***
Mar 29 08:02:18.031427 (XEN) *** DOUBLE FAULT ***
...

Clearly not very helpful that the double fault handler itself hits #DF
again before it can print anything useful. With the first printk()
completing but print_xen_info()'s output not showing up, I have some
trouble guessing where things might hit that nested #DF ...

Actually, xen_nop fiddles with xen_minor_version(), which print_xen_info()
calls. The comment in xen_nop.c about relying on the function being built
a certain way doesn't look very promising. Another comment referring to
"req" when likely "ret" is meant also doesn't help clarity. Since the
ENDBR is skipped while applying patches, the assumption is clearly
violated. AIUI this will lead to the RET being overwritten with NOP. And
this issue clearly exists only in the stable trees, as the function
wouldn't have ENDBR in staging/master.
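
To illustrate the suspected failure mode, here is a minimal toy model in C.
It is not the Xen livepatch code: the byte layout of the function (ENDBR64,
a 5-byte "mov $imm32, %eax", then RET), the immediate value, the padding
bytes, and the helper names toy_apply_nop() and ret_survives() are all
assumptions made up for the sketch. It only shows how a NOP patch sized as
"everything but the last instruction" ends up covering the RET once the
write position is moved past a 4-byte ENDBR64 without the length being
reduced to match.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ENDBR64_LEN 4
#define NOP_BYTE    0x90
#define RET_BYTE    0xc3
#define FUNC_SIZE   10   /* assumed: endbr64 (4) + mov (5) + ret (1) */

/* Assumed layout of the function to be patched, followed by padding. */
static const uint8_t func_template[16] = {
    0xf3, 0x0f, 0x1e, 0xfa,             /* endbr64                    */
    0xb8, 0x0f, 0x00, 0x00, 0x00,       /* mov $15, %eax (made-up imm) */
    0xc3,                               /* ret                        */
    0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc  /* padding after the function */
};

static int starts_with_endbr64(const uint8_t *p)
{
    static const uint8_t endbr64[ENDBR64_LEN] = { 0xf3, 0x0f, 0x1e, 0xfa };

    return memcmp(p, endbr64, ENDBR64_LEN) == 0;
}

/*
 * Hypothetical patch routine: writes nop_len NOP bytes. When skip_endbr
 * is set and the function starts with ENDBR64, the write position moves
 * past it, but nop_len is deliberately left unchanged.
 */
static void toy_apply_nop(uint8_t *func, size_t nop_len, int skip_endbr)
{
    uint8_t *dst = func;

    if (skip_endbr && starts_with_endbr64(func))
        dst += ENDBR64_LEN;

    memset(dst, NOP_BYTE, nop_len);
}

/* Returns 1 if the RET byte is still in place after patching, else 0. */
static int ret_survives(int skip_endbr)
{
    uint8_t buf[sizeof(func_template)];

    memcpy(buf, func_template, sizeof(buf));
    /* Patch "everything but the last instruction", i.e. size - 1 bytes. */
    toy_apply_nop(buf, FUNC_SIZE - 1, skip_endbr);

    return buf[FUNC_SIZE - 1] == RET_BYTE;
}
```

In this model ret_survives(0) returns 1 (the patch covers offsets 0-8 and
the RET at offset 9 is untouched), while ret_survives(1) returns 0 (the
patch covers offsets 4-12, so the RET and three bytes past the function are
turned into NOPs). Whether the real code behaves exactly like this would
need checking against the actual disassembly of xen_minor_version().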

Jan




 

