
Re: [Xen-devel] [xen-unstable test] 110009: regressions - FAIL



Hi Jan,

On 09/06/17 09:19, Jan Beulich wrote:
On 07.06.17 at 10:12, <JBeulich@xxxxxxxx> wrote:
On 06.06.17 at 21:19, <sstabellini@xxxxxxxxxx> wrote:
On Tue, 6 Jun 2017, Jan Beulich wrote:
On 06.06.17 at 16:00, <ian.jackson@xxxxxxxxxxxxx> wrote:
Looking at the serial logs for that and comparing them with 10009,
it's not terribly easy to see what's going on because the kernel
versions are different and so produce different messages about xenbr0
(and I think may have a different bridge port management algorithm).

But the messages about promiscuous mode seem the same, and of course
promiscuous mode is controlled by userspace, rather than by the kernel
(so should be the same in both).
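For reference, one common way userspace flips an interface in or out of
promiscuous mode is toggling IFF_PROMISC via the SIOCSIFFLAGS ioctl; the
kernel then prints the "entered"/"left promiscuous mode" lines. A minimal
sketch (the interface name is taken from the log below, and error handling
is trimmed):

  #include <net/if.h>
  #include <string.h>
  #include <sys/ioctl.h>
  #include <sys/socket.h>
  #include <unistd.h>

  /* Toggle IFF_PROMISC on an interface; clearing it is what makes the
   * kernel log "device <ifname> left promiscuous mode". */
  static int set_promisc(const char *ifname, int on)
  {
      struct ifreq ifr;
      int fd = socket(AF_INET, SOCK_DGRAM, 0);

      if (fd < 0)
          return -1;

      memset(&ifr, 0, sizeof(ifr));
      strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);

      if (ioctl(fd, SIOCGIFFLAGS, &ifr) == 0) {
          if (on)
              ifr.ifr_flags |= IFF_PROMISC;
          else
              ifr.ifr_flags &= ~IFF_PROMISC;
          if (ioctl(fd, SIOCSIFFLAGS, &ifr) == 0) {
              close(fd);
              return 0;
          }
      }
      close(fd);
      return -1;
  }

  int main(void)
  {
      return set_promisc("vif7.0", 0) ? 1 : 0;
  }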

However, in the failed test we see extra messages about promiscuous mode:

  Jun  5 13:37:08.353656 [ 2191.652079] device vif7.0-emu left promiscuous mode
  ...
  Jun  5 13:37:08.377571 [ 2191.675298] device vif7.0 left promiscuous mode

Wouldn't those be another result of the guest shutting down /
being shut down?

Also, the qemu log for the guest in the failure case says this:

  Log-dirty command enable
  Log-dirty: no command yet.
  reset requested in cpu_handle_ioreq.

So this would seem to call for instrumentation on the qemu side
then, as the only path via which this can be initiated is - afaics -
qemu_system_reset_request(), which doesn't have very many
callers that could possibly be of interest here. Adding Stefano ...
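For context, qemu_system_reset_request() does not perform the reset itself;
it only records that one is wanted, and on Xen the flag is acted on after
guest I/O requests have been serviced, which is where the "reset requested
in cpu_handle_ioreq." line comes from. A minimal sketch of that
deferred-flag pattern, with stand-in bodies rather than verbatim qemu
source:

  #include <stdio.h>

  /* Deferred reset flag: the request function only sets it. */
  static int reset_requested;

  void qemu_system_reset_request(void)
  {
      reset_requested = 1;
  }

  /* Stand-in for running all registered device reset handlers. */
  static void qemu_system_reset(void)
  {
  }

  /* Requests are noticed after servicing guest I/O; this is where the
   * log line in the guest's qemu log is printed. */
  static void cpu_handle_ioreq(void)
  {
      /* ... handle the pending ioreq ... */
      if (reset_requested) {
          fprintf(stderr, "reset requested in cpu_handle_ioreq.\n");
          reset_requested = 0;
          qemu_system_reset();
      }
  }

  int main(void)
  {
      qemu_system_reset_request();   /* e.g. from pm_ioport_writew() */
      cpu_handle_ioreq();            /* acts on and logs the request */
      return 0;
  }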

I am pretty sure that those messages come from qemu traditional: "reset
requested in cpu_handle_ioreq" is not printed by qemu-xen.

Oh, indeed - I didn't pay attention to this being a *-qemut-*
test. I'm sorry.

In any case, the request comes from qemu_system_reset_request, which is
called by hw/acpi.c:pm_ioport_writew. It looks like the guest OS
initiated the reset (or resume)?
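For illustration, the write that lands in pm_ioport_writew() is the guest
setting SLP_EN plus a sleep type in the PM1a control register. A sketch of
the guest side, assuming a PIIX4-style PM block; the port number and
sleep-type value are illustrative, not taken from the thread:

  #include <stdint.h>

  #define PM1A_CNT_PORT 0xb004       /* assumed PIIX4-style PM1a_CNT port */
  #define SLP_EN        (1u << 13)   /* sleep enable bit */
  #define SLP_TYP(x)    ((unsigned)((x) & 7) << 10)

  /* Stand-in for the guest's port-I/O instruction. */
  static void outw(uint16_t port, uint16_t val)
  {
      (void)port;
      (void)val;
  }

  /* Writing SLP_EN plus a sleep type to PM1a_CNT is what reaches the
   * device model's PM I/O handler; depending on the decoded type it
   * ends in a shutdown or a reset/resume request. */
  static void guest_request_sleep(unsigned slp_typ)
  {
      outw(PM1A_CNT_PORT, (uint16_t)(SLP_TYP(slp_typ) | SLP_EN));
  }

  int main(void)
  {
      guest_request_sleep(1);   /* illustrative S3-style request */
      return 0;
  }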

Right, this and hw/pckbd.c look to be the only possible
sources. Yet then it's still unclear what makes the guest go
down.
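The hw/pckbd.c path is the classic i8042 reset: command 0xFE on port 0x64
pulses the CPU reset line, and qemu's keyboard-controller model turns that
into the same qemu_system_reset_request(). A minimal sketch of the guest
side (outb() is a stand-in for the real port-I/O instruction):

  #include <stdint.h>

  #define I8042_CMD_PORT    0x64   /* keyboard controller command port */
  #define I8042_PULSE_RESET 0xfe   /* pulse the CPU reset line */

  static void outb(uint16_t port, uint8_t val)
  {
      (void)port;
      (void)val;
  }

  static void guest_keyboard_reset(void)
  {
      outb(I8042_CMD_PORT, I8042_PULSE_RESET);
  }

  int main(void)
  {
      guest_keyboard_reset();
      return 0;
  }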

So with all of the above in mind I wonder whether we shouldn't
revert 933f966bcd then - that debugging code is unlikely to help
with any further analysis of the issue, as reaching that code
for a dying domain is only a symptom as far as we understand it
now, not anywhere near the cause.

Are you suggesting to revert it on Xen 4.9?

Cheers,

--
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
