
RE: handle_pio looping during domain shutdown, with qemu 4.2.0 in stubdom



> -----Original Message-----
> From: Xen-devel <xen-devel-bounces@xxxxxxxxxxxxxxxxxxxx> On Behalf Of Jan 
> Beulich
> Sent: 05 June 2020 15:00
> To: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
> Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
> Subject: Re: handle_pio looping during domain shutdown, with qemu 4.2.0 in 
> stubdom
> 
> On 05.06.2020 13:18, Marek Marczykowski-Górecki wrote:
> > On Fri, Jun 05, 2020 at 11:38:17AM +0200, Jan Beulich wrote:
> >> On 04.06.2020 03:46, Marek Marczykowski-Górecki wrote:
> >>> Hi,
> >>>
> >>> (continuation of a thread from #xendevel)
> >>>
> >>> During system shutdown I quite often hit an infinite stream of errors
> >>> like this:
> >>>
> >>>     (XEN) d3v0 Weird PIO status 1, port 0xb004 read 0xffff
> >>>     (XEN) domain_crash called from io.c:178
> >>>
> >>> This is all running on Xen 4.13.0 (I think I've got this with 4.13.1
> >>> too), nested within KVM. The KVM part means everything is very slow, so
> >>> various race conditions are much more likely to happen.
> >>>
> >>> It started happening not long ago, and I'm pretty sure it's related to
> >>> updating to qemu 4.2.0 (in the Linux stubdom); the previous version was
> >>> 3.0.0.
> >>>
> >>> Thanks to Andrew and Roger, I've managed to collect more info.
> >>>
> >>> Context:
> >>>     dom0: pv
> >>>     dom1: hvm
> >>>     dom2: stubdom for dom1
> >>>     dom3: hvm
> >>>     dom4: stubdom for dom3
> >>>     dom5: pvh
> >>>     dom6: pvh
> >>>
> >>> It starts off OK, I think:
> >>>
> >>>     (XEN) hvm.c:1620:d6v0 All CPUs offline -- powering off.
> >>>     (XEN) d3v0 handle_pio port 0xb004 read 0x0000
> >>>     (XEN) d3v0 handle_pio port 0xb004 read 0x0000
> >>>     (XEN) d3v0 handle_pio port 0xb004 write 0x0001
> >>>     (XEN) d3v0 handle_pio port 0xb004 write 0x2001
> >>>     (XEN) d4v0 XEN_DMOP_remote_shutdown domain 3 reason 0
> >>
> >> I can't seem to spot the call site of this in any of
> >> qemu, libxl, or libxc. I'm particularly curious as to the further
> >> actions taken on the domain after this was invoked: Do any ioreq
> >> servers get unregistered immediately (which I think would be a
> >> problem)?
> >
> > It is here:
> > https://github.com/qemu/qemu/blob/master/hw/i386/xen/xen-hvm.c#L1539
> >
> > I think it's called from cpu_handle_ioreq(), and I think the request
> > state is set to STATE_IORESP_READY before exiting (unless there is some
> > exit() hidden in another function used there).
> 
> Thanks. There's nothing in the surrounding code there that would unregister
> an ioreq server. But as said elsewhere, I don't know qemu very well,
> and hence I may easily be overlooking how else one may get unregistered
> prematurely.
> 

See 
https://git.qemu.org/?p=qemu.git;a=commit;h=ba7fdd64b6714af7e42dfbe5969caf62c0823f75

This makes sure the ioreq server is destroyed in the exit notifier (called
when the QEMU process is killed).

  Paul

> Jan
