
Re: handle_pio looping during domain shutdown, with qemu 4.2.0 in stubdom



On 08.06.20 11:15, Paul Durrant wrote:
-----Original Message-----
From: Jan Beulich <jbeulich@xxxxxxxx>
Sent: 08 June 2020 09:14
To: 'Marek Marczykowski-Górecki' <marmarek@xxxxxxxxxxxxxxxxxxxxxx>; paul@xxxxxxx
Cc: 'Andrew Cooper' <andrew.cooper3@xxxxxxxxxx>; 'xen-devel' <xen-devel@xxxxxxxxxxxxxxxxxxxx>
Subject: Re: handle_pio looping during domain shutdown, with qemu 4.2.0 in stubdom

On 05.06.2020 18:18, 'Marek Marczykowski-Górecki' wrote:
On Fri, Jun 05, 2020 at 04:39:56PM +0100, Paul Durrant wrote:
From: Jan Beulich <jbeulich@xxxxxxxx>
Sent: 05 June 2020 14:57

On 05.06.2020 15:37, Paul Durrant wrote:
From: Jan Beulich <jbeulich@xxxxxxxx>
Sent: 05 June 2020 14:32

On 05.06.2020 13:05, Paul Durrant wrote:
That would mean we wouldn't be seeing the "Unexpected PIO" message. From that
message this is clearly X86EMUL_UNHANDLEABLE, which suggests a race with ioreq
server teardown, possibly due to selecting a server but then not finding a
vcpu match in ioreq_vcpu_list.
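
For reference, the lookup in question is roughly the below; this is a
paraphrase from memory of hvm_send_ioreq() in xen/arch/x86/hvm/ioreq.c of
that era, so names and details may differ from the actual source:

    /* Sketch of hvm_send_ioreq(), not the verbatim source. */
    int hvm_send_ioreq(struct hvm_ioreq_server *s, ioreq_t *proto_p,
                       bool buffered)
    {
        struct vcpu *curr = current;
        struct hvm_ioreq_vcpu *sv;

        /* (buffered-ioreq path omitted) */

        /* Walk the server's per-vCPU list looking for the current vCPU. */
        list_for_each_entry ( sv, &s->ioreq_vcpu_list, list_entry )
        {
            if ( sv->vcpu == curr )
            {
                /*
                 * Match found: copy *proto_p into the shared ioreq page,
                 * mark it STATE_IOREQ_READY, notify the emulator's event
                 * channel, and let the caller retry.
                 */
                return X86EMUL_RETRY;
            }
        }

        /* No vCPU match, e.g. the server is in the middle of teardown. */
        return X86EMUL_UNHANDLEABLE;
    }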

I was suspecting such, but at least the tearing down of all servers
happens only from relinquish-resources, which gets started only
after ->is_shut_down got set (unless the tool stack invoked
XEN_DOMCTL_destroydomain without having observed XEN_DOMINF_shutdown
set for the domain).

For individually unregistered servers - yes, if qemu did so, this
would be a problem. They need to remain registered until all vCPU-s
in the domain got paused.

It shouldn't be a problem, should it? Destroying an individual server is only
done with the domain paused, so no vcpus can be running at the time.

Consider the case of one getting destroyed after it has already
returned data, but the originating vCPU didn't consume that data
yet. Once that vCPU gets unpaused, handle_hvm_io_completion()
won't find the matching server anymore, and hence the chain
hvm_wait_for_io() -> hvm_io_assist() ->
vcpu_end_shutdown_deferral() would be skipped. handle_pio()
would then still correctly consume the result.
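
For reference, the chain in question looks roughly like this; paraphrased
from memory of handle_hvm_io_completion() in xen/arch/x86/hvm/ioreq.c around
Xen 4.13, so details may differ:

    /* Sketch of handle_hvm_io_completion(), not the verbatim source. */
    bool handle_hvm_io_completion(struct vcpu *v)
    {
        struct domain *d = v->domain;
        struct hvm_ioreq_server *s;
        unsigned int id;

        FOR_EACH_IOREQ_SERVER(d, id, s)    /* a destroyed server is skipped */
        {
            struct hvm_ioreq_vcpu *sv;

            list_for_each_entry ( sv, &s->ioreq_vcpu_list, list_entry )
            {
                if ( sv->vcpu == v && sv->pending )
                {
                    /*
                     * hvm_wait_for_io() -> hvm_io_assist() is what moves the
                     * vCPU's io_req.state back to STATE_IOREQ_NONE and calls
                     * vcpu_end_shutdown_deferral(); none of that happens if
                     * the server has already been destroyed.
                     */
                    if ( !hvm_wait_for_io(sv, get_ioreq(s, v)) )
                        return false;

                    break;
                }
            }
        }

        return true;    /* (remaining completion handling omitted) */
    }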

True, and skipping hvm_io_assist() means the vcpu's internal ioreq state will
be left set to IOREQ_READY, and *that* explains why we would then exit
hvmemul_do_io() with X86EMUL_UNHANDLEABLE (from the first switch).
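
That first switch is roughly the following, paraphrased from memory of
hvmemul_do_io() in xen/arch/x86/hvm/emulate.c; the real code also re-verifies
the re-issued request and may crash the domain on a mismatch:

    /* Sketch of the first switch in hvmemul_do_io(), not verbatim. */
    switch ( vio->io_req.state )
    {
    case STATE_IOREQ_NONE:
        break;                          /* fresh request, carry on */

    case STATE_IORESP_READY:
        /* A response is waiting: consume it for the re-issued request. */
        vio->io_req.state = STATE_IOREQ_NONE;
        p = vio->io_req;
        break;

    default:
        /*
         * STATE_IOREQ_READY (or _INPROCESS) left behind because
         * hvm_io_assist() never ran: the emulation is abandoned.
         */
        return X86EMUL_UNHANDLEABLE;
    }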

I can confirm X86EMUL_UNHANDLEABLE indeed comes from the first switch in
hvmemul_do_io(). And it happens shortly after the ioreq server is destroyed:

(XEN) d12v0 XEN_DMOP_remote_shutdown domain 11 reason 0
(XEN) d12v0 domain 11 domain_shutdown vcpu_id 0 defer_shutdown 1
(XEN) d12v0 XEN_DMOP_remote_shutdown domain 11 done
(XEN) d12v0 hvm_destroy_ioreq_server called for 11, id 0

Can either of you tell why this is? As said before, qemu shouldn't
start tearing down ioreq servers until the domain has made it out
of all shutdown deferrals, and all its vCPU-s have been paused.
For the moment I think the proposed changes, while necessary, will
mask another issue elsewhere. The @releaseDomain xenstore watch,
being the trigger I would consider relevant here, will trigger
only once XEN_DOMINF_shutdown is reported set for a domain, which
gets derived from d->is_shut_down (i.e. not mistakenly
d->is_shutting_down).
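
For reference, that derivation is roughly the following, paraphrased from
memory of getdomaininfo() in xen/common/domctl.c:

    /* Sketch: only the fully shut down state is reported to the tool stack. */
    info->flags |= d->is_shut_down ? XEN_DOMINF_shutdown : 0;
    /*
     * d->is_shutting_down (deferral still pending) is deliberately not what
     * gets reported, so @releaseDomain should not fire early because of it.
     */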

I can't find anything that actually calls xendevicemodel_shutdown(). It was
added by:

destroy_hvm_domain() in qemu does.
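
For reference, that path in qemu's hw/i386/xen/xen-hvm.c looks roughly like
this; paraphrased from memory of qemu 4.2, so details may differ:

    /* Sketch of destroy_hvm_domain(), not the verbatim qemu source. */
    void destroy_hvm_domain(bool reboot)
    {
        unsigned int reason = reboot ? SHUTDOWN_reboot : SHUTDOWN_poweroff;
        int rc;

        if (xen_dmod) {
            /* Preferred path: ask Xen to shut the guest down via the dm op. */
            rc = xendevicemodel_shutdown(xen_dmod, xen_domid, reason);
            if (!rc) {
                return;
            }
            /* An older Xen without the op reports ENOTTY; fall back below. */
        }

        /* Fallback: xc_interface_open() + xc_domain_shutdown() (omitted). */
    }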


Juergen



 

