
RE: handle_pio looping during domain shutdown, with qemu 4.2.0 in stubdom



> -----Original Message-----
> From: Jan Beulich <jbeulich@xxxxxxxx>
> Sent: 08 June 2020 09:14
> To: 'Marek Marczykowski-Górecki' <marmarek@xxxxxxxxxxxxxxxxxxxxxx>; 
> paul@xxxxxxx
> Cc: 'Andrew Cooper' <andrew.cooper3@xxxxxxxxxx>; 'xen-devel' 
> <xen-devel@xxxxxxxxxxxxxxxxxxxx>
> Subject: Re: handle_pio looping during domain shutdown, with qemu 4.2.0 in 
> stubdom
> 
> On 05.06.2020 18:18, 'Marek Marczykowski-Górecki' wrote:
> > On Fri, Jun 05, 2020 at 04:39:56PM +0100, Paul Durrant wrote:
> >>> From: Jan Beulich <jbeulich@xxxxxxxx>
> >>> Sent: 05 June 2020 14:57
> >>>
> >>> On 05.06.2020 15:37, Paul Durrant wrote:
> >>>>> From: Jan Beulich <jbeulich@xxxxxxxx>
> >>>>> Sent: 05 June 2020 14:32
> >>>>>
> >>>>> On 05.06.2020 13:05, Paul Durrant wrote:
> >>>>>> That would mean we wouldn't be seeing the "Unexpected PIO" message.
> >>>>>> From that message this is clearly X86EMUL_UNHANDLEABLE, which suggests
> >>>>>> a race with ioreq server teardown, possibly due to selecting a server
> >>>>>> but then not finding a vcpu match in ioreq_vcpu_list.
> >>>>>
> >>>>> I was suspecting such, but at least the tearing down of all servers
> >>>>> happens only from relinquish-resources, which gets started only
> >>>>> after ->is_shut_down got set (unless the tool stack invoked
> >>>>> XEN_DOMCTL_destroydomain without having observed XEN_DOMINF_shutdown
> >>>>> set for the domain).
> >>>>>
> >>>>> For individually unregistered servers - yes, if qemu did so, this
> >>>>> would be a problem. They need to remain registered until all vCPU-s
> >>>>> in the domain got paused.
> >>>>
> >>>> It shouldn't be a problem, should it? Destroying an individual server is
> >>>> only done with the domain paused, so no vcpus can be running at the time.
> >>>
> >>> Consider the case of one getting destroyed after it has already
> >>> returned data, but the originating vCPU didn't consume that data
> >>> yet. Once that vCPU gets unpaused, handle_hvm_io_completion()
> >>> won't find the matching server anymore, and hence the chain
> >>> hvm_wait_for_io() -> hvm_io_assist() ->
> >>> vcpu_end_shutdown_deferral() would be skipped. handle_pio()
> >>> would then still correctly consume the result.
> >>
> >> True, and skipping hvm_io_assist() means the vcpu internal ioreq state
> >> will be left set to IOREQ_READY and *that* explains why we would then
> >> exit hvmemul_do_io() with X86EMUL_UNHANDLEABLE (from the first switch).
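
[Aside, to make the "first switch" concrete: it is the re-entry state check
near the top of hvmemul_do_io(). The sketch below is only a simplified model
of that check, not the verbatim code in xen/arch/x86/hvm/emulate.c, and
check_reentry() is just a name invented for the sketch.]

/* Simplified model of hvmemul_do_io()'s re-entry state check (sketch only). */
enum ioreq_state { STATE_IOREQ_NONE, STATE_IOREQ_READY,
                   STATE_IOREQ_INPROCESS, STATE_IORESP_READY };
enum x86emul_rc { X86EMUL_OKAY, X86EMUL_UNHANDLEABLE };

static enum x86emul_rc check_reentry(enum ioreq_state state)
{
    switch ( state )
    {
    case STATE_IOREQ_NONE:
        /* Nothing in flight: a fresh request can be issued. */
        return X86EMUL_OKAY;
    case STATE_IORESP_READY:
        /* A completed response is waiting: consume it and finish. */
        return X86EMUL_OKAY;
    default:
        /*
         * STATE_IOREQ_READY / STATE_IOREQ_INPROCESS: hvm_io_assist()
         * never advanced the state, so re-entry lands here and the
         * caller sees X86EMUL_UNHANDLEABLE.
         */
        return X86EMUL_UNHANDLEABLE;
    }
}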
> >
> > I can confirm X86EMUL_UNHANDLEABLE indeed comes from the first switch in
> > hvmemul_do_io(). And it happens shortly after the ioreq server is destroyed:
> >
> > (XEN) d12v0 XEN_DMOP_remote_shutdown domain 11 reason 0
> > (XEN) d12v0 domain 11 domain_shutdown vcpu_id 0 defer_shutdown 1
> > (XEN) d12v0 XEN_DMOP_remote_shutdown domain 11 done
> > (XEN) d12v0 hvm_destroy_ioreq_server called for 11, id 0
> 
> Can either of you tell why this is? As said before, qemu shouldn't
> start tearing down ioreq servers until the domain has made it out
> of all shutdown deferrals, and all its vCPU-s have been paused.
> For the moment I think the proposed changes, while necessary, will
> mask another issue elsewhere. The @releaseDomain xenstore watch,
> being the trigger I would consider relevant here, will trigger
> only once XEN_DOMINF_shutdown is reported set for a domain, which
> gets derived from d->is_shut_down (i.e. not mistakenly
> d->is_shutting_down).
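
[To make the expectation above concrete: my understanding is that a
@releaseDomain watch handler should re-check the domain state before
starting any teardown, roughly as in the sketch below. This is illustrative
only (not lifted from libxl or qemu), and domain_fully_shut_down() is just a
name invented here.]

#include <stdbool.h>
#include <xenctrl.h>

/* Sketch: confirm XEN_DOMINF_shutdown (i.e. d->is_shut_down, not merely
 * d->is_shutting_down) is reported before tearing down ioreq servers. */
static bool domain_fully_shut_down(xc_interface *xch, uint32_t domid)
{
    xc_dominfo_t info;

    if ( xc_domain_getinfo(xch, domid, 1, &info) != 1 ||
         info.domid != domid )
        return true; /* domain has already disappeared altogether */

    return info.shutdown || info.dying;
}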

I can't find anything that actually calls xendevicemodel_shutdown(). It was 
added by:

commit 1462f9ea8f4219d520a530787b80c986e050aa98
Author: Ian Jackson <ian.jackson@xxxxxxxxxxxxx>
Date:   Fri Sep 15 17:21:14 2017 +0100

    tools: libxendevicemodel: Provide xendevicemodel_shutdown

    Signed-off-by: Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx>
    Acked-by: Wei Liu <wei.liu2@xxxxxxxxxx>

Perhaps Ian can shed more light on it?
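
For reference, this is the shape of the call I'd expect such a caller to
make (a sketch based on my reading of the libxendevicemodel interface, not
tested; report_shutdown() is just a name invented here, and reason 0 matches
the XEN_DMOP_remote_shutdown lines in Marek's log above):

#include <stdio.h>
#include <xendevicemodel.h>

/* Sketch: how a device model would report a guest shutdown to Xen via
 * the XEN_DMOP_remote_shutdown op that xendevicemodel_shutdown() wraps. */
static int report_shutdown(domid_t domid)
{
    xendevicemodel_handle *dmod = xendevicemodel_open(NULL, 0);
    int rc;

    if ( !dmod )
        return -1;

    rc = xendevicemodel_shutdown(dmod, domid, 0 /* SHUTDOWN_poweroff */);
    if ( rc )
        fprintf(stderr, "xendevicemodel_shutdown: %d\n", rc);

    xendevicemodel_close(dmod);
    return rc;
}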

  Paul

> 
> Jan




 

