[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

RE: handle_pio looping during domain shutdown, with qemu 4.2.0 in stubdom



> -----Original Message-----
> From: Jan Beulich <jbeulich@xxxxxxxx>
> Sent: 05 June 2020 14:37
> To: paul@xxxxxxx
> Cc: 'Marek Marczykowski-Górecki' <marmarek@xxxxxxxxxxxxxxxxxxxxxx>; 'Andrew 
> Cooper'
> <andrew.cooper3@xxxxxxxxxx>; 'xen-devel' <xen-devel@xxxxxxxxxxxxxxxxxxxx>
> Subject: Re: handle_pio looping during domain shutdown, with qemu 4.2.0 in 
> stubdom
> 
> On 05.06.2020 13:25, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Paul Durrant <xadimgnik@xxxxxxxxx>
> >> Sent: 05 June 2020 12:06
> >> To: 'Jan Beulich' <jbeulich@xxxxxxxx>; 'Marek Marczykowski-Górecki'
> <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
> >> Cc: 'Andrew Cooper' <andrew.cooper3@xxxxxxxxxx>; 'xen-devel' 
> >> <xen-devel@xxxxxxxxxxxxxxxxxxxx>
> >> Subject: RE: handle_pio looping during domain shutdown, with qemu 4.2.0 in 
> >> stubdom
> >>
> >> Sorry, only just catching up with this...
> >>
> >>> -----Original Message-----
> >>> From: Jan Beulich <jbeulich@xxxxxxxx>
> >>> Sent: 05 June 2020 10:09
> >>> To: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
> >>> Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>; xen-devel 
> >>> <xen-devel@xxxxxxxxxxxxxxxxxxxx>; Paul
> >>> Durrant <paul@xxxxxxx>
> >>> Subject: Re: handle_pio looping during domain shutdown, with qemu 4.2.0 
> >>> in stubdom
> >>>
> >>> On 04.06.2020 16:25, Marek Marczykowski-Górecki wrote:
> >>>> On Thu, Jun 04, 2020 at 02:36:26PM +0200, Jan Beulich wrote:
> >>>>> On 04.06.2020 13:13, Andrew Cooper wrote:
> >>>>>> On 04/06/2020 08:08, Jan Beulich wrote:
> >>>>>>> On 04.06.2020 03:46, Marek Marczykowski-Górecki wrote:
> >>>>>>>> Then, we get the main issue:
> >>>>>>>>
> >>>>>>>>     (XEN) d3v0 handle_pio port 0xb004 read 0x0000
> >>>>>>>>     (XEN) d3v0 Weird PIO status 1, port 0xb004 read 0xffff
> >>>>>>>>     (XEN) domain_crash called from io.c:178
> >>>>>>>>
> >>>>>>>> Note, there was no XEN_DOMCTL_destroydomain for domain 3 nor its 
> >>>>>>>> stubdom
> >>>>>>>> yet. But XEN_DMOP_remote_shutdown for domain 3 was called already.
> >>>>>>> I'd guess an issue with the shutdown deferral logic. Did you / can
> >>>>>>> you check whether XEN_DMOP_remote_shutdown managed to pause all
> >>>>>>> CPUs (I assume it didn't, since once they're paused there shouldn't
> >>>>>>> be any I/O there anymore, and hence no I/O emulation)?
> >>>>>>
> >>>>>> The vcpu in question is talking to Qemu, so will have v->defer_shutdown
> >>>>>> intermittently set, and skip the pause in domain_shutdown()
> >>>>>>
> >>>>>> I presume this lack of pause is to allow the vcpu in question to still
> >>>>>> be scheduled to consume the IOREQ reply?  (Its fairly opaque logic with
> >>>>>> 0 clarifying details).
> >>>>>>
> >>>>>> What *should* happen is that, after consuming the reply, the vcpu 
> >>>>>> should
> >>>>>> notice and pause itself, at which point it would yield to the
> >>>>>> scheduler.  This is the purpose of 
> >>>>>> vcpu_{start,end}_shutdown_deferral().
> >>>>>>
> >>>>>> Evidentially, this is not happening.
> >>>>>
> >>>>> We can't tell yet, until ...
> >>>>>
> >>>>>> Marek: can you add a BUG() after the weird PIO printing?  That should
> >>>>>> confirm whether we're getting into handle_pio() via the
> >>>>>> handle_hvm_io_completion() path, or via the vmexit path (at which case,
> >>>>>> we're fully re-entering the guest).
> >>>>>
> >>>>> ... we know this. handle_pio() gets called from 
> >>>>> handle_hvm_io_completion()
> >>>>> after having called hvm_wait_for_io() -> hvm_io_assist() ->
> >>>>> vcpu_end_shutdown_deferral(), so the issue may be that we shouldn't call
> >>>>> handle_pio() (etc) at all anymore in this state. IOW perhaps
> >>>>> hvm_wait_for_io() should return "!sv->vcpu->domain->is_shutting_down"
> >>>>> instead of plain "true"?
> >>>>>
> >>>>> Adding Paul to Cc, as being the maintainer here.
> >>>>
> >>>> Got it, by sticking BUG() just before that domain_crash() in
> >>>> handle_pio(). And also vcpu 0 of both HVM domains do have
> >>>> v->defer_shutdown.
> >>>
> >>> As per the log they did get it set. I'd be curious of the flag's
> >>> value (as well as v->paused_for_shutdown's) at the point of the
> >>> problematic handle_pio() invocation (see below). It may be
> >>> worthwhile to instrument vcpu_check_shutdown() (inside its if())
> >>> - before exiting to guest context (in order to then come back
> >>> and call handle_pio()) the vCPU ought to be getting through
> >>> there. No indication of it doing so would be a sign that there's
> >>> a code path bypassing the call to vcpu_end_shutdown_deferral().
> >>>
> >>>> (XEN) hvm.c:1620:d6v0 All CPUs offline -- powering off.
> >>>> (XEN) d3v0 handle_pio port 0xb004 read 0x0000
> >>>> (XEN) d3v0 handle_pio port 0xb004 read 0x0000
> >>>> (XEN) d3v0 handle_pio port 0xb004 write 0x0001
> >>>> (XEN) d3v0 handle_pio port 0xb004 write 0x2001
> >>>> (XEN) d4v0 XEN_DMOP_remote_shutdown domain 3 reason 0
> >>>> (XEN) d4v0 domain 3 domain_shutdown vcpu_id 0 defer_shutdown 1
> >>>> (XEN) d4v0 XEN_DMOP_remote_shutdown domain 3 done
> >>>> (XEN) hvm.c:1620:d5v0 All CPUs offline -- powering off.
> >>>> (XEN) d1v0 handle_pio port 0xb004 read 0x0000
> >>>> (XEN) d1v0 handle_pio port 0xb004 read 0x0000
> >>>> (XEN) d1v0 handle_pio port 0xb004 write 0x0001
> >>>> (XEN) d1v0 handle_pio port 0xb004 write 0x2001
> >>>> (XEN) d2v0 XEN_DMOP_remote_shutdown domain 1 reason 0
> >>>> (XEN) d2v0 domain 1 domain_shutdown vcpu_id 0 defer_shutdown 1
> >>>> (XEN) d2v0 XEN_DMOP_remote_shutdown domain 1 done
> >>>> (XEN) grant_table.c:3702:d0v0 Grant release 0x3 ref 0x11d flags 0x2 d6
> >>>> (XEN) grant_table.c:3702:d0v0 Grant release 0x4 ref 0x11e flags 0x2 d6
> >>>> (XEN) d3v0 handle_pio port 0xb004 read 0x0000
> >>>
> >>> Perhaps in this message could you also log
> >>> v->domain->is_shutting_down, v->defer_shutdown, and
> >>> v->paused_for_shutdown? (Would be nice if, after having made
> >>> changes to your debugging patch, you could point again at the
> >>> precise version you've used for the log provided.)
> >>>
> >>>> (XEN) d3v0 Unexpected PIO status 1, port 0xb004 read 0xffff
> >>>> (XEN) Xen BUG at io.c:178
> >>>
> >>> Btw, instead of BUG(), WARN() or dump_execution_state() would
> >>> likely also do, keeping Xen alive.
> >>>
> >>
> >> A shutdown deferral problem would result in X86EMUL_RETRY wouldn't it? 
> >> That would mean we wouldn't
> be
> >> seeing the "Unexpected PIO" message. From that message this clearly 
> >> X86EMUL_UNHANDLEABLE which
> >> suggests a race with ioreq server teardown, possibly due to selecting a 
> >> server but then not finding
> a
> >> vcpu match in ioreq_vcpu_list.
> >
> > Actually looking at remote_shutdown... the test of ( reason == 
> > SHUTDOWN_crash ) and then clearing
> defer_shutdown looks a bit odd... Just because the domain shutdown code has 
> been set that way doesn't
> mean that a vcpu is not deferred in emulation; SCHEDOP_shutdown_code could 
> easily be called from one
> vcpu whilst another has emulation pending.
> 
> I'm confused: The deferral is of shutting down the domain, not of
> a specific instance of emulation.

Now I'm confused... defer_shutdown is per-vcpu.

> When a guest crashed I understand
> this code is intended to make sure shutting down because of the
> crash won't get deferred because of in-progress emulation anywhere.
> 

OK, I neglected to notice the vcpu_pause_nosync() just below so, yes, there 
should be no way we can erroneously try to complete emulation after this point.

  Paul

> Marek didn't provide any hints so far that the guest may be crashing,
> so I think if there is an issue here, it's likely only a tangential
> one anyway.
> 
> Jan




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.