
Re: [PATCH V3 15/23] xen/arm: Stick around in leave_hypervisor_to_guest until I/O has completed



On Wed, 9 Dec 2020, Julien Grall wrote:
> On 09/12/2020 23:35, Stefano Stabellini wrote:
> > On Wed, 9 Dec 2020, Stefano Stabellini wrote:
> > > On Mon, 30 Nov 2020, Oleksandr Tyshchenko wrote:
> > > > From: Oleksandr Tyshchenko <oleksandr_tyshchenko@xxxxxxxx>
> > > > 
> > > > This patch adds proper handling of return value of
> > > > vcpu_ioreq_handle_completion() which involves using a loop
> > > > in leave_hypervisor_to_guest().
> > > > 
> > > > The reason for using an unbounded loop here is that a vCPU
> > > > shouldn't continue until its I/O has completed. In Xen's case, if an
> > > > I/O never completes then it most likely means that something went
> > > > horribly wrong with the Device Emulator, and it is most likely not
> > > > safe to continue. So letting the vCPU spin forever if the I/O never
> > > > completes is safer than letting it continue and leaving the guest in
> > > > an unclear state, and is the best we can do for now.
> > > > 
> > > > This wouldn't be an issue for Xen itself, as do_softirq() would be
> > > > called on every iteration of the loop. In case of failure, the guest
> > > > will crash and the vCPU will be unscheduled.
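For reference, the shape of the loop being described is roughly the following
(a sketch only; check_for_vcpu_work() is the helper mentioned further down,
and the rest of the exit path is elided):

    void leave_hypervisor_to_guest(void)
    {
        /*
         * Keep retrying while the vCPU still has outstanding work, e.g.
         * an ioreq whose completion vcpu_ioreq_handle_completion() has
         * not yet observed.  do_softirq() runs on every iteration, so
         * Xen can still deschedule the vCPU or tear the domain down if
         * the Device Emulator never answers.
         */
        while ( check_for_vcpu_work() )
            do_softirq();

        /* ... remainder of the usual exit-to-guest path ... */
    }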
> > > 
> > > Imagine that we have two guests: one that requires an ioreq server and
> > > one that doesn't. If I am not mistaken, this loop could potentially spin
> > > forever on a pCPU, thus preventing any other guest from being scheduled,
> > > even if the other guest doesn't need any ioreq servers.
> > > 
> > > 
> > > My other concern is that we are busy-looping. Could we call something
> > > like wfi() or do_idle() instead? The ioreq server's completion event
> > > notification should wake us up, shouldn't it?
> > > 
> > > Following this line of thinking, I am wondering if, instead of the
> > > busy-loop, we should call vcpu_block_unless_event_pending(current) in
> > > try_handle_mmio when it returns IO_RETRY. Then, when the emulation is
> > > done, QEMU (or equivalent) calls xenevtchn_notify, which ends up waking
> > > up the domU vcpu. Would that work?
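Spelled out, that suggestion would amount to something like the following
(purely illustrative; the call shape of try_handle_mmio() is approximate):

    enum io_state rc = try_handle_mmio(regs, hsr);

    if ( rc == IO_RETRY )
        /*
         * Block the vCPU right away; the emulator's completion
         * notification (xenevtchn_notify() from QEMU or equivalent)
         * makes an event pending and wakes the vCPU back up.
         */
        vcpu_block_unless_event_pending(current);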
> > 
> > I have now read Julien's reply: we are already doing something similar to
> > what I suggested with the following call chain:
> > 
> > check_for_vcpu_work -> vcpu_ioreq_handle_completion -> wait_for_io ->
> > wait_on_xen_event_channel
> > 
> > So the busy-loop here is only a safety-belt in case of a spurious
> > wake-up, in which case we are going to call check_for_vcpu_work again,
> > potentially causing a guest reschedule.
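Condensed, the wait_for_io() step of that chain does roughly the following (a
hypothetical, stripped-down version; the real code also tracks state
transitions and uses memory barriers):

    /*
     * Hypothetical condensation of wait_for_io(): 'p' is the ioreq slot
     * in the page shared with the emulator, 'port' its event channel.
     */
    static bool io_completed(ioreq_t *p, evtchn_port_t port)
    {
        switch ( p->state )
        {
        case STATE_IORESP_READY:            /* the emulator has answered */
            p->state = STATE_IOREQ_NONE;
            return true;                    /* vCPU may resume */

        case STATE_IOREQ_READY:
        case STATE_IOREQ_INPROCESS:
            /*
             * Mark the vCPU blocked until the emulator notifies the
             * event channel.  The actual deschedule happens once softirqs
             * are processed, i.e. back in the loop in
             * leave_hypervisor_to_guest().
             */
            wait_on_xen_event_channel(port, p->state == STATE_IORESP_READY);
            return false;                   /* still pending, check again */

        default:
            return false;                   /* unexpected state */
        }
    }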
> > 
> > Then, this is fine and addresses both my concerns. Maybe let's add a note
> > in the commit message about it.
> 
> Damn, I hit the "send" button just a second before seeing your reply. :/ Oh
> well. I suggested the same because I have seen the same question come up
> multiple times.

:-)

 
> > I am also wondering if there is any benefit in calling wait_for_io()
> > earlier, maybe from try_handle_mmio if IO_RETRY?
> 
> wait_for_io() may end up descheduling the vCPU. I would like to avoid this
> happening in the middle of the I/O emulation because it needs to happen
> with no locks held at all.
> 
> I don't think there are locks involved today, but the deeper in the call
> stack the scheduling happens, the higher the chance that we screw something
> up in the future.
> 
> However...
> 
> > leave_hypervisor_to_guest is very late for that.
> 
> ... I am not sure what the problem with that is. The IOREQ server will be
> notified of the pending I/O as soon as try_handle_mmio() puts the I/O in
> the shared page.
> 
> If the IOREQ server is running on a different pCPU, then it is possible
> that the I/O has already completed before we reach
> leave_hypervisor_to_guest(). In that case, we would not have to wait for
> the I/O.

Yeah, I was thinking about that too. Actually, it could be faster this way
if we end up being lucky.
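Indeed, by the time we reach leave_hypervisor_to_guest() the request has long
been visible to the emulator; the forwarding side looks roughly like this
(get_ioreq() and the gpa/is_write/port values are stand-ins for the real
helpers and for values computed earlier in the data-abort path):

    /* Sketch of forwarding an MMIO access to the ioreq server. */
    ioreq_t *p = get_ioreq(current);        /* slot in the shared page */

    p->addr = gpa;
    p->dir  = is_write ? IOREQ_WRITE : IOREQ_READ;
    /* ... size, data, etc. ... */
    smp_wmb();                              /* payload visible before state */
    p->state = STATE_IOREQ_READY;

    notify_via_xen_event_channel(current->domain, port);

    return IO_RETRY;    /* completion picked up later, possibly already done */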

The reason for moving it earlier would be that, by the time
leave_hypervisor_to_guest is called, "Xen" has already decided to
continue running this particular vcpu. If we called wait_for_io()
earlier, we would give important information to the scheduler before any
decision is made. This is more "philosophical" than practical, though.
Let's leave it as is.



 

