[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] xc_hvm_inject_trap() races



On 11/07/2016 11:57 AM, Jan Beulich wrote:
>>>> On 07.11.16 at 10:37, <rcojocaru@xxxxxxxxxxxxxxx> wrote:
>> The problem there is that vmx_idtv_reinject() is a corner case: it
>> writes VM_ENTRY_INTR_INFO directly, and this can happen _before_ the
>> guest exits with, let's say, an EPT vm_event.
>>
>> In that case, the guest has exited, there's already an interrupt
>> pending, and while the VCPU is paused waiting for the vm_event reply we
>> request an injection that effectively obliterates the pending interrupt.
>>
>> Going the singlestep way is satisfactory for us for most cases, but it
>> still leaves the described corner case. The only fix proposal we could
>> think of is to, instead of vmx_idtv_reinject() doing the work, simply
>> set some flags, and have a later function do the actual work, in
>> vmx_intr_assist() style.
>>
>> I hope I've been able to make this clearer (and I haven't misunderstood
>> something in the process :) ).
> 
> You did, thanks. Yet I continue to remain on my earlier position: It's
> the non-architectural (injected) event which should get deferred,
> not architectural ones (either the variant you describe above, or
> any which hypervisor processing found necessary to deliver). The
> single step case can be viewed as an exception here, albeit ideally
> it also wouldn't need to be special cased.

But the problem is that if we defer, say a #PF, by the time there's a VM
entry with no pending interrupt we may get the wrong context (wrong
address space, guest mode, etc.).

The simple solution to this would be to save the context (for a #PF this
would additionally mean CR3 as well), and only inject at the first
opportunity where the context matches.

However, this gets ugly quickly: for one, the right context for a #PF
may occur on a different VCPU, so right off the bat we can't hold this
information per-VCPU anymore (it would need to be per-domain).

Second, it is conceivable that there will be several injection requests
not yet delivered, and in that case we need to save the context for all
of them (so an array or some sort of container of contexts becomes
necessary).

And third, there's no guarantee that the guest will ever reach the
proper context for injecting any of the deferred interrupts for the
duration of it's life, which brings us back to the problem we were
trying to solve: we still can't guarantee trap delivery, but now in a
much more complicated manner. :)

We haven't even discussed what to do if a new request comes when the
context buffer is full (we'd need to lose an undelivered trap), or that
different trap types may require different contexts and handling logic.

At this point it looks like the best solution is to simply discard the
non-architectural event if there's a pending architectural one, and add
a new vm_event, as suggested by Tamas, that notifies the application
that a trap has failed, or succeeded, and let it do the best it can with
that information.


Thanks,
Razvan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.