[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] VM_EVENT_FLAG_SET_REGISTERS and sync_vcpu_execstate()



On 08/09/2016 12:41 PM, Tim Deegan wrote:
> At 19:29 +0300 on 08 Aug (1470684579), Razvan Cojocaru wrote:
>> On 08/08/16 18:52, Tamas K Lengyel wrote:
>>>> I think the issue might be that vm_event_vcpu_pause() uses
>>>>> vcpu_pause_nosync(), and it's being used everywhere we pause the VCPU in
>>>>> the course of sending out a vm_event.
>>>>>
>>>>> So this creates the premise for a race condition: either some more
>>>>> things happen between sending out the vm_event and replying to it that
>>>>> cause v->arch.user_regs to be synchronized - in which case everything
>>>>> works (this has been the case when I was reading the VCPU context via a
>>>>> domctl that did vcpu_pause()) -, or not, in which case all bets are off.
>>>>>
>>>>> In this case, we should either drop vm_event_vcpu_pause() completely and
>>>>> just use vcpu_pause() everywhere, modify it to use vcpu_sleep_sync()
>>>>> (and basically turn it into vcpu_pause()), or only sync in
>>>>> vm_event_set_registers() as I've suggested.
>>>>>
>>> I would say just using vcpu_pause() would make sense as it's the least
>>> complicated route. Is there any downside of doing the sync() in all
>>> cases? Was the current way implemented perhaps the way it is for
>>> performance reasons? If so, is it noticeable at all?
>>
>> I was hoping you'd know why it's implemented like this :), I think maybe
>> Tim Deegan (CCd, hopefully the email address is not out of date) did the
>> original implementation?
> 
> Wasn't me!  I'm missing some context here but it looks like
> vm_event_vcpu_pause() is always called on current, which means the
> vcpu:
>  - is "running", i.e. scheduled on this pcpu, so vcpu_pause_sync()
>    would deadlock in vcpu_sleep_sync() (well, it would ASSERT first).
>  - is not in guest mode, so any VMCx state should have been synced
>    onto the local stack at the last vmexit.
> 
> If your vm_event response hypercall needs access to remote vcpu state,
> then you should call vcpu_pause_sync() on _that_ path, and
> vcpu_unpause() when you're done.  If you can _guarantee_ (even with
> buggy/malicious tools) that the target vcpu is already paused and
> nothing can unpause it underfoot, then just calling vcpu_sleep_sync()
> before accessing the state is enough.
> 
> The *_sync() functions are dangerous - you can't ever let a domain
> call them on its own vcpus, or have two domains that can call them on
> each other, without some other interlock to stop two vcpus deadlocking
> waiting for each other to be descheduled.

Thanks for the reply! Indeed, those things have all hit me until I got
to the patch I've just submitted a couple of minutes ago. :)


Thanks again,
Razvan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.