[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH V2] x86/altp2m: Fixed crash with INVALID_ALTP2M EPTP index
On 06/26/2018 02:56 PM, Jan Beulich wrote: >>>> On 26.06.18 at 12:55, <rcojocaru@xxxxxxxxxxxxxxx> wrote: >> On 06/26/2018 01:26 PM, Jan Beulich wrote: >>>>>> On 25.06.18 at 16:10, <rcojocaru@xxxxxxxxxxxxxxx> wrote: >>>> --- a/xen/arch/x86/mm/altp2m.c >>>> +++ b/xen/arch/x86/mm/altp2m.c >>>> @@ -58,8 +58,8 @@ altp2m_vcpu_destroy(struct vcpu *v) >>>> >>>> altp2m_vcpu_reset(v); >>>> >>>> - altp2m_vcpu_update_p2m(v); >>>> altp2m_vcpu_update_vmfunc_ve(v); >>>> + altp2m_vcpu_update_p2m(v); >>> >>> I agree this addresses this specific incarnation of the problem. However, >>> if the vCPU indeed runs while being manipulated, I don't think you get >>> rid of the race this way. For one, there is e.g. a solitary call to >>> altp2m_vcpu_update_vmfunc_ve() in the handling of >>> HVMOP_altp2m_vcpu_enable_notify. That'll lead to >>> SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS getting set, but >>> EPTP_INDEX won't be written. Whatever previous value is in place >>> would then be read back in VM exit handling. >>> >>> With that it also looks to me as if the two step (and hence non-atomic >>> from the perspective of the guest) update is a problem. Even with the >>> change above, the VM exit may now happen exactly between the two >>> function calls. >>> >>> It seems to me that pausing the vCPU is almost unavoidable (and then >>> the ordering of the two calls is relevant only because >>> vmx_vcpu_update_eptp() would better respect the intended new >>> setting of SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS). >> >> I see. Would not then your scenario be covered by >> altp2m_vcpu_update_vmfunc_ve() always calling altp2m_vcpu_update_p2m() >> at the very end (and removing duplicate calls to altp2m_vcpu_update_p2m())? > > No - there would then still be a window of time where the bit is set but > the index was not (yet) written. > >> Unless something is very wrong, all calls to >> altp2m_vcpu_update_vmfunc_ve() _should_ happen within a pause, so no >> calls to the VM exit handler should occur between them if they become a >> single block of logic within altp2m_vcpu_update_vmfunc_ve(). > > Did I misunderstand your analysis mails then? It looked to me as if > you were observing exactly such races, because of the vCPU not > being paused. No, you've understood my initial email perfectly, but I had unfortunately misread the log. I've attempted to clarify things with the follow-up (which clearly has not been eloquent) (the "apologies, I seem to have misread the crash timing" email). I've misread my log because altp2m_vcpu_update_vmfunc_ve() was called the first time on handling HVMOP_altp2m_vcpu_enable_notify, and a second time on altp2m_vcpu_destroy() - and looking at it I've first thought that the call leading to the crash happened in altp2m_vcpu_destroy() (which would have indeed implied a nasty race with the VM exit handler). But then I've looked closer and found that the crash was happening soon after the HVMOP_altp2m_vcpu_enable_notify handling, in the second run of the test application (the first run was writing INVALID_ALTP2M in EPTP_INDEX on altp2m_vcpu_destroy()). The VCPU pausing seem to work as expected. Sorry for the misunderstanding. >>> I also think that you'd better Cc the VMX maintainers here, even if >>> the fix itself is outside the code their maintainership covers. >> >> Sorry, I don't follow. The MAINTAINTERS file lists only Jun Nakajima and >> Kevin Tian under "INTEL(R) VT FOR X86 (VT-X)" (where >> xen/arch/x86/hvm/vmx/ and xen/include/asm-x86/hvm/vmx/ live), and they >> are both CCd here. What am I missing? > > Your patch submission didn't have them on Cc, I've added them when > replying. Understood, thanks. Thanks, Razvan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |