[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 3/3] x86/spec-ctrl: Fix NMI race condition with VT-x MSR_SPEC_CTRL handling


  • To: Andrew Cooper <amc96@xxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Mon, 17 Jan 2022 10:21:25 +0100
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=41x0D36Ab8/5Xyue/wcP6Z2ttVGUR5sCaCBD/lWC4PE=; b=E6yZ5buD9EhJnVULp3ig97bHC+zPRl/3N2GMdUXnNWSqnPeVj6CCaT7VnCPn02yLn/XXJcveki3n7YGycV7netU1qsx0wY/ItvhQFfYOglvmneUj6mqNM9Uha5k6vvClJudgxYBQkz1imkShO+1u0z+vx4eIpyv1NJ9zH5tK1rMsbNJgcYv+zYBDtJGxTbkzOH1Xkd/mXd0ucEEZRz8Jtfm+BZ9jylsoEorHLkxzJNPWUH6oY6dmU9t8ssUZMEHa2maSC+jDnKBemnFoEGgs+S251jYE4d1BoA5MgLkrqxgI5huNqsAZpVylgVg5LXHrHbBKxvFUi3wo3w6wNYIekA==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=k5XZ1y4CklqnhRGdrLRmYVaWZfevu91IA/U3xI2wOKbkxkUSt7M08zO4ovwNSvuMbNJNWFzfiD0zCZyTKMoIDLNBE3QSsVY1IUtZEFqtoPgn8eGRX3K/8HlvIz0dEsrfmOqIX5pRU2l3PYq7aQBSvB7U9I4XE05ySDt6Zxt0gtDFLJKdz4GLNwltL9mDZghizpSG5B6qaNJacclOS75qfZtKYE4ixReKtbc7Yp9ESEbTM/b2medTituBlkUSLTQe5CaueCddkbxgh90Cx891nbfqcsQqAk/Af5A3UYkbMSyPzupJKmkgw4fXvwjsCV5x83tIhEW0G3oxfutmz8vGtQ==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=suse.com;
  • Cc: Roger Pau Monné <roger.pau@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Jun Nakajima <jun.nakajima@xxxxxxxxx>, Kevin Tian <kevin.tian@xxxxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Mon, 17 Jan 2022 09:21:42 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 14.01.2022 15:41, Andrew Cooper wrote:
> On 14/01/2022 14:14, Andrew Cooper wrote:
>> On 13/01/2022 16:38, Andrew Cooper wrote:
>>> The logic was based on a mistaken understanding of how NMI blocking on 
>>> vmexit
>>> works.  NMIs are only blocked for EXIT_REASON_NMI, and not for general 
>>> exits.
>>> Therefore, an NMI can in general hit early in the vmx_asm_vmexit_handler 
>>> path,
>>> and the guest's value will be clobbered before it is saved.
>>>
>>> Switch to using MSR load/save lists.  This causes the guest value to be 
>>> saved
>>> atomically with respect to NMIs/MCEs/etc.
>>>
>>> First, update vmx_cpuid_policy_changed() to configure the load/save lists at
>>> the same time as configuring the intercepts.  This function is always used 
>>> in
>>> remote context, so extend the vmx_vmcs_{enter,exit}() block to cover the 
>>> whole
>>> function, rather than having multiple remote acquisitions of the same VMCS.
>>>
>>> vmx_add_guest_msr() can fail, but only in ways which are entirely fatal to 
>>> the
>>> guest, so handle failures using domain_crash().  vmx_del_msr() can fail with
>>> -ESRCH but we don't matter, and this path will be taken during domain create
>>> on a system lacking IBRSB.
>>>
>>> Second, update vmx_msr_{read,write}_intercept() to use the load/save lists
>>> rather than vcpu_msrs, and update the comment to describe the new state
>>> location.
>>>
>>> Finally, adjust the entry/exit asm.  Drop DO_SPEC_CTRL_ENTRY_FROM_HVM
>>> entirely, and use a shorter code sequence to simply reload Xen's setting 
>>> from
>>> the top-of-stack block.
>>>
>>> Because the guest values are loaded/saved atomically, we do not need to use
>>> the shadowing logic to cope with late NMIs/etc, so we can omit
>>> DO_SPEC_CTRL_EXIT_TO_GUEST entirely and VMRESUME/VMLAUNCH with Xen's value 
>>> in
>>> context.  Furthermore, we can drop the SPEC_CTRL_ENTRY_FROM_PV too, as 
>>> there's
>>> no need to switch back to Xen's context on an early failure.
>>>
>>> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>>> ---
>>> CC: Jan Beulich <JBeulich@xxxxxxxx>
>>> CC: Roger Pau Monné <roger.pau@xxxxxxxxxx>
>>> CC: Wei Liu <wl@xxxxxxx>
>>> CC: Jun Nakajima <jun.nakajima@xxxxxxxxx>
>>> CC: Kevin Tian <kevin.tian@xxxxxxxxx>
>>>
>>> Needs backporting as far as people can tolerate.
>>>
>>> If the entry/exit logic were in C, I'd ASSERT() that shadow tracking is off,
>>> but this is awkard to arrange in asm.
>> Actually, it's just occurred to me that an ASSERT is actually quite easy
>> here.  I'm proposing this additional delta (totally untested).
>>
>> diff --git a/xen/arch/x86/hvm/vmx/entry.S b/xen/arch/x86/hvm/vmx/entry.S
>> index 297ed8685126..f569c3259b32 100644
>> --- a/xen/arch/x86/hvm/vmx/entry.S
>> +++ b/xen/arch/x86/hvm/vmx/entry.S
>> @@ -41,6 +41,13 @@ ENTRY(vmx_asm_vmexit_handler)
>>              movzbl CPUINFO_xen_spec_ctrl(%rsp), %eax
>>              xor    %edx, %edx
>>              wrmsr
>> +
>> +#ifdef CONFIG_DEBUG
>> +            testb $SCF_use_shadow, CPUINFO_spec_ctrl_flags(%rsp)
>> +            jz 1f
>> +            ASSERT_FAILED("MSR_SPEC_CTRL shadowing active")
>> +1:
>> +#endif
>>          .endm
>>          ALTERNATIVE "", restore_spec_ctrl, X86_FEATURE_SC_MSR_HVM
>>          /* WARNING! `ret`, `call *`, `jmp *` not safe before this point. */
> 
> Irritatingly this doesn't work, because the metadata associated with the
> ud2 instruction is tied to the compiled position in
> .altinstr_replacement, not the runtime position after alternatives have run.

Could we have the macro "return" ZF, leaving it to the invocation
site of the macro to act on it? ALTERNATIVE's first argument could
easily be "xor %reg, %reg" to set ZF without much other overhead.

Jan




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.