[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 3/3] x86/vmx: implement Notify VM Exit


  • To: Roger Pau Monne <roger.pau@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Fri, 3 Jun 2022 14:49:54 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=enAcfvQGdHV1UXZB953itue6Ttn22yaOfRMciV9J9uQ=; b=bhBcLlN7nJaYiz97rfdZTTxVjR6tHypLlhbimyj4AxaW6GqK5BbyC1IXURrGsci6rAFbdo/eWVvC0q2oSbAF1906zMEt19rpEtioVSTxChcNfGxgKtuZwbdx/X0Nwhy+gjeSn765Zyahaw/QGkXt6ALjZwm/QHaR8c7M3+sVKvIcOQXmTR928LSNVJRVdKV7fXSCZ8+S16sqG99/ZvrDufEbt/lDFjnx2TVoiCDUPkUMpUe3cMSuFudxCCjBQu9ZCBpDkJ0fAZBFQ1m3NsICcRqf8JydpMami8Fi2s8LIdwsboMt1rvEeihFjxQqlE1fS2lhzIxZFmNKkaS2Eb8xZw==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=aHrBL/IDpwSSwB9PCfwqcKrFlgae3uWpE3d19JXcXhde4mciquMY+RHw65cD4bGDOGJMoc0ERx/wz1DIwq2O5auYqkBd5N8VeTFoG6LhBdg2DaDQGWiHXJpsNG6GMDyRmiVWmeYayW4U0z9SdoxffnK5+177NZEffIuv4SJM5/AQ2S6JvIRq3LnPOeNhaSuQo07Ic9Pe61s0+wJyyzO+aFRi7mXJ5kqtt3kpk05MOYuP+Hj+jmaBAzGhRXA1cn1ITh2+lFai3xkjzvAbn77BV5TGbo18DGgQIiSoYHxAXz2SpvWfGY7dfyQL/F3G/AcBI6/gyLks4MUceUYXimNptg==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=suse.com;
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Jun Nakajima <jun.nakajima@xxxxxxxxx>, Kevin Tian <kevin.tian@xxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Fri, 03 Jun 2022 12:50:17 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 26.05.2022 13:11, Roger Pau Monne wrote:
> Under certain conditions guests can get the CPU stuck in an unbounded
> loop without the possibility of an interrupt window to occur on
> instruction boundary.  This was the case with the scenarios described
> in XSA-156.
> 
> Make use of the Notify VM Exit mechanism, that will trigger a VM Exit
> if no interrupt window occurs for a specified amount of time.  Note
> that using the Notify VM Exit avoids having to trap #AC and #DB
> exceptions, as Xen is guaranteed to get a VM Exit even if the guest
> puts the CPU in a loop without an interrupt window, as such disable
> the intercepts if the feature is available and enabled.
> 
> Setting the notify VM exit window to 0 is safe because there's a
> threshold added by the hardware in order to have a sane window value.
> 
> Suggested-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> ---
> Changes since v1:
>  - Properly update debug state when using notify VM exit.
>  - Reword commit message.
> ---
> This change enables the notify VM exit by default, KVM however doesn't
> seem to enable it by default, and there's the following note in the
> commit message:
> 
> "- There's a possibility, however small, that a notify VM exit happens
>    with VM_CONTEXT_INVALID set in exit qualification. In this case, the
>    vcpu can no longer run. To avoid killing a well-behaved guest, set
>    notify window as -1 to disable this feature by default."
> 
> It's not obviously clear to me whether the comment was meant to be:
> "There's a possibility, however small, that a notify VM exit _wrongly_
> happens with VM_CONTEXT_INVALID".
> 
> It's also not clear whether such wrong hardware behavior only affects
> a specific set of hardware, in a way that we could avoid enabling
> notify VM exit there.
> 
> There's a discussion in one of the Linux patches that 128K might be
> the safer value in order to prevent false positives, but I have no
> formal confirmation about this.  Maybe our Intel maintainers can
> provide some more feedback on a suitable notify VM exit window
> value.

This certainly wants sorting one way or another before I, for one,
would consider giving an R-b here.

> --- a/xen/arch/x86/hvm/vmx/vmcs.c
> +++ b/xen/arch/x86/hvm/vmx/vmcs.c
> @@ -67,6 +67,9 @@ integer_param("ple_gap", ple_gap);
>  static unsigned int __read_mostly ple_window = 4096;
>  integer_param("ple_window", ple_window);
>  
> +static unsigned int __ro_after_init vm_notify_window;
> +integer_param("vm-notify-window", vm_notify_window);

Could even be a runtime param, I guess. Albeit I realize this would
complicate things further down.

> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -1419,10 +1419,19 @@ static void cf_check vmx_update_host_cr3(struct vcpu 
> *v)
>  
>  void vmx_update_debug_state(struct vcpu *v)
>  {
> +    unsigned int mask = 1u << TRAP_int3;
> +
> +    if ( !cpu_has_monitor_trap_flag && cpu_has_vmx_notify_vm_exiting )

I'm puzzled by the lack of symmetry between this and ...

> +        /*
> +         * Only allow toggling TRAP_debug if notify VM exit is enabled, as
> +         * unconditionally setting TRAP_debug is part of the XSA-156 fix.
> +         */
> +        mask |= 1u << TRAP_debug;
> +
>      if ( v->arch.hvm.debug_state_latch )
> -        v->arch.hvm.vmx.exception_bitmap |= 1U << TRAP_int3;
> +        v->arch.hvm.vmx.exception_bitmap |= mask;
>      else
> -        v->arch.hvm.vmx.exception_bitmap &= ~(1U << TRAP_int3);
> +        v->arch.hvm.vmx.exception_bitmap &= ~mask;
>  
>      vmx_vmcs_enter(v);
>      vmx_update_exception_bitmap(v);
> @@ -4155,6 +4164,9 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
>          switch ( vector )
>          {
>          case TRAP_debug:
> +            if ( cpu_has_monitor_trap_flag && cpu_has_vmx_notify_vm_exiting )
> +                goto exit_and_crash;

... this condition. Shouldn't one be the inverse of the other (and
then it's the one down here which wants adjusting)?

> @@ -126,5 +126,6 @@ PERFCOUNTER(realmode_exits,      "vmexits from realmode")
>  PERFCOUNTER(pauseloop_exits, "vmexits from Pause-Loop Detection")
>  
>  PERFCOUNTER(buslock, "Bus Locks Detected")
> +PERFCOUNTER(vmnotify_crash, "domains crashed by Notify VM Exit")

I think the text is not entirely correct and would want to be
"domain crashes by ...". Multiple vCPU-s of a single domain can
experience this in parallel (granted this would require "good"
timing).

Jan




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.