
Re: [PATCH v2 3/3] x86/vmx: implement Notify VM Exit


  • To: Xiaoyao Li <xiaoyao.li@xxxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Thu, 16 Jun 2022 13:17:31 +0200
  • Cc: "Tian, Kevin" <kevin.tian@xxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, "Cooper, Andrew" <andrew.cooper3@xxxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, "Beulich, Jan" <JBeulich@xxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, "Nakajima, Jun" <jun.nakajima@xxxxxxxxx>, "Qiang, Chenyi" <chenyi.qiang@xxxxxxxxx>
  • Delivery-date: Thu, 16 Jun 2022 11:17:54 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Ping?

On Thu, Jun 09, 2022 at 12:09:18PM +0200, Roger Pau Monné wrote:
> On Thu, Jun 09, 2022 at 03:39:33PM +0800, Xiaoyao Li wrote:
> > On 6/9/2022 3:04 PM, Tian, Kevin wrote:
> > > +Chenyi/Xiaoyao who worked on the KVM support. Presumably
> > > similar opens have been discussed in KVM hence they have the
> > > right background to comment here.
> > > 
> > > > From: Roger Pau Monne <roger.pau@xxxxxxxxxx>
> > > > Sent: Thursday, May 26, 2022 7:12 PM
> > > > 
> > > > Under certain conditions guests can get the CPU stuck in an unbounded
> > > > loop without the possibility of an interrupt window to occur on
> > > > instruction boundary.  This was the case with the scenarios described
> > > > in XSA-156.
> > > > 
> > > > Make use of the Notify VM Exit mechanism, which will trigger a VM Exit
> > > > if no interrupt window occurs for a specified amount of time.  Note
> > > > that using the Notify VM Exit avoids having to trap #AC and #DB
> > > > exceptions, as Xen is guaranteed to get a VM Exit even if the guest
> > > > puts the CPU in a loop without an interrupt window; as such, disable
> > > > the intercepts if the feature is available and enabled.
> > > > 
> > > > Setting the notify VM exit window to 0 is safe because there's a
> > > > threshold added by the hardware in order to have a sane window value.
> > > > 
> > > > Suggested-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> > > > Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> > > > ---
> > > > Changes since v1:
> > > >   - Properly update debug state when using notify VM exit.
> > > >   - Reword commit message.
> > > > ---
> > > > This change enables the notify VM exit by default; KVM, however,
> > > > doesn't seem to enable it by default, and there's the following note
> > > > in the commit message:
> > > > 
> > > > "- There's a possibility, however small, that a notify VM exit happens
> > > >     with VM_CONTEXT_INVALID set in exit qualification. In this case, the
> > > >     vcpu can no longer run. To avoid killing a well-behaved guest, set
> > > >     notify window as -1 to disable this feature by default."
> > > > 
> > > > It's not entirely clear to me whether the comment was meant to be:
> > > > "There's a possibility, however small, that a notify VM exit _wrongly_
> > > > happens with VM_CONTEXT_INVALID".
> > > > 
> > > > It's also not clear whether such wrong hardware behavior only affects
> > > > a specific set of hardware,
> > 
> > I'm not sure what you mean by a specific set of hardware.
> > 
> > We make it default off in KVM just in case future silicon wrongly sets
> > the VM_CONTEXT_INVALID bit, because our policy is that the VM cannot
> > continue running in that case.
> > 
> > In the worst case, if some future silicon happens to have this kind of
> > silly bug, then all existing production kernels would risk having their
> > VMs killed because the feature is on by default.
> 
> That's IMO a weird policy.  If there's such behavior in any hardware
> platform I would assume Intel would issue an erratum, and then we would
> just avoid using the feature on affected hardware (as we do with
> other hardware features when they have errata).
> 
> If we applied the same logic to all new Intel features we wouldn't use
> any of them.  At least in Xen there are already combinations of vmexit
> conditions that will lead to the guest being killed.
> 
> > > > in a way that we could avoid enabling
> > > > notify VM exit there.
> > > > 
> > > > There's a discussion in one of the Linux patches that 128K might be
> > > > the safer value in order to prevent false positives, but I have no
> > > > formal confirmation about this.  Maybe our Intel maintainers can
> > > > provide some more feedback on a suitable notify VM exit window
> > > > value.
> > 
> > 128K is the internal threshold for SPR silicon.  The internal threshold
> > is tuned by Intel for each silicon to make sure it's big enough to avoid
> > false positives even when the user sets vmcs.notify_window to 0.
> > 
> > However, it varies between processor generations.
> > 
> > What value is suitable is hard to say; it depends on how soon the VMM
> > wants to intercept the VM.  In any case, Intel ensures that even a value
> > of 0 is safe.
> 
> Ideally we need a fixed default value that's guaranteed to work on all
> possible hardware that supports the feature, or alternatively a way to
> calculate a sane default window based on the hardware platform.
> 
> Could we get some wording added to the ISE regarding 0 being a
> suitable default value to use because hardware will add a threshold
> internally to make the value safe?
> 
> Thanks, Roger.
> 