Xen project Mailing List

Re: [PATCH] x86/svm: retry after unhandled NPT fault if gfn was marked for recalculation

To: Roger Pau Monné <roger.pau@xxxxxxxxxx>

From: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx>

Date: Fri, 22 May 2020 11:14:24 +0100

Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx, wl@xxxxxxx, jbeulich@xxxxxxxx, andrew.cooper3@xxxxxxxxxx

Delivery-date: Fri, 22 May 2020 10:14:35 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 22/05/2020 11:08, Roger Pau Monné wrote:
> On Thu, May 21, 2020 at 10:43:58PM +0100, Igor Druzhinin wrote:
>> If a recalculation NPT fault hasn't been handled explicitly in
>> hvm_hap_nested_page_fault() then it's potentially safe to retry -
>> US bit has been re-instated in PTE and any real fault would be correctly
>> re-raised next time.
>>
>> This covers a specific case of migration with vGPU assigned on AMD:
>> global log-dirty is enabled and causes immediate recalculation NPT
>> fault in MMIO area upon access. This type of fault isn't described
>> explicitly in hvm_hap_nested_page_fault (this isn't called on
>> EPT misconfig exit on Intel) which results in domain crash.
>
> Couldn't direct MMIO regions be handled like other types of memory for
> the purposes of logdirty mode?
>
> I assume there's already a path here used for other memory types when
> logdirty is turned on, and hence would seem better to just make direct
> MMIO regions also use that path?

The problem with handling only the MMIO case is that the issue still stays.
It will be hit with some other memory type, since it's not MMIO specific.
The issue is that if global recalculation is called, the next access to
memory of this type will cause a transient fault which, after the due
fixup, will not be handled correctly by either of our handlers.

>> Signed-off-by: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx>
>> ---
>>  xen/arch/x86/hvm/svm/svm.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
>> index 46a1aac..f0d0bd3 100644
>> --- a/xen/arch/x86/hvm/svm/svm.c
>> +++ b/xen/arch/x86/hvm/svm/svm.c
>> @@ -1726,6 +1726,10 @@ static void svm_do_nested_pgfault(struct vcpu *v,
>>          /* inject #VMEXIT(NPF) into guest. */
>>          nestedsvm_vmexit_defer(v, VMEXIT_NPF, pfec, gpa);
>>          return;
>> +    case 0:
>> +        /* If a recalculation page fault hasn't been handled - just retry. */
>> +        if ( pfec & PFEC_user_mode )
>> +            return;
>
> I'm slightly worried that this diverges from the EPT implementation
> now, in the sense that returning 0 from hvm_hap_nested_page_fault will
> no longer trigger a guest crash.

My second alternative from my follow-up email addresses this.
I also didn't like this aspect.

Igor
