
Re: [PATCH] x86/svm: do not try to handle recalc NPT faults immediately


  • To: Roger Pau Monné <roger@xxxxxxx>
  • From: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx>
  • Date: Fri, 29 May 2020 16:06:49 +0100
  • Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx, george.dunlap@xxxxxxxxxx, wl@xxxxxxx, jbeulich@xxxxxxxx, andrew.cooper3@xxxxxxxxxx
  • Delivery-date: Fri, 29 May 2020 15:07:14 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 29/05/2020 15:33, Roger Pau Monné wrote:
> On Fri, May 29, 2020 at 01:35:53AM +0100, Igor Druzhinin wrote:
>> A recalculation NPT fault doesn't always require additional handling
>> in hvm_hap_nested_page_fault(); moreover, in the general case, if there
>> is no explicit handling done there, the fault is wrongly considered
>> fatal.
>>
>> Instead of trying to be opportunistic, use the safer approach and
>> handle P2M recalculation in a separate NPT fault by retrying the access
>> after making the necessary adjustments. This is aligned with Intel
>> behavior, where there are separate VMEXITs for recalculation and for
>> EPT violations (faults), and only faults are handled in
>> hvm_hap_nested_page_fault(). Do it by also unifying the do_recalc
>> return code with the Intel implementation, where returning 1 means the
>> P2M was actually changed.
> 
> That seems like a good approach IMO.
> 
> Do you know whether this will make the code slower? (since there are
> cases previously handled in a single vmexit that would take two
> vmexits now)

The only case I could think of is memory writes during migration: the
first fault would propagate the P2M type recalculation while the second
actually logs the dirty page.

The slowdown would obviously only occur during the live phase, but it
should be marginal and in line with what we currently have on Intel.

>> This covers a specific case of migration with vGPU assigned on AMD:
>> global log-dirty is enabled and causes immediate recalculation NPT
>> fault in MMIO area upon access.
>>
>> Signed-off-by: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx>
> 
> Reviewed-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> 
>> ---
>> This is a safer alternative to:
>> https://lists.xenproject.org/archives/html/xen-devel/2020-05/msg01662.html
>> and a more correct approach from my PoV.
>> ---
>>  xen/arch/x86/hvm/svm/svm.c | 5 +++--
>>  xen/arch/x86/mm/p2m-pt.c   | 8 ++++++--
>>  2 files changed, 9 insertions(+), 4 deletions(-)
>>
>> diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
>> index 46a1aac..7f6f578 100644
>> --- a/xen/arch/x86/hvm/svm/svm.c
>> +++ b/xen/arch/x86/hvm/svm/svm.c
>> @@ -2923,9 +2923,10 @@ void svm_vmexit_handler(struct cpu_user_regs *regs)
>>              v->arch.hvm.svm.cached_insn_len = vmcb->guest_ins_len & 0xf;
>>          rc = vmcb->exitinfo1 & PFEC_page_present
>>               ? p2m_pt_handle_deferred_changes(vmcb->exitinfo2) : 0;
>> -        if ( rc >= 0 )
>> +        if ( rc == 0 )
>> +            /* If no recalc adjustments were made - handle this fault */
>>              svm_do_nested_pgfault(v, regs, vmcb->exitinfo1, vmcb->exitinfo2);
>> -        else
>> +        else if ( rc < 0 )
>>          {
>>              printk(XENLOG_G_ERR
>>                     "%pv: Error %d handling NPF (gpa=%08lx ec=%04lx)\n",
>> diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
>> index 5c05017..377565b 100644
>> --- a/xen/arch/x86/mm/p2m-pt.c
>> +++ b/xen/arch/x86/mm/p2m-pt.c
>> @@ -340,7 +340,7 @@ static int do_recalc(struct p2m_domain *p2m, unsigned long gfn)
>>      unsigned long gfn_remainder = gfn;
>>      unsigned int level = 4;
>>      l1_pgentry_t *pent;
>> -    int err = 0;
>> +    int err = 0, rc = 0;
>>  
>>      table = map_domain_page(pagetable_get_mfn(p2m_get_pagetable(p2m)));
>>      while ( --level )
>> @@ -402,6 +402,8 @@ static int do_recalc(struct p2m_domain *p2m, unsigned long gfn)
>>                  clear_recalc(l1, e);
>>                  err = p2m->write_p2m_entry(p2m, gfn, pent, e, level + 1);
>>                  ASSERT(!err);
>> +
>> +                rc = 1;
>>              }
>>          }
>>          unmap_domain_page((void *)((unsigned long)pent & PAGE_MASK));
>> @@ -448,12 +450,14 @@ static int do_recalc(struct p2m_domain *p2m, unsigned long gfn)
>>              clear_recalc(l1, e);
>>          err = p2m->write_p2m_entry(p2m, gfn, pent, e, level + 1);
>>          ASSERT(!err);
>> +
>> +        rc = 1;
>>      }
>>  
>>   out:
>>      unmap_domain_page(table);
>>  
>> -    return err;
>> +    return err ? err : rc;
> 
> Nit: you can use the Elvis operator here: return err ?: rc;
> 
> Also I couldn't spot any caller that would have troubles with the
> function now returning 1 in certain conditions, can you confirm the
> callers have been audited?

Yes, I checked all the callers before making the change. That's actually
where I spotted that the Intel side is already doing exactly the same.

Igor



 

