[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

RE: [PATCH for-4.14 v3] x86/svm: do not try to handle recalc NPT faults immediately



> -----Original Message-----
> From: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx>
> Sent: 03 June 2020 23:42
> To: xen-devel@xxxxxxxxxxxxxxxxxxxx
> Cc: jbeulich@xxxxxxxx; andrew.cooper3@xxxxxxxxxx; wl@xxxxxxx; 
> roger.pau@xxxxxxxxxx;
> george.dunlap@xxxxxxxxxx; paul@xxxxxxx; Igor Druzhinin 
> <igor.druzhinin@xxxxxxxxxx>
> Subject: [PATCH for-4.14 v3] x86/svm: do not try to handle recalc NPT faults 
> immediately
> 
> A recalculation NPT fault doesn't always require additional handling
> in hvm_hap_nested_page_fault(), moreover in general case if there is no
> explicit handling done there - the fault is wrongly considered fatal.
> 
> This covers a specific case of migration with vGPU assigned which
> uses direct MMIO mappings made by XEN_DOMCTL_memory_mapping hypercall:
> at a moment log-dirty is enabled globally, recalculation is requested
> for the whole guest memory including those mapped MMIO regions

I still think it is odd to put this in the commit comment since, as I said 
before, Xen ensures that this situation cannot happen at
the moment.

> which causes a page fault being raised at the first access to them;
> but due to MMIO P2M type not having any explicit handling in
> hvm_hap_nested_page_fault() a domain is erroneously crashed with unhandled
> SVM violation.
> 
> Instead of trying to be opportunistic - use safer approach and handle
> P2M recalculation in a separate NPT fault by attempting to retry after
> making the necessary adjustments. This is aligned with Intel behavior
> where there are separate VMEXITs for recalculation and EPT violations
> (faults) and only faults are handled in hvm_hap_nested_page_fault().
> Do it by also unifying do_recalc return code with Intel implementation
> where returning 1 means P2M was actually changed.
> 
> Since there was no case previously where p2m_pt_handle_deferred_changes()
> could return a positive value - it's safe to replace ">= 0" with just "== 0"
> in VMEXIT_NPF handler. finish_type_change() is also not affected by the
> change as being able to deal with >0 return value of p2m->recalc from
> EPT implementation.
> 
> Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx>
> Reviewed-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> Signed-off-by: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx>

However, it's a worthy fix so...

Release-acked-by: Paul Durrant <paul@xxxxxxx>

> ---
> Changes in v2:
> - replace rc with recalc_done bool
> - updated comment in finish_type_change()
> - significantly extended commit description
> Changes in v3:
> - covert bool to int implicitly
> - a little bit more info of the usecase in the message
> ---
>  xen/arch/x86/hvm/svm/svm.c | 5 +++--
>  xen/arch/x86/mm/p2m-pt.c   | 7 ++++++-
>  xen/arch/x86/mm/p2m.c      | 2 +-
>  3 files changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
> index 46a1aac..7f6f578 100644
> --- a/xen/arch/x86/hvm/svm/svm.c
> +++ b/xen/arch/x86/hvm/svm/svm.c
> @@ -2923,9 +2923,10 @@ void svm_vmexit_handler(struct cpu_user_regs *regs)
>              v->arch.hvm.svm.cached_insn_len = vmcb->guest_ins_len & 0xf;
>          rc = vmcb->exitinfo1 & PFEC_page_present
>               ? p2m_pt_handle_deferred_changes(vmcb->exitinfo2) : 0;
> -        if ( rc >= 0 )
> +        if ( rc == 0 )
> +            /* If no recal adjustments were being made - handle this fault */
>              svm_do_nested_pgfault(v, regs, vmcb->exitinfo1, vmcb->exitinfo2);
> -        else
> +        else if ( rc < 0 )
>          {
>              printk(XENLOG_G_ERR
>                     "%pv: Error %d handling NPF (gpa=%08lx ec=%04lx)\n",
> diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
> index 5c05017..070389e 100644
> --- a/xen/arch/x86/mm/p2m-pt.c
> +++ b/xen/arch/x86/mm/p2m-pt.c
> @@ -341,6 +341,7 @@ static int do_recalc(struct p2m_domain *p2m, unsigned 
> long gfn)
>      unsigned int level = 4;
>      l1_pgentry_t *pent;
>      int err = 0;
> +    bool recalc_done = false;
> 
>      table = map_domain_page(pagetable_get_mfn(p2m_get_pagetable(p2m)));
>      while ( --level )
> @@ -402,6 +403,8 @@ static int do_recalc(struct p2m_domain *p2m, unsigned 
> long gfn)
>                  clear_recalc(l1, e);
>                  err = p2m->write_p2m_entry(p2m, gfn, pent, e, level + 1);
>                  ASSERT(!err);
> +
> +                recalc_done = true;
>              }
>          }
>          unmap_domain_page((void *)((unsigned long)pent & PAGE_MASK));
> @@ -448,12 +451,14 @@ static int do_recalc(struct p2m_domain *p2m, unsigned 
> long gfn)
>              clear_recalc(l1, e);
>          err = p2m->write_p2m_entry(p2m, gfn, pent, e, level + 1);
>          ASSERT(!err);
> +
> +        recalc_done = true;
>      }
> 
>   out:
>      unmap_domain_page(table);
> 
> -    return err;
> +    return err ?: recalc_done;
>  }
> 
>  int p2m_pt_handle_deferred_changes(uint64_t gpa)
> diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
> index 17f320b..db7bde0 100644
> --- a/xen/arch/x86/mm/p2m.c
> +++ b/xen/arch/x86/mm/p2m.c
> @@ -1197,7 +1197,7 @@ static int finish_type_change(struct p2m_domain *p2m,
>          rc = p2m->recalc(p2m, gfn);
>          /*
>           * ept->recalc could return 0/1/-ENOMEM. pt->recalc could return
> -         * 0/-ENOMEM/-ENOENT, -ENOENT isn't an error as we are looping
> +         * 0/1/-ENOMEM/-ENOENT, -ENOENT isn't an error as we are looping
>           * gfn here. If rc is 1 we need to have it 0 for success.
>           */
>          if ( rc == -ENOENT || rc > 0 )
> --
> 2.7.4





 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.