
Re: [PATCH] x86/svm: do not try to handle recalc NPT faults immediately


  • To: Roger Pau Monné <roger@xxxxxxx>
  • From: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx>
  • Date: Fri, 29 May 2020 16:06:49 +0100
  • Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx, george.dunlap@xxxxxxxxxx, wl@xxxxxxx, jbeulich@xxxxxxxx, andrew.cooper3@xxxxxxxxxx
  • Delivery-date: Fri, 29 May 2020 15:07:14 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 29/05/2020 15:33, Roger Pau Monné wrote:
> On Fri, May 29, 2020 at 01:35:53AM +0100, Igor Druzhinin wrote:
>> A recalculation NPT fault doesn't always require additional handling
>> in hvm_hap_nested_page_fault(); moreover, in the general case, if there
>> is no explicit handling done there, the fault is wrongly considered
>> fatal.
>>
>> Instead of trying to be opportunistic, use the safer approach and
>> handle P2M recalculation in a separate NPT fault by retrying the access
>> after making the necessary adjustments. This is aligned with Intel
>> behavior, where there are separate VMEXITs for recalculation and for
>> EPT violations (faults), and only faults are handled in
>> hvm_hap_nested_page_fault(). Do it by also unifying the do_recalc
>> return code with the Intel implementation, where returning 1 means the
>> P2M was actually changed.
> 
> That seems like a good approach IMO.
> 
> Do you know whether this will make the code slower? (since there are
> cases previously handled in a single vmexit that would take two
> vmexits now)

The only case I could think of is memory writes during migration: the
first fault would propagate the P2M type recalculation while the second
actually logs the dirty page.

The slowdown would obviously only occur during the live phase, but it
should be marginal and in line with what we currently have on Intel.

>> This covers a specific case of migration with vGPU assigned on AMD:
>> global log-dirty is enabled and causes immediate recalculation NPT
>> fault in MMIO area upon access.
>>
>> Signed-off-by: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx>
> 
> Reviewed-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> 
>> ---
>> This is a safer alternative to:
>> https://lists.xenproject.org/archives/html/xen-devel/2020-05/msg01662.html
>> and a more correct approach from my PoV.
>> ---
>>  xen/arch/x86/hvm/svm/svm.c | 5 +++--
>>  xen/arch/x86/mm/p2m-pt.c   | 8 ++++++--
>>  2 files changed, 9 insertions(+), 4 deletions(-)
>>
>> diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
>> index 46a1aac..7f6f578 100644
>> --- a/xen/arch/x86/hvm/svm/svm.c
>> +++ b/xen/arch/x86/hvm/svm/svm.c
>> @@ -2923,9 +2923,10 @@ void svm_vmexit_handler(struct cpu_user_regs *regs)
>>              v->arch.hvm.svm.cached_insn_len = vmcb->guest_ins_len & 0xf;
>>          rc = vmcb->exitinfo1 & PFEC_page_present
>>               ? p2m_pt_handle_deferred_changes(vmcb->exitinfo2) : 0;
>> -        if ( rc >= 0 )
>> +        if ( rc == 0 )
>> +            /* If no recalc adjustments were made - handle this fault */
>>              svm_do_nested_pgfault(v, regs, vmcb->exitinfo1, vmcb->exitinfo2);
>> -        else
>> +        else if ( rc < 0 )
>>          {
>>              printk(XENLOG_G_ERR
>>                     "%pv: Error %d handling NPF (gpa=%08lx ec=%04lx)\n",
>> diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
>> index 5c05017..377565b 100644
>> --- a/xen/arch/x86/mm/p2m-pt.c
>> +++ b/xen/arch/x86/mm/p2m-pt.c
>> @@ -340,7 +340,7 @@ static int do_recalc(struct p2m_domain *p2m, unsigned long gfn)
>>      unsigned long gfn_remainder = gfn;
>>      unsigned int level = 4;
>>      l1_pgentry_t *pent;
>> -    int err = 0;
>> +    int err = 0, rc = 0;
>>  
>>      table = map_domain_page(pagetable_get_mfn(p2m_get_pagetable(p2m)));
>>      while ( --level )
>> @@ -402,6 +402,8 @@ static int do_recalc(struct p2m_domain *p2m, unsigned long gfn)
>>                  clear_recalc(l1, e);
>>                  err = p2m->write_p2m_entry(p2m, gfn, pent, e, level + 1);
>>                  ASSERT(!err);
>> +
>> +                rc = 1;
>>              }
>>          }
>>          unmap_domain_page((void *)((unsigned long)pent & PAGE_MASK));
>> @@ -448,12 +450,14 @@ static int do_recalc(struct p2m_domain *p2m, unsigned long gfn)
>>              clear_recalc(l1, e);
>>          err = p2m->write_p2m_entry(p2m, gfn, pent, e, level + 1);
>>          ASSERT(!err);
>> +
>> +        rc = 1;
>>      }
>>  
>>   out:
>>      unmap_domain_page(table);
>>  
>> -    return err;
>> +    return err ? err : rc;
> 
> Nit: you can use the Elvis operator here: return err ?: rc;
> 
> Also I couldn't spot any caller that would have troubles with the
> function now returning 1 in certain conditions, can you confirm the
> callers have been audited?

Yes, I checked all the callers before making the change. That's actually
where I spotted that the Intel side is already doing exactly the same.

Igor



 

