
Re: [PATCH 1/2] xen/arm: Convert runstate address during hypcall


  • To: Julien Grall <julien@xxxxxxx>
  • From: Bertrand Marquis <Bertrand.Marquis@xxxxxxx>
  • Date: Fri, 12 Jun 2020 14:13:23 +0000
  • Accept-language: en-GB, en-US
  • Cc: Stefano Stabellini <sstabellini@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Ian Jackson <ian.jackson@xxxxxxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, nd <nd@xxxxxxx>, Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Delivery-date: Fri, 12 Jun 2020 14:13:38 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-topic: [PATCH 1/2] xen/arm: Convert runstate address during hypcall

Hi Julien,

> On 12 Jun 2020, at 11:53, Julien Grall <julien@xxxxxxx> wrote:
> 
> Hi Bertrand,
> 
> On 11/06/2020 12:58, Bertrand Marquis wrote:
>> At the moment on Arm, a Linux guest running with KPTI enabled will
>> cause the following error when a context switch happens in user mode:
>> (XEN) p2m.c:1890: d1v0: Failed to walk page-table va 0xffffff837ebe0cd0
> 
> I think you want to add a bit more context explaining the reason for the 
> failure, i.e. this is because the virtual address used by the runstate is 
> only accessible when running in kernel space.

Ok

> 
>> This patch is modifying the register runstate area handling on arm to
>> convert the guest address during the hypercall. During runtime context
>> switches the area is mapped to update the guest runstate copy.
>> The patch is also removing the initial copy which was done during the
>> hypercall handling as this is done on the current core when the context
>> is restored to go back to guest execution on arm.
> 
> This is explaining what the commit does but not why we want to translate the 
> virtual address at hypercall time. More importantly, this doesn't explain the 
> restrictions added on the hypercall and why they are fine.
> 
> Note that they also should be documented in the public header.

Ok

> 
>> As a consequence, if the register runstate hypercall is called on one
>> vcpu for another vcpu, the area will not be updated until that vcpu
>> is executed (on arm only).
> 
> The code below suggests the hypercall will just fail if you call it from a 
> different vCPU. Is that correct?

No, the hypercall will work, but in this case the area won't be updated.
When the hypercall is called on the current vcpu, the area is updated when the 
context is restored before returning to the guest, so there is no need to do it 
an extra time in the hypercall itself.
When this is done for a different vcpu, the current code no longer updates the 
area.

I did consider doing it at first, but the complexity it added (mainly due to 
the case where the area crosses two pages) was high and I could not think of a 
use case, nor find one in Linux.
But now that I think of it, I could, in that specific case, use a copy_to_guest.

Do you think this is something that would make sense to restore?

Anyway I will fix the commit message to be clearer on that part.

> 
>> On x86, the behaviour is not modified, the address conversion is done
>> during the context switch and the area is updated fully during the
>> hypercall.
>> Inline functions in headers could not be used as the architecture
>> domain.h is included before the global domain.h making it impossible
>> to use the struct vcpu inside the architecture header.
>> This should not have any performance impact as the hypercall is only
>> called once per vcpu usually.
>> Signed-off-by: Bertrand Marquis <bertrand.marquis@xxxxxxx>
>> ---
>>  xen/arch/arm/domain.c        | 109 +++++++++++++++++++++++++++++------
>>  xen/arch/x86/domain.c        |  30 +++++++++-
>>  xen/arch/x86/x86_64/domain.c |   4 +-
>>  xen/common/domain.c          |  19 ++----
>>  xen/include/asm-arm/domain.h |   8 +++
>>  xen/include/asm-x86/domain.h |  16 +++++
>>  xen/include/xen/domain.h     |   4 ++
>>  xen/include/xen/sched.h      |  16 +----
>>  8 files changed, 153 insertions(+), 53 deletions(-)
>> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
>> index 31169326b2..739059234f 100644
>> --- a/xen/arch/arm/domain.c
>> +++ b/xen/arch/arm/domain.c
>> @@ -19,6 +19,7 @@
>>  #include <xen/sched.h>
>>  #include <xen/softirq.h>
>>  #include <xen/wait.h>
>> +#include <xen/domain_page.h>
>>
>>  #include <asm/alternative.h>
>>  #include <asm/cpuerrata.h>
>> @@ -275,36 +276,104 @@ static void ctxt_switch_to(struct vcpu *n)
>>      virt_timer_restore(n);
>>  }
>>
>> -/* Update per-VCPU guest runstate shared memory area (if registered). */
>> -static void update_runstate_area(struct vcpu *v)
>> +void arch_cleanup_runstate_guest(struct vcpu *v)
> 
> I would prefer if this is named arch_vcpu_cleanup_runstate() as this is 
> per-vCPU and not per-domain information.

Ok

> 
>>  {
>> -    void __user *guest_handle = NULL;
>> -    struct vcpu_runstate_info runstate;
>> +    spin_lock(&v->arch.runstate_guest.lock);
>>
>> -    if ( guest_handle_is_null(runstate_guest(v)) )
>> -        return;
>> +    /* cleanup previous page if any */
>> +    if ( v->arch.runstate_guest.page )
>> +    {
>> +        put_page_and_type(v->arch.runstate_guest.page);
> 
> get_page_from_gva() is only grabbing a reference on the page. So you want to 
> use put_page() here.
> 
> Note that we don't have type references on Arm, so it is equivalent to 
> put_page(). But this wouldn't be technically correct :).

Ok, thanks for the explanation.

> 
>> +        v->arch.runstate_guest.page = NULL;
>> +        v->arch.runstate_guest.offset = 0;
>> +    }
>>
>> -    memcpy(&runstate, &v->runstate, sizeof(runstate));
>> +    spin_unlock(&v->arch.runstate_guest.lock);
>> +}
>>
>> -    if ( VM_ASSIST(v->domain, runstate_update_flag) )
>> +int arch_setup_runstate_guest(struct vcpu *v,
>> +                            struct vcpu_register_runstate_memory_area area)
> 
> The indentation looks off here.
> 
> Also, same remark for the naming.

Ok

> 
> 
>> +{
>> +    struct page_info *page;
>> +    unsigned offset;
>> +
>> +    spin_lock(&v->arch.runstate_guest.lock);
>> +
>> +    /* cleanup previous page if any */
>> +    if ( v->arch.runstate_guest.page )
>>      {
>> -        guest_handle = &v->runstate_guest.p->state_entry_time + 1;
>> -        guest_handle--;
>> -        runstate.state_entry_time |= XEN_RUNSTATE_UPDATE;
>> -        __raw_copy_to_guest(guest_handle,
>> -                            (void *)(&runstate.state_entry_time + 1) - 1, 1);
>> -        smp_wmb();
>> +        put_page_and_type(v->arch.runstate_guest.page);
> 
> Same remark here. Although I would prefer if we try to have a common helper 
> for cleaning up the runstate. Maybe cleanup_runstate_vcpu_locked()?

Agreed, I will add a static helper and use it in those two locations.
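Something along these lines maybe (untested sketch, using put_page() as you
pointed out above):

    /* Drop the currently registered runstate page, if any.
     * Caller must hold v->arch.runstate_guest.lock. */
    static void cleanup_runstate_vcpu_locked(struct vcpu *v)
    {
        if ( v->arch.runstate_guest.page )
        {
            put_page(v->arch.runstate_guest.page);
            v->arch.runstate_guest.page = NULL;
            v->arch.runstate_guest.offset = 0;
        }
    }

arch_cleanup_runstate_guest() and arch_setup_runstate_guest() would then just
take the lock and call it.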

> 
>> +        v->arch.runstate_guest.page = NULL;
>> +        v->arch.runstate_guest.offset = 0;
>> +    }
>> +
>> +    offset = ((vaddr_t)area.addr.v) & ~PAGE_MASK;
>> +    if ( offset > (PAGE_SIZE - sizeof(struct vcpu_runstate_info)) )
>> +    {
>> +        spin_unlock(&v->arch.runstate_guest.lock);
>> +        gprintk(XENLOG_DEBUG, "Runstate is crossing pages\n");
>> +        return -EINVAL;
>> +    }
>> +
>> +    /* provided address must be aligned to a 64bit */
>> +    if ( offset % alignof(struct vcpu_runstate_info) )
>> +    {
>> +        spin_unlock(&v->arch.runstate_guest.lock);
>> +        gprintk(XENLOG_DEBUG, "Runstate pointer is not aligned\n");
>> +        return -EINVAL;
>> +    }
>> +
>> +    page = get_page_from_gva(v, (vaddr_t)area.addr.v, GV2M_WRITE);
> 
> In the current implementation, 0 was used to unregister the runstate area. I 
> think we want to keep that feature and not throw an error.

I will add this use case (I missed it).
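Probably an early check before the translation, something like (untested):

    /* A NULL guest address just unregisters the runstate area, as before
     * (the previous page has already been dropped above). */
    if ( area.addr.v == NULL )
    {
        spin_unlock(&v->arch.runstate_guest.lock);
        return 0;
    }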

> 
>> +    if ( !page )
>> +    {
>> +        spin_unlock(&v->arch.runstate_guest.lock);
>> +        gprintk(XENLOG_DEBUG, "Runstate pointer is not mapped\n");
>> +        return -EINVAL;
>>      }
>>
>> -    __copy_to_guest(runstate_guest(v), &runstate, 1);
>> +    v->arch.runstate_guest.page = page;
>> +    v->arch.runstate_guest.offset = offset;
>> +
> 
> In the current implementation, the runstate area was updated with the latest 
> information during the hypercall. This doesn't seem to happen anymore. Is 
> there any specific reason?

As explained before, this still happens for the current vcpu as a result of 
the context switch back to the guest at the end of the hypercall.
It is no longer done when the hypercall is made for a different vcpu, but I 
could add it back using copy_to_guest.
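For reference, if we do want the initial copy back for that case, a rough
(untested) sketch at the end of arch_setup_runstate_guest, reusing the page we
just translated rather than going through copy_to_guest, could be:

    void *p = __map_domain_page(v->arch.runstate_guest.page);

    /* Initial update so the area holds valid data even before the target
     * vcpu is scheduled again. */
    memcpy(p + v->arch.runstate_guest.offset, &v->runstate, sizeof(v->runstate));
    unmap_domain_page(p);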

> 
>> +    spin_unlock(&v->arch.runstate_guest.lock);
>> +
>> +    return 0;
>> +}
>> +
>> +
>> +/* Update per-VCPU guest runstate shared memory area (if registered). */
>> +static void update_runstate_area(struct vcpu *v)
>> +{
>> +    struct vcpu_runstate_info *guest_runstate;
>> +    void *p;
>> +
>> +    spin_lock(&v->arch.runstate_guest.lock);
>>
>> -    if ( guest_handle )
>> +    if ( v->arch.runstate_guest.page )
>>      {
>> -        runstate.state_entry_time &= ~XEN_RUNSTATE_UPDATE;
>> +        p = __map_domain_page(v->arch.runstate_guest.page);
>> +        guest_runstate = p + v->arch.runstate_guest.offset;
>> +
>> +        if ( VM_ASSIST(v->domain, runstate_update_flag) )
>> +        {
>> +            v->runstate.state_entry_time |= XEN_RUNSTATE_UPDATE;
>> +            guest_runstate->state_entry_time |= XEN_RUNSTATE_UPDATE;
>> +            smp_wmb();
>> +        }
>> +
>> +        memcpy(guest_runstate, &v->runstate, sizeof(v->runstate));
>>          smp_wmb();
>> -        __raw_copy_to_guest(guest_handle,
>> -                            (void *)(&runstate.state_entry_time + 1) - 1, 1);
>> +
>> +        if ( VM_ASSIST(v->domain, runstate_update_flag) )
>> +        {
>> +            v->runstate.state_entry_time &= ~XEN_RUNSTATE_UPDATE;
>> +            guest_runstate->state_entry_time &= ~XEN_RUNSTATE_UPDATE;
>> +            smp_wmb();
> 
> Why do you need this barrier?

As the bit is meant to let the guest wait for an update to complete, I felt it 
was better to make the clearing visible as soon as possible.
Since there is already a barrier after the memcpy, the extra cost should be 
fairly limited here.
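For context, a guest that enables VM_ASSIST(runstate_update_flag) is expected
to read the area with a retry loop roughly like this (illustrative sketch only,
runstate being the guest's view of the registered area):

    struct vcpu_runstate_info snap;
    uint64_t t;

    do {
        t = runstate->state_entry_time;
        smp_rmb();
        snap = *runstate;
        smp_rmb();
    } while ( t != snap.state_entry_time ||
              (snap.state_entry_time & XEN_RUNSTATE_UPDATE) );

So the sooner the cleared flag is visible, the sooner such a reader can make
progress.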

> 
> 
> [...]
> 
>> diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
>> index 4e2f582006..3a7f53e13d 100644
>> --- a/xen/include/asm-arm/domain.h
>> +++ b/xen/include/asm-arm/domain.h
>> @@ -11,6 +11,7 @@
>>  #include <asm/vgic.h>
>>  #include <asm/vpl011.h>
>>  #include <public/hvm/params.h>
>> +#include <public/vcpu.h>
> 
> Why do you need to add this new include?
> 
>>  #include <xen/serial.h>
>>  #include <xen/rbtree.h>
>>
>> @@ -206,6 +207,13 @@ struct arch_vcpu
>>       */
>>      bool need_flush_to_ram;
>>
>> +    /* runstate guest info */
>> +    struct {
>> +        struct page_info *page;  /* guest page */
>> +        unsigned         offset; /* offset in page */
> 
> Please use unsigned int.

Ok

Thanks for your comments.
Bertrand


 

