[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [RFC PATCH 1/1] xen: Use a global mapping for runstate
> On 29 May 2020, at 10:43, Julien Grall <julien@xxxxxxx> wrote: > > Hi Bertrand, > > On 29/05/2020 09:13, Bertrand Marquis wrote: >> Hi Julien, >>> On 28 May 2020, at 19:54, Julien Grall <julien@xxxxxxx> wrote: >>> >>> Hi Bertrand, >>> >>> Thank you for the patch. >>> >>> On 28/05/2020 16:25, Bertrand Marquis wrote: >>>> At the moment on Arm, a Linux guest running with KTPI enabled will >>>> cause the following error when a context switch happens in user mode: >>>> (XEN) p2m.c:1890: d1v0: Failed to walk page-table va 0xffffff837ebe0cd0 >>>> This patch is modifying runstate handling to map the area given by the >>>> guest inside Xen during the hypercall. >>>> This is removing the guest virtual to physical conversion during context >>>> switches which removes the bug >>> >>> It would be good to spell out that a virtual address is not stable. So >>> relying on it is wrong. >>> >>>> and improve performance by preventing to >>>> walk page tables during context switches. >>> >>> With Secret free hypervisor in mind, I would like to suggest to map/unmap >>> the runstate during context switch. >>> >>> The cost should be minimal when there is a direct map (i.e on Arm64 and >>> x86) and still provide better performance on Arm32. >> Even with a minimal cost this is still adding some non real-time behaviour >> to the context switch. > > Just to be clear, by minimal I meant the mapping part is just a virt_to_mfn() > call and the unmapping is a NOP. > > IHMO, if virt_to_mfn() ends up to add non-RT behavior then you have much > bigger problem than just this call. > >> But definitely from the security point of view as we have to map a page from >> the guest, we could have accessible in Xen some data that should not be >> there. >> There is a trade here where: >> - xen can protect by map/unmapping >> - a guest which wants to secure his data should either not use it or make >> sure there is nothing else in the page > > Both are valid and depends on your setup. One may want to protect all the > existing guests, so modifying a guest may not be a solution. The fact to map/unmap is increasing the protection but not removing the problem completely. > >> That sounds like a thread local storage kind of problematic where we want >> data from xen to be accessible fast from the guest and easy to be modified >> from xen. > > Agree. On x86, they have a perdomain mapping so it would be possible to do > it. We would need to add the feature on Arm. That would definitely be cleaner. > >>> >>> The change should be minimal compare to the current approach but this could >>> be taken care separately if you don't have time. >> I could add that to the serie in a separate patch so that it can be >> discussed and test separately ? > > If you are picking the task, then I would suggest to add it directly in this > patch. I will see that but the discussion is leading more on a result where we accept the current status. > >>> >>>> -- >>>> In the current status, this patch is only working on Arm and needs to >>>> be fixed on X86 (see #error on domain.c for missing get_page_from_gva). >>>> Signed-off-by: Bertrand Marquis <bertrand.marquis@xxxxxxx> >>>> --- >>>> xen/arch/arm/domain.c | 32 +++++++++------- >>>> xen/arch/x86/domain.c | 51 ++++++++++++++----------- >>>> xen/common/domain.c | 84 ++++++++++++++++++++++++++++++++++------- >>>> xen/include/xen/sched.h | 11 ++++-- >>>> 4 files changed, 124 insertions(+), 54 deletions(-) >>>> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c >>>> index 31169326b2..799b0e0103 100644 >>>> --- a/xen/arch/arm/domain.c >>>> +++ b/xen/arch/arm/domain.c >>>> @@ -278,33 +278,37 @@ static void ctxt_switch_to(struct vcpu *n) >>>> /* Update per-VCPU guest runstate shared memory area (if registered). */ >>>> static void update_runstate_area(struct vcpu *v) >>>> { >>>> - void __user *guest_handle = NULL; >>>> - struct vcpu_runstate_info runstate; >>>> + struct vcpu_runstate_info *runstate; >>>> - if ( guest_handle_is_null(runstate_guest(v)) ) >>>> + /* XXX why do we accept not to block here */ >>>> + if ( !spin_trylock(&v->runstate_guest_lock) ) >>>> return; >>>> - memcpy(&runstate, &v->runstate, sizeof(runstate)); >>>> + runstate = runstate_guest(v); >>>> + >>>> + if (runstate == NULL) >>>> + { >>>> + spin_unlock(&v->runstate_guest_lock); >>>> + return; >>>> + } >>>> if ( VM_ASSIST(v->domain, runstate_update_flag) ) >>>> { >>>> - guest_handle = &v->runstate_guest.p->state_entry_time + 1; >>>> - guest_handle--; >>>> - runstate.state_entry_time |= XEN_RUNSTATE_UPDATE; >>>> - __raw_copy_to_guest(guest_handle, >>>> - (void *)(&runstate.state_entry_time + 1) - 1, >>>> 1); >>>> + runstate->state_entry_time |= XEN_RUNSTATE_UPDATE; >>>> smp_wmb(); >>> >>> Because you set v->runstate.state_entry_time below, the placement of the >>> barrier seems a bit odd. >>> >>> I would suggest to update v->runstate.state_entry_time first and then >>> update runstate->state_entry_time. >> We do want the guest to know when we modify the runstate so we need to make >> sure the XEN_RUNSTATE_UPDATE is actually set in a visible way before we do >> the memcpy. >> That’s why the barrier is after. > > I think you misunderstood my comment here. I meant to write the following > code: > > v->runstate.state_entry_time = ... > runstate->state_entry_time = ... > smp_wmb() > > So it is much clearer that the barrier is because runstate->state_entry_time > needs to be updated before the memcpy(). > >>>> + >>>> +#ifdef CONFIG_ARM >>>> + page = get_page_from_gva(v, area->addr.p, GV2M_WRITE); >>> >>> A guest is allowed to setup the runstate for a different vCPU than the >>> current one. This will lead to get_page_from_gva() to fail as the function >>> cannot yet work with a vCPU other than current. >> If the area is mapped per cpu but isn’t the aim of this to have a way to >> check other cpus status ? > > The aim is to collect how much time a vCPU has been unscheduled. This doesn't > have to be run from a different vCPU. > > Anyway, my point is the hypercall allows it today. If you want to make such > restriction, then we need to agree on it and document it. > >>> >>> AFAICT, there is no restriction on when the runstate hypercall can be >>> called. So this can even be called before the vCPU is brought up. >> I understand the remark but it still feels very weird to allow an invalid >> address in an hypercall. >> Wouldn’t we have a lot of potential issues accepting an address that we >> cannot check ? > > Well, that's why you see errors when using KPTI ;). If you use a global > mapping, then it is not possible to continue without validating the address. > > But to do this, you will have to put restriction on the hypercalls. I would > be OK to make such restriction on Arm. > > Cheers, > > -- > Julien Grall
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |