
Re: [Xen-devel] RT Xen on ARM - R-Car series



On Thu, Jan 31, 2019 at 11:14:37PM +0000, Julien Grall wrote:
> Hi Stefano,
> 
> On 1/31/19 9:56 PM, Stefano Stabellini wrote:
> > On Thu, 31 Jan 2019, Julien Grall wrote:
> > > On 31/01/2019 12:00, Andrii Anisov wrote:
> > > > Hello Julien,
> > > > 
> > > > On 31.01.19 13:37, Julien Grall wrote:
> > > > > > On my side I just commented out that printk, because it renders a 
> > > > > > debug
> > > > > > build unusable.
> > > > > 
> > > > > ... if it is unusable, why don't you try to tackle the problem?
> > > > Because of...
> > > > 
> > > > > This is in my long ever growing list of things to
> > > > 
> > > > ... be done.
> > > > 
> > > > Some of things get solutions, some WAs.
> > > 
> > > I can't see a good workaround for this. At some point someone would have 
> > > to
> > > pick it up rather than building a house of cards.
> > 
> > I ran into this problem as well not too long ago. It is very
> > annoying and basically impossible to work around; the only thing
> > one could do is suppress the warning, but that doesn't even count
> > as a workaround :-)
> 
> I am sure I will regret having said this, but for fairness I will :).
> 
> If security is not a concern within the guest, then you can disable kpti
> (either via Kconfig or the command line). All the errors should go away
> for Linux guests.
> 
> > 
> > The way forward is to modify the existing
> > VCPUOP_register_runstate_memory_area interface. I liked Julien's idea in
> > https://lists.xenproject.org/archives/html/xen-devel/2018-03/msg00227.html:
> > we don't necessarily need to change the parameters of the hypercalls
> > from a guest virtual address to a guest physical address. It should be
> > enough to convert the guest virtual address into a guest physical
> > address in Xen when VCPUOP_register_runstate_memory_area is called
> > (xen/common/domain.c:VCPUOP_register_runstate_memory_area), then store
> > the guest physical address or its mapping in v->runstate_guest (the type
> > of runstate_guest needs to change) and always use the guest physical
> > address for future updates on the runstate memory area.
> 
> I would love to say it is that easy :). However, there is some research to
> do regarding how guests use this today. The hypercall takes a virtual
> address, so technically it would be possible for a guest to pass a
> non-page-aligned virtual address, making the area span two pages (this
> seems to happen with older Linux).
> 
> Furthermore, because this is a virtual address, a guest would be free to
> modify the mapping at any time.
> 
> So if we want to use a guest physical address in Xen, we need to make sure
> it will not break any current guest. This would also probably need to be
> documented in the interface.
> 
> With the current interface workaround, we are still playing with the devil
> (see [2]). So it would be nice to have a new interface that does not use a
> virtual address.

So, I've got a hacky patch to 'fix' this on x86, by taking a reference
to the mfn behind the virtual address provided when setting up the
hypercall and mapping it in Xen. This however doesn't work on ARM due
to the lack of paging_gva_to_gfn. I guess there's something similar to
translate a guest virtual address into a gfn or an mfn?
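As a side note, the only reason the area can need two mappings is the
unaligned start; the page count works out as below (a standalone sketch for
illustration, not Xen code — the macro names mirror Xen's but are redefined
here):

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)
#define PFN_DOWN(x) ((x) >> PAGE_SHIFT)

/*
 * Number of pages an area of 'size' bytes starting at address 'addr'
 * touches: 1 if it fits within a single page, 2 if the unaligned start
 * makes it straddle a page boundary (size is assumed <= PAGE_SIZE, as
 * for the runstate area).
 */
static unsigned int runstate_area_pages(uintptr_t addr, size_t size)
{
    return (unsigned int)(PFN_DOWN(addr + size - 1) - PFN_DOWN(addr)) + 1;
}
```

E.g. a 0x28-byte area starting at a page-aligned address needs one page,
while the same area starting 8 bytes before a page boundary needs two.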

I've only tested this very slightly, so there might be dragons...

Anyway, this is what I have so far:

---8<---
commit 9b88c4aba641abd188144c4bad1d69345625678f
Author: Roger Pau Monne <roger.pau@xxxxxxxxxx>
Date:   Fri Feb 1 12:13:47 2019 +0100

    keep runstate area mapped

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 6dc633ed50..cfd60ed7a7 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -277,29 +277,23 @@ static void ctxt_switch_to(struct vcpu *n)
 /* Update per-VCPU guest runstate shared memory area (if registered). */
 static void update_runstate_area(struct vcpu *v)
 {
-    void __user *guest_handle = NULL;
-
-    if ( guest_handle_is_null(runstate_guest(v)) )
+    if ( !v->runstate_guest )
         return;
 
     if ( VM_ASSIST(v->domain, runstate_update_flag) )
     {
-        guest_handle = &v->runstate_guest.p->state_entry_time + 1;
-        guest_handle--;
         v->runstate.state_entry_time |= XEN_RUNSTATE_UPDATE;
-        __raw_copy_to_guest(guest_handle,
-                            (void *)(&v->runstate.state_entry_time + 1) - 1, 1);
+        v->runstate_guest->state_entry_time |= XEN_RUNSTATE_UPDATE;
         smp_wmb();
     }
 
-    __copy_to_guest(runstate_guest(v), &v->runstate, 1);
+    memcpy(v->runstate_guest, &v->runstate, sizeof(v->runstate));
 
-    if ( guest_handle )
+    if ( VM_ASSIST(v->domain, runstate_update_flag) )
     {
         v->runstate.state_entry_time &= ~XEN_RUNSTATE_UPDATE;
         smp_wmb();
-        __raw_copy_to_guest(guest_handle,
-                            (void *)(&v->runstate.state_entry_time + 1) - 1, 1);
+        v->runstate_guest->state_entry_time &= ~XEN_RUNSTATE_UPDATE;
     }
 }
 
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 32dc4253ff..b86d8d94c7 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1604,26 +1604,22 @@ void paravirt_ctxt_switch_to(struct vcpu *v)
 }
 
 /* Update per-VCPU guest runstate shared memory area (if registered). */
-bool update_runstate_area(struct vcpu *v)
+void update_runstate_area(struct vcpu *v)
 {
-    bool rc;
     struct guest_memory_policy policy = { .nested_guest_mode = false };
-    void __user *guest_handle = NULL;
 
-    if ( guest_handle_is_null(runstate_guest(v)) )
-        return true;
+    if ( !v->runstate_guest )
+        return;
 
     update_guest_memory_policy(v, &policy);
 
     if ( VM_ASSIST(v->domain, runstate_update_flag) )
     {
-        guest_handle = has_32bit_shinfo(v->domain)
-            ? &v->runstate_guest.compat.p->state_entry_time + 1
-            : &v->runstate_guest.native.p->state_entry_time + 1;
-        guest_handle--;
         v->runstate.state_entry_time |= XEN_RUNSTATE_UPDATE;
-        __raw_copy_to_guest(guest_handle,
-                            (void *)(&v->runstate.state_entry_time + 1) - 1, 1);
+        if ( has_32bit_shinfo(v->domain) )
+            v->compat_runstate_guest->state_entry_time |= XEN_RUNSTATE_UPDATE;
+        else
+            v->runstate_guest->state_entry_time |= XEN_RUNSTATE_UPDATE;
         smp_wmb();
     }
 
@@ -1632,31 +1628,22 @@ bool update_runstate_area(struct vcpu *v)
         struct compat_vcpu_runstate_info info;
 
         XLAT_vcpu_runstate_info(&info, &v->runstate);
-        __copy_to_guest(v->runstate_guest.compat, &info, 1);
-        rc = true;
+        memcpy(v->compat_runstate_guest, &info, sizeof(info));
     }
     else
-        rc = __copy_to_guest(runstate_guest(v), &v->runstate, 1) !=
-             sizeof(v->runstate);
+        memcpy(v->runstate_guest, &v->runstate, sizeof(v->runstate));
 
-    if ( guest_handle )
+    if ( VM_ASSIST(v->domain, runstate_update_flag) )
     {
         v->runstate.state_entry_time &= ~XEN_RUNSTATE_UPDATE;
         smp_wmb();
-        __raw_copy_to_guest(guest_handle,
-                            (void *)(&v->runstate.state_entry_time + 1) - 1, 1);
+        if ( has_32bit_shinfo(v->domain) )
+            v->compat_runstate_guest->state_entry_time &= ~XEN_RUNSTATE_UPDATE;
+        else
+            v->runstate_guest->state_entry_time &= ~XEN_RUNSTATE_UPDATE;
     }
 
     update_guest_memory_policy(v, &policy);
-
-    return rc;
-}
-
-static void _update_runstate_area(struct vcpu *v)
-{
-    if ( !update_runstate_area(v) && is_pv_vcpu(v) &&
-         !(v->arch.flags & TF_kernel_mode) )
-        v->arch.pv.need_update_runstate_area = 1;
 }
 
 static inline bool need_full_gdt(const struct domain *d)
@@ -1779,7 +1766,7 @@ void context_switch(struct vcpu *prev, struct vcpu *next)
 
     if ( prev != next )
     {
-        _update_runstate_area(prev);
+        update_runstate_area(prev);
         vpmu_switch_from(prev);
         np2m_schedule(NP2M_SCHEDLE_OUT);
     }
@@ -1841,7 +1828,7 @@ void context_switch(struct vcpu *prev, struct vcpu *next)
 
     if ( prev != next )
     {
-        _update_runstate_area(next);
+        update_runstate_area(next);
 
         /* Must be done with interrupts enabled */
         vpmu_switch_to(next);
diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c
index a571d15c13..3046bd4bc9 100644
--- a/xen/arch/x86/hvm/dom0_build.c
+++ b/xen/arch/x86/hvm/dom0_build.c
@@ -421,7 +421,7 @@ static int __init pvh_setup_p2m(struct domain *d)
     pvh_setup_e820(d, nr_pages);
     do {
         preempted = false;
-        paging_set_allocation(d, dom0_paging_pages(d, nr_pages),
+        paging_set_allocation(d, dom0_paging_pages(d, nr_pages) * 10,
                               &preempted);
         process_pending_softirqs();
     } while ( preempted );
diff --git a/xen/arch/x86/pv/domain.c b/xen/arch/x86/pv/domain.c
index 7e84b04082..027bcf4eb0 100644
--- a/xen/arch/x86/pv/domain.c
+++ b/xen/arch/x86/pv/domain.c
@@ -313,9 +313,6 @@ static void _toggle_guest_pt(struct vcpu *v)
     if ( !(v->arch.flags & TF_kernel_mode) )
         return;
 
-    if ( v->arch.pv.need_update_runstate_area && update_runstate_area(v) )
-        v->arch.pv.need_update_runstate_area = 0;
-
     if ( v->arch.pv.pending_system_time.version &&
          update_secondary_system_time(v, &v->arch.pv.pending_system_time) )
         v->arch.pv.pending_system_time.version = 0;
diff --git a/xen/arch/x86/x86_64/domain.c b/xen/arch/x86/x86_64/domain.c
index c46dccc25a..9734cd39ef 100644
--- a/xen/arch/x86/x86_64/domain.c
+++ b/xen/arch/x86/x86_64/domain.c
@@ -22,34 +22,21 @@ arch_compat_vcpu_op(
     {
     case VCPUOP_register_runstate_memory_area:
     {
-        struct compat_vcpu_register_runstate_memory_area area;
-        struct compat_vcpu_runstate_info info;
-
-        area.addr.p = 0;
+        union {
+            struct compat_vcpu_register_runstate_memory_area compat;
+            struct vcpu_register_runstate_memory_area native;
+        } area = { };
 
         rc = -EFAULT;
-        if ( copy_from_guest(&area.addr.h, arg, 1) )
+        if ( copy_from_guest(&area.compat.addr.v, arg, 1) )
             break;
 
-        if ( area.addr.h.c != area.addr.p ||
-             !compat_handle_okay(area.addr.h, 1) )
+        unmap_runstate_area(v);
+        rc = map_runstate_area(v, &area.native);
+        if ( rc )
             break;
 
-        rc = 0;
-        guest_from_compat_handle(v->runstate_guest.compat, area.addr.h);
-
-        if ( v == current )
-        {
-            XLAT_vcpu_runstate_info(&info, &v->runstate);
-        }
-        else
-        {
-            struct vcpu_runstate_info runstate;
-
-            vcpu_runstate_get(v, &runstate);
-            XLAT_vcpu_runstate_info(&info, &runstate);
-        }
-        __copy_to_guest(v->runstate_guest.compat, &info, 1);
+        update_runstate_area(v);
 
         break;
     }
diff --git a/xen/common/domain.c b/xen/common/domain.c
index c623daec56..4a4c4a414d 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -1177,7 +1177,7 @@ int domain_soft_reset(struct domain *d)
 
     for_each_vcpu ( d, v )
     {
-        set_xen_guest_handle(runstate_guest(v), NULL);
+        unmap_runstate_area(v);
         unmap_vcpu_info(v);
     }
 
@@ -1318,6 +1318,84 @@ void unmap_vcpu_info(struct vcpu *v)
     put_page_and_type(mfn_to_page(mfn));
 }
 
+int map_runstate_area(struct vcpu *v,
+                      struct vcpu_register_runstate_memory_area *area)
+{
+    unsigned long offset;
+    unsigned int i;
+    struct domain *d = v->domain;
+    size_t size =
+#ifdef CONFIG_COMPAT
+        has_32bit_shinfo(v->domain) ? sizeof(*v->compat_runstate_guest) :
+#endif
+                                        sizeof(*v->runstate_guest);
+
+    if ( v->runstate_guest )
+    {
+        ASSERT_UNREACHABLE();
+        return -EBUSY;
+    }
+
+    offset = area->addr.p & ~PAGE_MASK;
+
+    for ( i = 0; i < ARRAY_SIZE(v->runstate_mfn); i++ )
+    {
+        p2m_type_t t;
+        uint32_t pfec = PFEC_page_present;
+        gfn_t gfn = _gfn(paging_gva_to_gfn(v, area->addr.p, &pfec));
+        struct page_info *pg;
+
+        if ( gfn_eq(gfn, INVALID_GFN) )
+            return -EFAULT;
+
+        v->runstate_mfn[i] = get_gfn(d, gfn_x(gfn), &t);
+        if ( t != p2m_ram_rw || mfn_eq(v->runstate_mfn[i], INVALID_MFN) )
+            return -EFAULT;
+
+        pg = mfn_to_page(v->runstate_mfn[i]);
+        if ( !pg || !get_page_type(pg, PGT_writable_page) )
+        {
+            put_gfn(d, gfn_x(gfn));
+            return -EFAULT;
+        }
+
+        put_gfn(d, gfn_x(gfn));
+
+        if ( PFN_DOWN(area->addr.p + size - 1) == PFN_DOWN(area->addr.p) )
+            break;
+
+        area->addr.p += PAGE_SIZE - offset;
+    }
+
+    v->runstate_nr = i + 1;
+
+    v->runstate_guest = vmap(v->runstate_mfn, v->runstate_nr);
+    if ( !v->runstate_guest )
+    {
+        for ( i = 0; i < v->runstate_nr; i++ )
+            put_page_and_type(mfn_to_page(v->runstate_mfn[i]));
+        return -EFAULT;
+    }
+    v->runstate_guest = (void *)v->runstate_guest + offset;
+
+    return 0;
+}
+
+void unmap_runstate_area(struct vcpu *v)
+{
+    unsigned int i;
+
+    if ( !v->runstate_guest )
+        return;
+
+    vunmap((void *)((unsigned long)v->runstate_guest & PAGE_MASK));
+    for ( i = 0; i < v->runstate_nr; i++ )
+        put_page_and_type(mfn_to_page(v->runstate_mfn[i]));
+
+    v->runstate_guest = NULL;
+    v->runstate_nr = 0;
+}
+
 int default_initialise_vcpu(struct vcpu *v, XEN_GUEST_HANDLE_PARAM(void) arg)
 {
     struct vcpu_guest_context *ctxt;
@@ -1495,27 +1573,17 @@ long do_vcpu_op(int cmd, unsigned int vcpuid, XEN_GUEST_HANDLE_PARAM(void) arg)
     case VCPUOP_register_runstate_memory_area:
     {
         struct vcpu_register_runstate_memory_area area;
-        struct vcpu_runstate_info runstate;
 
         rc = -EFAULT;
         if ( copy_from_guest(&area, arg, 1) )
             break;
 
-        if ( !guest_handle_okay(area.addr.h, 1) )
+        unmap_runstate_area(v);
+        rc = map_runstate_area(v, &area);
+        if ( rc )
             break;
 
-        rc = 0;
-        runstate_guest(v) = area.addr.h;
-
-        if ( v == current )
-        {
-            __copy_to_guest(runstate_guest(v), &v->runstate, 1);
-        }
-        else
-        {
-            vcpu_runstate_get(v, &runstate);
-            __copy_to_guest(runstate_guest(v), &runstate, 1);
-        }
+        update_runstate_area(v);
 
         break;
     }
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index 277f99f633..70b3aeb9a8 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -554,7 +554,6 @@ struct pv_vcpu
     uint32_t dr7_emul;
 
     /* Deferred VA-based update state. */
-    bool_t need_update_runstate_area;
     struct vcpu_time_info pending_system_time;
 };
 
@@ -646,7 +645,7 @@ struct guest_memory_policy
 void update_guest_memory_policy(struct vcpu *v,
                                 struct guest_memory_policy *policy);
 
-bool update_runstate_area(struct vcpu *);
+void update_runstate_area(struct vcpu *);
 bool update_secondary_system_time(struct vcpu *,
                                   struct vcpu_time_info *);
 
diff --git a/xen/include/xen/domain.h b/xen/include/xen/domain.h
index d1bfc82f57..090a54df97 100644
--- a/xen/include/xen/domain.h
+++ b/xen/include/xen/domain.h
@@ -118,4 +118,9 @@ struct vnuma_info {
 
 void vnuma_destroy(struct vnuma_info *vnuma);
 
+struct vcpu_register_runstate_memory_area;
+int map_runstate_area(struct vcpu *v,
+                      struct vcpu_register_runstate_memory_area *area);
+void unmap_runstate_area(struct vcpu *v);
+
 #endif /* __XEN_DOMAIN_H__ */
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 86f15b11e0..cb4bec9e0c 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -163,15 +163,15 @@ struct vcpu
     void            *sched_priv;    /* scheduler-specific data */
 
     struct vcpu_runstate_info runstate;
+    mfn_t runstate_mfn[2];
+    unsigned int runstate_nr;
 #ifndef CONFIG_COMPAT
-# define runstate_guest(v) ((v)->runstate_guest)
-    XEN_GUEST_HANDLE(vcpu_runstate_info_t) runstate_guest; /* guest address */
+    struct vcpu_runstate_info *runstate_guest;
 #else
-# define runstate_guest(v) ((v)->runstate_guest.native)
     union {
-        XEN_GUEST_HANDLE(vcpu_runstate_info_t) native;
-        XEN_GUEST_HANDLE(vcpu_runstate_info_compat_t) compat;
-    } runstate_guest; /* guest address */
+        struct vcpu_runstate_info *runstate_guest;
+        struct compat_vcpu_runstate_info *compat_runstate_guest;
+    };
 #endif
 
     /* last time when vCPU is scheduled out */


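For completeness, the XEN_RUNSTATE_UPDATE handling above is the hypervisor
half of a seqlock-style protocol; the guest-side read can be sketched as
below (simplified standalone code, not the actual Linux implementation; the
struct layout and flag value follow my reading of the public vcpu.h, and
the real guest code also needs proper barriers):

```c
#include <stdint.h>
#include <string.h>

#define XEN_RUNSTATE_UPDATE (UINT64_C(1) << 63) /* top bit of state_entry_time */

struct vcpu_runstate_info {
    int      state;
    uint64_t state_entry_time;
    uint64_t time[4];
};

/*
 * Seqlock-style consistent snapshot: retry while the hypervisor has the
 * UPDATE bit set, or while state_entry_time changed under our feet.
 */
static struct vcpu_runstate_info
read_runstate(const volatile struct vcpu_runstate_info *area)
{
    struct vcpu_runstate_info snap;
    uint64_t t;

    do {
        t = area->state_entry_time;
        memcpy(&snap, (const void *)area, sizeof(snap));
    } while ( (snap.state_entry_time & XEN_RUNSTATE_UPDATE) ||
              snap.state_entry_time != t );

    snap.state_entry_time &= ~XEN_RUNSTATE_UPDATE;
    return snap;
}
```

The point of mapping the area permanently in Xen is that both halves of
this protocol keep working regardless of which guest page tables happen to
be active at context-switch time.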
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 

