
Re: [Xen-devel] [PATCH v4 12/27] ARM: vGICv3: handle virtual LPI pending and property tables



Hi,

I posted these comments on v3 by mistake, so I am duplicating them here.

On 03/04/17 21:28, Andre Przywara wrote:
Allow a guest to provide the address and size for the memory regions
it has reserved for the GICv3 pending and property tables.
We sanitise the various fields of the respective redistributor
registers and map those pages into Xen's address space to have easy
access.

Signed-off-by: Andre Przywara <andre.przywara@xxxxxxx>
---
 xen/arch/arm/vgic-v3.c       | 136 +++++++++++++++++++++++++++++++++++++------
 xen/common/memory.c          |  61 +++++++++++++++++++
 xen/include/asm-arm/domain.h |   6 +-
 xen/include/asm-arm/vgic.h   |   2 +
 xen/include/xen/mm.h         |   8 +++
 5 files changed, 195 insertions(+), 18 deletions(-)

diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
index 797fd86..2c6b317 100644
--- a/xen/arch/arm/vgic-v3.c
+++ b/xen/arch/arm/vgic-v3.c
@@ -19,12 +19,14 @@
  */

 #include <xen/bitops.h>
+#include <xen/domain_page.h>
 #include <xen/lib.h>
 #include <xen/init.h>
 #include <xen/softirq.h>
 #include <xen/irq.h>
 #include <xen/sched.h>
 #include <xen/sizes.h>
+#include <xen/vmap.h>
 #include <asm/current.h>
 #include <asm/mmio.h>
 #include <asm/gic_v3_defs.h>
@@ -228,12 +230,15 @@ static int __vgic_v3_rdistr_rd_mmio_read(struct vcpu *v, mmio_info_t *info,
         goto read_reserved;

     case VREG64(GICR_PROPBASER):
-        /* LPI's not implemented */
-        goto read_as_zero_64;
+        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
+        *r = vgic_reg64_extract(v->domain->arch.vgic.rdist_propbase, info);

vgic_reg64_extract will likely be turned into a series of non-atomic operations. So how do you prevent a torn read when another vCPU updates the register concurrently? (One possible fix is sketched after this hunk.)

+        return 1;

     case VREG64(GICR_PENDBASER):
-        /* LPI's not implemented */
-        goto read_as_zero_64;
+        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
+        *r = vgic_reg64_extract(v->arch.vgic.rdist_pendbase, info);

Ditto.

+        *r &= ~GICR_PENDBASER_PTZ;       /* WO, reads as 0 */
+        return 1;
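
One way to avoid a torn read would be to snapshot the whole 64-bit value first and only then extract from it. A rough, untested sketch, assuming a 64-bit read_atomic() is usable here (on arm32 that may need an ldrd-based helper):

    case VREG64(GICR_PROPBASER):
        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
        /*
         * Snapshot the register in one access so that a concurrent
         * writer cannot be observed half-way through an update.
         */
        *r = vgic_reg64_extract(read_atomic(&v->domain->arch.vgic.rdist_propbase),
                                info);
        return 1;

The same applies to GICR_PENDBASER, with write_atomic() on the store side.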

[...]

 static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
                                           uint32_t gicr_reg,
                                           register_t r)
 {
     struct hsr_dabt dabt = info->dabt;
+    uint64_t reg;

     switch ( gicr_reg )
     {
@@ -393,36 +477,54 @@ static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
         goto write_impl_defined;

     case VREG64(GICR_SETLPIR):
-        /* LPI is not implemented */
+        /* LPIs without an ITS are not implemented */
         goto write_ignore_64;

     case VREG64(GICR_CLRLPIR):
-        /* LPI is not implemented */
+        /* LPIs without an ITS are not implemented */
         goto write_ignore_64;

     case 0x0050:
         goto write_reserved;

     case VREG64(GICR_PROPBASER):
-        /* LPI is not implemented */
-        goto write_ignore_64;
+        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
+
+        /* Writing PROPBASER with LPIs enabled is UNPREDICTABLE. */
+        if ( v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED )

v will point to the vCPU associated with the re-distributor. However, this register can be written from any vCPU, so I think there is a tiny race where the update to v->arch.vgic.flags may not yet be visible, and rdist_propbase could therefore still be written after LPIs have been enabled.

+            return 1;
+
+        reg = v->domain->arch.vgic.rdist_propbase;
+        vgic_reg64_update(&reg, r, info);
+        reg = sanitize_propbaser(reg);
+        v->domain->arch.vgic.rdist_propbase = reg;

This code can be called concurrently, and I don't think it will behave well in that case. (One way to serialise the update is sketched after this hunk.)

+        return 1;
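
One way to close both races would be to take the domain's vgic lock around the update and re-check the enable bit with the lock held. A sketch, assuming the path that sets VGIC_V3_LPIS_ENABLED takes the same lock:

    case VREG64(GICR_PROPBASER):
        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;

        spin_lock(&v->domain->arch.vgic.lock);

        /*
         * Re-check with the lock held so that a concurrent LPI enable
         * cannot race with the update below, and so that two vCPUs
         * writing PROPBASER at the same time are serialised.
         */
        if ( !(v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED) )
        {
            reg = v->domain->arch.vgic.rdist_propbase;
            vgic_reg64_update(&reg, r, info);
            reg = sanitize_propbaser(reg);
            v->domain->arch.vgic.rdist_propbase = reg;
        }

        spin_unlock(&v->domain->arch.vgic.lock);
        return 1;

The same pattern would apply to GICR_PENDBASER below.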

     case VREG64(GICR_PENDBASER):
-        /* LPI is not implemented */
-        goto write_ignore_64;
+        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
+
+        /* Writing PENDBASER with LPIs enabled is UNPREDICTABLE. */
+        if ( v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED )

Ditto

+            return 1;
+
+        reg = v->arch.vgic.rdist_pendbase;
+        vgic_reg64_update(&reg, r, info);
+        reg = sanitize_pendbaser(reg);
+        v->arch.vgic.rdist_pendbase = reg;

Ditto.

[...]

diff --git a/xen/common/memory.c b/xen/common/memory.c
index ad0b33c..3ca3f60 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c

Any modification to common code should be in a separate patch and have the appropriate maintainers CCed.

@@ -1418,6 +1418,67 @@ int prepare_ring_for_helper(
 }

 /*
+ * Mark a given number of guest pages as used (by increasing their refcount),
+ * starting with the given guest address. This needs to be called once before
+ * calling (possibly repeatedly) map_one_guest_pages().
+ * Before the domain gets destroyed, call put_guest_pages() to drop the
+ * reference.
+ */

I was hoping you would have taken my earlier comments into account when writing these new functions, but it seems they were ignored :/. It feels like a waste of my time to repeat the same comments again and again.

Both get_guest_pages and put_guest_pages are wrong because they assume the p2m mapping will never change. This is not a safe assumption, and I asked for a plan forward on that, which seems to have been skipped too...

+int get_guest_pages(struct domain *d, paddr_t gpa, unsigned int nr_pages)

Please use the type gfn_t rather than paddr_t.

+{
+    unsigned int i;
+    struct page_info *page;
+
+    for ( i = 0; i < nr_pages; i++ )
+    {
+        page = get_page_from_gfn(d, (gpa >> PAGE_SHIFT) + i, NULL, P2M_ALLOC);

get_page_from_gfn may return a foreign page (i.e. one belonging to another domain), and we don't want to use that kind of page for ITS memory. (See the sketch after the quoted function.)

+        if ( !page )
+        {
+            /* Make sure we drop the references of pages we got so far. */
+            put_guest_pages(d, gpa, i);
+            return -EINVAL;
+        }
+    }
+
+    return 0;
+}
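
To reject foreign (and other non-RAM) pages, something along these lines should do. A sketch, assuming p2m_is_ram() covers exactly the types we want to accept:

        p2m_type_t t;

        page = get_page_from_gfn(d, (gpa >> PAGE_SHIFT) + i, &t, P2M_ALLOC);
        /*
         * Only plain guest RAM may back the LPI tables; in particular
         * refuse pages mapped from another domain.
         */
        if ( page && !p2m_is_ram(t) )
        {
            put_page(page);
            page = NULL;
        }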
+
+void put_guest_pages(struct domain *d, paddr_t gpa, unsigned int nr_pages)

Same comments as above.

+{
+    mfn_t mfn;
+    int i;

unsigned int

+
+    p2m_read_lock(&d->arch.p2m);
+    for ( i = 0; i < nr_pages; i++ )
+    {
+        mfn = p2m_get_entry(&d->arch.p2m, _gfn((gpa >> PAGE_SHIFT) + i),
+                            NULL, NULL, NULL);
+        if ( mfn_eq(mfn, INVALID_MFN) )
+            continue;
+        put_page(mfn_to_page(mfn_x(mfn)));

This function is completely wrong in its current state. You assume that the stage-2 page table has not been modified by the guest between get_guest_pages and put_guest_pages. If it has been modified, you may drop a reference on the wrong page.

Furthermore, it is likely an error for the MFN to be invalid at this point.

As we discussed earlier, the way forward is to protect the pages. It is not mandatory for DOM0, but a comment in the code is necessary to explain what is missing. (A way to avoid the p2m re-walk entirely is sketched after the quoted function.)

+    }
+    p2m_read_unlock(&d->arch.p2m);
+}
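
One way out is to remember the struct page_info pointers taken at get time and release exactly those, instead of re-walking the p2m at put time. A rough sketch, also switching to gfn_t as asked above:

int get_guest_pages(struct domain *d, gfn_t gfn, unsigned int nr_pages,
                    struct page_info **pages)
{
    unsigned int i;

    for ( i = 0; i < nr_pages; i++ )
    {
        pages[i] = get_page_from_gfn(d, gfn_x(gfn) + i, NULL, P2M_ALLOC);
        if ( !pages[i] )
        {
            /* Drop the references taken so far. */
            while ( i-- )
                put_page(pages[i]);
            return -EINVAL;
        }
    }

    return 0;
}

void put_guest_pages(struct page_info **pages, unsigned int nr_pages)
{
    unsigned int i;

    for ( i = 0; i < nr_pages; i++ )
        put_page(pages[i]);
}

With that, a guest modifying its stage-2 mapping in between can no longer make us drop a reference on the wrong page.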
+
+/*
+ * Provides easy access to guest memory by "mapping" one page of it into
+ * Xen's VA space. In fact it relies on the memory being already mapped
+ * and just provides a pointer to it.
+ */
+void *map_one_guest_page(struct domain *d, paddr_t guest_addr)
+{

This might be correct for DOM0 but will not work for guests. Even if you don't support guests right now, we should really avoid such assumptions in the code. Removing them later will likely mean quite a lot of rework, which I'd like to see now.

Looking at how you use this, it would make more sense to have a helper that copies from guest memory into a buffer; this is not the first time I am suggesting that. I think it would also avoid having to protect the guest memory.

For an example of what I mean, have a look at the vITS series sent by Cavium a year ago (a rough sketch also follows the quoted helpers below):

https://patchwork.kernel.org/patch/8177251/

+    void *ptr = map_domain_page(_mfn(guest_addr >> PAGE_SHIFT));
+
+    return ptr + (guest_addr & ~PAGE_MASK);
+}
+
+/* "Unmap" a previously mapped guest page. Could be optimized away. */
+void unmap_one_guest_page(void *va)
+{
+    unmap_domain_page(((uintptr_t)va & PAGE_MASK));
+}
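
To make the suggestion concrete, here is roughly the shape of the copy helper I have in mind (the name is made up and error handling is simplified). Note that each page reference is only held for the duration of the copy, so nothing has to stay mapped or be protected afterwards:

int vgic_copy_from_guest(struct domain *d, paddr_t gpa, void *buf, size_t len)
{
    while ( len )
    {
        unsigned int offset = gpa & ~PAGE_MASK;
        size_t size = min(len, (size_t)PAGE_SIZE - offset);
        struct page_info *page;
        void *p;

        page = get_page_from_gfn(d, gpa >> PAGE_SHIFT, NULL, P2M_ALLOC);
        if ( !page )
            return -EINVAL;

        /* Map the page transiently, copy the chunk out, drop it again. */
        p = __map_domain_page(page);
        memcpy(buf, p + offset, size);
        unmap_domain_page(p);
        put_page(page);

        gpa += size;
        buf += size;
        len -= size;
    }

    return 0;
}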
+
+/*
  * Local variables:
  * mode: C
  * c-file-style: "BSD"
diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
index 6ee7538..f460457 100644
--- a/xen/include/asm-arm/domain.h
+++ b/xen/include/asm-arm/domain.h
@@ -109,6 +109,8 @@ struct arch_domain
         } *rdist_regions;
         int nr_regions;                     /* Number of rdist regions */
         uint32_t rdist_stride;              /* Re-Distributor stride */
+        unsigned int nr_lpis;

You switched to "unsigned long" in the gic-v3-its code but still keep "unsigned int" here. Why is that?

[...]

+        uint64_t rdist_propbase;
         struct rb_root its_devices;         /* Devices mapped to an ITS */
         spinlock_t its_devices_lock;        /* Protects the its_devices tree */
         struct radix_tree_root pend_lpi_tree; /* Stores struct pending_irq's */
@@ -256,7 +258,9 @@ struct arch_vcpu

         /* GICv3: redistributor base and flags for this vCPU */
         paddr_t rdist_base;
-#define VGIC_V3_RDIST_LAST  (1 << 0)        /* last vCPU of the rdist */
+        uint64_t rdist_pendbase;
+#define VGIC_V3_RDIST_LAST      (1 << 0)        /* last vCPU of the rdist */
+#define VGIC_V3_LPIS_ENABLED    (1 << 1)
         uint8_t flags;
     } vgic;

diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
index a24a971..b5ae3e9 100644
--- a/xen/include/asm-arm/vgic.h
+++ b/xen/include/asm-arm/vgic.h
@@ -312,6 +312,8 @@ extern void register_vgic_ops(struct domain *d, const struct vgic_ops *ops);
 int vgic_v2_init(struct domain *d, int *mmio_count);
 int vgic_v3_init(struct domain *d, int *mmio_count);

+extern int vgic_lpi_get_priority(struct domain *d, uint32_t vlpi);

The prototype should be added in the same patch as the definition, so please move it to patch #10.

Cheers,

--
Julien Grall
