
Re: [Xen-devel] [PATCH 08/28] ARM: GICv3 ITS: introduce host LPI array



Hi Andre,

On 30/01/2017 18:31, Andre Przywara wrote:
The number of LPIs on a host can be potentially huge (millions),
although in practice it will be mostly reasonable. So prematurely allocating
an array of struct irq_desc's for each LPI is not an option.
However Xen itself does not care about LPIs, as every LPI will be injected
into a guest (Dom0 for now).
Create a dense data structure (8 Bytes) for each LPI which holds just
enough information to determine the virtual IRQ number and the VCPU into
which the LPI needs to be injected.
Also to not artificially limit the number of LPIs, we create a 2-level
table for holding those structures.
This patch introduces functions to initialize these tables and to
create, lookup and destroy entries for a given LPI.
We allocate and access LPI information in a way that does not require
a lock.

Signed-off-by: Andre Przywara <andre.przywara@xxxxxxx>
---
 xen/arch/arm/gic-v3-its.c        |  80 ++++++++++++++++-
 xen/arch/arm/gic-v3-lpi.c        | 187 ++++++++++++++++++++++++++++++++++++++-
 xen/include/asm-arm/atomic.h     |   6 +-
 xen/include/asm-arm/gic.h        |   5 ++
 xen/include/asm-arm/gic_v3_its.h |   9 ++
 5 files changed, 282 insertions(+), 5 deletions(-)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index 4a3a394..f073ab5 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -83,6 +83,20 @@ static int its_send_cmd_sync(struct host_its *its, int cpu)
     return its_send_command(its, cmd);
 }

+static int its_send_cmd_mapti(struct host_its *its,
+                              uint32_t deviceid, uint32_t eventid,
+                              uint32_t pintid, uint16_t icid)
+{
+    uint64_t cmd[4];
+
+    cmd[0] = GITS_CMD_MAPTI | ((uint64_t)deviceid << 32);
+    cmd[1] = eventid | ((uint64_t)pintid << 32);
+    cmd[2] = icid;
+    cmd[3] = 0x00;
+
+    return its_send_command(its, cmd);
+}
+
 static int its_send_cmd_mapc(struct host_its *its, int collection_id, int cpu)
 {
     uint64_t cmd[4];
@@ -111,6 +125,19 @@ static int its_send_cmd_mapd(struct host_its *its, uint32_t deviceid,
     return its_send_command(its, cmd);
 }

+static int its_send_cmd_inv(struct host_its *its,
+                            uint32_t deviceid, uint32_t eventid)
+{
+    uint64_t cmd[4];
+
+    cmd[0] = GITS_CMD_INV | ((uint64_t)deviceid << 32);
+    cmd[1] = eventid;
+    cmd[2] = 0x00;
+    cmd[3] = 0x00;
+
+    return its_send_command(its, cmd);
+}
+
 /* Set up the (1:1) collection mapping for the given host CPU. */
 int gicv3_its_setup_collection(int cpu)
 {
@@ -359,13 +386,47 @@ int gicv3_its_init(struct host_its *hw_its)

 static void remove_mapped_guest_device(struct its_devices *dev)
 {
+    int i;
+
     if ( dev->hw_its )
         its_send_cmd_mapd(dev->hw_its, dev->host_devid, 0, 0, false);

+    for ( i = 0; i < dev->eventids / 32; i++ )

Please use LPI_BLOCK rather than 32.

+        gicv3_free_host_lpi_block(dev->hw_its, dev->host_lpis[i]);

Without looking at the implementation of gicv3_free_host_lpi_block, I think the usage of the function is very confusing. When I read host_lpis, I expect to see one LPI per event, but instead it is the first LPI of a block. The lack of documentation of the field in its_devices does not help to understand what's going on.

So please add some documentation and probably rename some fields.

Also, the function can return an error but you don't check it. Please make sure to verify the return value.

Lastly, shouldn't we discard the LPIs before removing the device? Or does MAPD take care of that for you?

+
     xfree(dev->itt_addr);
+    xfree(dev->host_lpis);

I forgot to mention this in the previous patch. You free dev->itt_addr and dev->host_lpis without even waiting for the ITS to handle the command. This is a really bad idea, as Xen could re-allocate the memory to someone else as soon as xfree has finished.

     xfree(dev);
 }

+/*
+ * On the host ITS @its, map @nr_events consecutive LPIs.
+ * The mapping connects a device @devid and event @eventid pair to LPI @lpi,
+ * increasing both @eventid and @lpi to cover the number of requested LPIs.
+ */
+int gicv3_its_map_host_events(struct host_its *its,
+                              int devid, int eventid, int lpi,
+                              int nr_events)

All those fields should at least be unsigned int. For devid, I would use uint32_t.

In general, anything that does not need to be signed should be unsigned int. Similarly, if we deal with a hardware value, the type should be uintXX_t. This makes it easier to match the hardware and avoids potential issues later.
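For illustration, the prototype could look like this (just a sketch of the suggested types, not a requirement):

    int gicv3_its_map_host_events(struct host_its *its,
                                  uint32_t devid, uint32_t eventid,
                                  uint32_t lpi, unsigned int nr_events);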

+{
+    int i, ret;

i should be unsigned int.

+
+    for ( i = 0; i < nr_events; i++ )
+    {
+        ret = its_send_cmd_mapti(its, devid, eventid + i, lpi + i, 0);

A comment explaining what 0 stands for would be really helpful. Something along these lines: "All interrupts are mapped to CPU0 (i.e. collection 0) by default".
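I.e. something like this (only the comment is new):

        /* Map everything to CPU0 (i.e. collection 0) for now. */
        ret = its_send_cmd_mapti(its, devid, eventid + i, lpi + i, 0);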

+        if ( ret )
+            return ret;
+        ret = its_send_cmd_inv(its, devid, eventid + i);

So the spec allows up to 32K events per device. As all the LPIs will be routed to CPU0 (i.e. collection 0), it would be more efficient to do an INVALL.

Furthermore, what if the queue is full? AFAIU, you will return an error but it is not propagated. So Xen will think the device has been mapped even though it is not true.

I think we need to have a plan here, as this is likely to happen if a device has many MSIs and/or the queue is nearly full.
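For the INVALL part, a minimal sketch of what I have in mind, assuming a GITS_CMD_INVALL definition (0x0d in the GIC spec) gets added next to the other GITS_CMD_* constants; the function name here is just my suggestion:

    static int its_send_cmd_invall(struct host_its *its, int cid)
    {
        uint64_t cmd[4];

        /* INVALL invalidates the configuration caches for a whole collection. */
        cmd[0] = GITS_CMD_INVALL;
        cmd[1] = 0x00;
        cmd[2] = cid & 0xffff;
        cmd[3] = 0x00;

        return its_send_command(its, cmd);
    }

This could then be issued once after the loop instead of one INV per event.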

+        if ( ret )
+            return ret;
+    }
+
+    ret = its_send_cmd_sync(its, 0);
+    if ( ret )
+        return ret;
+
+    return 0;

the "if ( ret ) return ret; return 0; could be simplified with return ret;

+}
+
 int gicv3_its_map_guest_device(struct domain *d, int host_devid,
                                int guest_devid, int bits, bool valid)
 {
@@ -373,7 +434,7 @@ int gicv3_its_map_guest_device(struct domain *d, int host_devid,
     struct its_devices *dev, *temp;
     struct rb_node **new = &d->arch.vgic.its_devices.rb_node, *parent = NULL;
     struct host_its *hw_its;
-    int ret;
+    int ret, i;

i should be unsigned.


     /* check for already existing mappings */
     spin_lock(&d->arch.vgic.its_devices_lock);
@@ -430,10 +491,19 @@ int gicv3_its_map_guest_device(struct domain *d, int host_devid,
         goto out_unlock;
     }

+    dev->host_lpis = xzalloc_array(uint32_t, BIT(bits) / 32);

I cannot find any code making sure the number of events is aligned to 32.

BTW, the 32 should really be LPI_BLOCK.
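A sketch of what I mean, rounding the number of events up to a block boundary (nr_events here is a hypothetical local variable, ROUNDUP as in xen/lib.h):

    nr_events = ROUNDUP(BIT(bits), LPI_BLOCK);
    dev->host_lpis = xzalloc_array(uint32_t, nr_events / LPI_BLOCK);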

+    if ( !dev->host_lpis )
+    {
+        xfree(dev);
+        xfree(itt_addr);
+        return -ENOMEM;
+    }
+
     ret = its_send_cmd_mapd(hw_its, host_devid, bits - 1,
                             virt_to_maddr(itt_addr), true);
     if ( ret )
     {
+        xfree(dev->host_lpis);
         xfree(itt_addr);
         xfree(dev);
         goto out_unlock;
@@ -450,6 +520,14 @@ int gicv3_its_map_guest_device(struct domain *d, int host_devid,

     spin_unlock(&d->arch.vgic.its_devices_lock);

+    /*
+     * Map all host LPIs within this device already. We can't afford to queue
+     * any host ITS commands later on during the guest's runtime.
+     */
+    for ( i = 0; i < BIT(bits) / 32; i++ )

Ditto.

+        dev->host_lpis[i] = gicv3_allocate_host_lpi_block(hw_its, d, host_devid,
+                                                          i * 32);

Ditto.

Also, gicv3_allocate_host_lpi_block could return an error if, for instance, there is not enough LPIs. So please check the return value.
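A rough sketch of the kind of check I would expect (the error label and the exact unwinding, i.e. unmapping the device and freeing earlier blocks, are hypothetical and left to you):

    for ( i = 0; i < BIT(bits) / LPI_BLOCK; i++ )
    {
        ret = gicv3_allocate_host_lpi_block(hw_its, d, host_devid,
                                            i * LPI_BLOCK);
        if ( ret < 0 )
            goto out_free;      /* hypothetical error path */
        dev->host_lpis[i] = ret;
    }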

+
     return 0;

 out_unlock:
diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
index 5911b91..8f6e7f3 100644
--- a/xen/arch/arm/gic-v3-lpi.c
+++ b/xen/arch/arm/gic-v3-lpi.c
@@ -18,16 +18,34 @@

 #include <xen/config.h>
 #include <xen/lib.h>
-#include <xen/mm.h>

Why has mm.h been dropped?

+#include <xen/sched.h>
+#include <xen/err.h>
+#include <xen/sched.h>
 #include <xen/sizes.h>
+#include <asm/atomic.h>
+#include <asm/domain.h>
+#include <asm/io.h>
 #include <asm/gic.h>
 #include <asm/gic_v3_defs.h>
 #include <asm/gic_v3_its.h>

+/* LPIs on the host always go to a guest, so no struct irq_desc for them. */
+union host_lpi {
+    uint64_t data;
+    struct {
+        uint32_t virt_lpi;
+        uint16_t dom_id;
+        uint16_t vcpu_id;
+    };
+};

I think Stefano requested some documentation on this field.
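Something along these lines would already help (one possible wording):

    /*
     * Per-LPI bookkeeping: every host LPI is owned by exactly one domain and
     * gets injected as "virt_lpi" into "vcpu_id" of that domain. The whole
     * entry fits in 64 bits so it can be read and written atomically.
     */
    union host_lpi {
        uint64_t data;
        struct {
            uint32_t virt_lpi;      /* virtual LPI to inject, INVALID_LPI if unassigned */
            uint16_t dom_id;        /* owning domain, INVALID_DOMID if the entry is free */
            uint16_t vcpu_id;       /* VCPU the LPI is routed to */
        };
    };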

+
 /* Global state */
 static struct {
     uint8_t *lpi_property;
+    union host_lpi **host_lpis;

It would be really helpful to have documentation in the code explaining what host_lpis is supposed to contain.

     unsigned int host_lpi_bits;
+    /* Protects allocation and deallocation of host LPIs, but not the access */
+    spinlock_t host_lpis_lock;
 } lpi_data;

 /* Physical redistributor address */
@@ -38,6 +56,19 @@ static DEFINE_PER_CPU(int, redist_id);
 static DEFINE_PER_CPU(void *, pending_table);

 #define MAX_PHYS_LPIS   (BIT_ULL(lpi_data.host_lpi_bits) - LPI_OFFSET)
+#define HOST_LPIS_PER_PAGE      (PAGE_SIZE / sizeof(union host_lpi))
+
+static union host_lpi *gic_get_host_lpi(uint32_t plpi)
+{
+    if ( !is_lpi(plpi) || plpi >= MAX_PHYS_LPIS + LPI_OFFSET )
+        return NULL;
+
+    plpi -= LPI_OFFSET;
+    if ( !lpi_data.host_lpis[plpi / HOST_LPIS_PER_PAGE] )
+        return NULL;
+
+    return &lpi_data.host_lpis[plpi / HOST_LPIS_PER_PAGE][plpi % HOST_LPIS_PER_PAGE];
+}

 /* Stores this redistributor's physical address and ID in a per-CPU variable */
 void gicv3_set_redist_address(paddr_t address, int redist_id)
@@ -130,15 +161,169 @@ uint64_t gicv3_lpi_get_proptable(void)
 static unsigned int max_lpi_bits = CONFIG_MAX_PHYS_LPI_BITS;
 integer_param("max_lpi_bits", max_lpi_bits);

+/*
+ * Allocate the 2nd level array for host LPIs. This one holds pointers
+ * to the page with the actual "union host_lpi" entries. Our LPI limit
+ * avoids excessive memory usage.
+ */
 int gicv3_lpi_init_host_lpis(unsigned int hw_lpi_bits)
 {
+    int nr_lpi_ptrs;

unsigned int

+
+    BUILD_BUG_ON(sizeof(union host_lpi) > sizeof(unsigned long));

Why this BUILD_BUG_ON? Is it because you want to make sure reads are atomic? If so, please add a comment.
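Something as simple as this would do:

    /*
     * Each host_lpi entry must fit in one machine word so that it can be
     * read and updated with a single atomic access, without locking.
     */
    BUILD_BUG_ON(sizeof(union host_lpi) > sizeof(unsigned long));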

+
     lpi_data.host_lpi_bits = min(hw_lpi_bits, max_lpi_bits);

+    spin_lock_init(&lpi_data.host_lpis_lock);
+
+    nr_lpi_ptrs = MAX_PHYS_LPIS / (PAGE_SIZE / sizeof(union host_lpi));
+    lpi_data.host_lpis = xzalloc_array(union host_lpi *, nr_lpi_ptrs);
+    if ( !lpi_data.host_lpis )
+        return -ENOMEM;
+
     printk("GICv3: using at most %lld LPIs on the host.\n", MAX_PHYS_LPIS);

     return 0;
 }

+#define LPI_BLOCK       32

I think a comment on the reason behind the number 32 was requested by Stefano here. And I agree with him.

+
+/* Must be called with host_lpis_lock held. */

This is a call for adding an ASSERT in the function.

+static int find_unused_host_lpi(int start, uint32_t *index)

start should probably be unsigned.
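Combined with the ASSERT suggestion above, the opening of the function could look like this sketch (assuming spin_is_locked() here; the rest of the body stays as is):

    static int find_unused_host_lpi(unsigned int start, uint32_t *index)
    {
        unsigned int chunk;
        uint32_t i = *index;

        ASSERT(spin_is_locked(&lpi_data.host_lpis_lock));
        /* ... rest of the function unchanged ... */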

+{
+    int chunk;

Ditto.

+    uint32_t i = *index;
+
+    for ( chunk = start; chunk < MAX_PHYS_LPIS / HOST_LPIS_PER_PAGE; chunk++ )
+    {
+        /* If we hit an unallocated chunk, use entry 0 in that one. */
+        if ( !lpi_data.host_lpis[chunk] )
+        {
+            *index = 0;
+            return chunk;
+        }
+
+        /* Find an unallocated entry in this chunk. */
+        for ( ; i < HOST_LPIS_PER_PAGE; i += LPI_BLOCK )
+        {
+            if ( lpi_data.host_lpis[chunk][i].dom_id == INVALID_DOMID )
+            {
+                *index = i;
+                return chunk;
+            }
+        }
+        i = 0;
+    }
+
+    return -1;
+}
+
+/*
+ * Allocate a block of 32 LPIs on the given host ITS for device "devid",
+ * starting with "eventid". Put them into the respective ITT by issuing a
+ */
+int gicv3_allocate_host_lpi_block(struct host_its *its, struct domain *d,
+                                  uint32_t host_devid, uint32_t eventid)

Most of the parameters here are just for calling back into the ITS code, and the caller should have them all in hand. So I would much prefer to avoid a call chain ITS -> LPI -> ITS and make the LPI code ITS-agnostic.

Furthermore, the return value is used to return both an error and the LPI. Given that an LPI is encoded on 32 bits, sooner or later there will be a clash between the error codes and the LPI values. So it would be wise to dissociate the error code from the LPI.

The prototype of this function would look like:

int gicv3_allocate_host_lpi_block(struct domain *domain, uint32_t *first_lpi);


+{
+    static uint32_t next_lpi = 0;
+    uint32_t lpi, lpi_idx = next_lpi % HOST_LPIS_PER_PAGE;

lpi_idx will be overridden by find_unused_host_lpi, so you don't need to initialize it here.

This would also allow you to store the chunk rather than the next LPI, and drop the division in find_unused_host_lpi.

+    int chunk;
+    int i;
+
+    spin_lock(&lpi_data.host_lpis_lock);
+    chunk = find_unused_host_lpi(next_lpi / HOST_LPIS_PER_PAGE, &lpi_idx);
+
+    if ( chunk == - 1 )          /* rescan for a hole from the beginning */
+    {
+        lpi_idx = 0;
+        chunk = find_unused_host_lpi(0, &lpi_idx);
+        if ( chunk == -1 )
+        {
+            spin_unlock(&lpi_data.host_lpis_lock);
+            return -ENOSPC;
+        }
+    }

My understanding of the algorithm to find a chunk is that you will always try to allocate forward. So if the current chunk is full, you will allocate the next one rather than checking whether a previous chunk has space available. This will result in allocating more memory than necessary.

Similarly, unused chunks could be freed to save memory.

+
+    /* If we hit an unallocated chunk, we initialize it and use entry 0. */
+    if ( !lpi_data.host_lpis[chunk] )
+    {
+        union host_lpi *new_chunk;
+
+        new_chunk = alloc_xenheap_pages(0, 0);

Please use alloc_xenheap_page as you only allocate one page.

Also, when NUMA support is added we may want to take into account the node associated with the device, saving us some time when reading the memory. You don't need to handle that now, but a TODO would be quite helpful.
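Something like this, with the single-page wrapper and a marker for later:

        /* TODO: allocate from the NUMA node of the device once Xen gains NUMA support. */
        new_chunk = alloc_xenheap_page();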

+        if ( !new_chunk )
+        {
+            spin_unlock(&lpi_data.host_lpis_lock);
+            return -ENOMEM;
+        }
+
+        for ( i = 0; i < HOST_LPIS_PER_PAGE; i += LPI_BLOCK )
+            new_chunk[i].dom_id = INVALID_DOMID;
+
+        lpi_data.host_lpis[chunk] = new_chunk;
+        lpi_idx = 0;
+    }
+
+    lpi = chunk * HOST_LPIS_PER_PAGE + lpi_idx;
+
+    for ( i = 0; i < LPI_BLOCK; i++ )
+    {
+        union host_lpi hlpi;
+
+        /*
+         * Mark this host LPI as belonging to the domain, but don't assign
+         * any virtual LPI or a VCPU yet.
+         */
+        hlpi.virt_lpi = INVALID_LPI;
+        hlpi.dom_id = d->domain_id;
+        hlpi.vcpu_id = INVALID_DOMID;

Please don't mix VCPU and domain IDs. If an INVALID_VCPUID does not exist yet, it might be worth adding one.
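A sketch of what I mean (INVALID_VCPUID is hypothetical and would need to be defined somewhere suitable):

    #define INVALID_VCPUID  ((uint16_t)~0)

        hlpi.virt_lpi = INVALID_LPI;
        hlpi.dom_id   = d->domain_id;
        hlpi.vcpu_id  = INVALID_VCPUID;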

+        write_u64_atomic(&lpi_data.host_lpis[chunk][lpi_idx + i].data,
+                         hlpi.data);
+
+        /*
+         * Enable this host LPI, so we don't have to do this during the
+         * guest's runtime.
+         */
+        lpi_data.lpi_property[lpi + i] |= LPI_PROP_ENABLED;
+    }
+
+    /*
+     * We have allocated and initialized the host LPI entries, so it's safe
+     * to drop the lock now. Access to the structures can be done concurrently
+     * as it involves only an atomic uint64_t access.
+     */
+    spin_unlock(&lpi_data.host_lpis_lock);
+
+    __flush_dcache_area(&lpi_data.lpi_property[lpi], LPI_BLOCK);

Please use the dcache_* helpers. Also, the flush is only needed when the property table is not mapped cacheable and inner-shareable by the GIC.
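I.e. something along these lines, assuming the table may indeed be mapped non-cacheable:

    clean_dcache_va_range(&lpi_data.lpi_property[lpi], LPI_BLOCK);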

+
+    gicv3_its_map_host_events(its, host_devid, eventid, lpi + LPI_OFFSET,
+                              LPI_BLOCK);
+
+    next_lpi = lpi + LPI_BLOCK;
+    return lpi + LPI_OFFSET;
+}
+
+int gicv3_free_host_lpi_block(struct host_its *its, uint32_t lpi)
+{
+    union host_lpi *hlpi, empty_lpi = { .dom_id = INVALID_DOMID };
+    int i;
+
+    hlpi = gic_get_host_lpi(lpi);
+    if ( !hlpi )
+        return -ENOENT;
+
+    spin_lock(&lpi_data.host_lpis_lock);
+
+    for ( i = 0; i < LPI_BLOCK; i++ )
+        write_u64_atomic(&hlpi[i].data, empty_lpi.data);
+
+    /* TODO: Call a function in gic-v3-its.c to send DISCARDs */

I think this should be done by the caller and not here. You also need to disable the interrupts, which is not done here.
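As a sketch, the enable bit could be cleared for each LPI of the block while the lock is held, with the DISCARDs then issued by the caller in gic-v3-its.c:

    for ( i = 0; i < LPI_BLOCK; i++ )
        lpi_data.lpi_property[lpi - LPI_OFFSET + i] &= ~LPI_PROP_ENABLED;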

+
+    spin_unlock(&lpi_data.host_lpis_lock);
+
+    return 0;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/asm-arm/atomic.h b/xen/include/asm-arm/atomic.h
index 22a5036..df9de6a 100644
--- a/xen/include/asm-arm/atomic.h
+++ b/xen/include/asm-arm/atomic.h
@@ -53,9 +53,9 @@ build_atomic_write(write_u16_atomic, "h", WORD, uint16_t, "r")
 build_atomic_write(write_u32_atomic, "",  WORD, uint32_t, "r")
 build_atomic_write(write_int_atomic, "",  WORD, int, "r")

-#if 0 /* defined (CONFIG_ARM_64) */
-build_atomic_read(read_u64_atomic, "x", uint64_t, "=r")
-build_atomic_write(write_u64_atomic, "x", uint64_t, "r")
+#if defined (CONFIG_ARM_64)
+build_atomic_read(read_u64_atomic, "", "", uint64_t, "=r")
+build_atomic_write(write_u64_atomic, "", "", uint64_t, "r")
 #endif

This change should be in a separate patch that also explains why fixing the build_atomic* calls is needed.


 build_add_sized(add_u8_sized, "b", BYTE, uint8_t, "ri")
diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h
index 12bd155..7825575 100644
--- a/xen/include/asm-arm/gic.h
+++ b/xen/include/asm-arm/gic.h
@@ -220,7 +220,12 @@ enum gic_version {
     GIC_V3,
 };

+#define INVALID_LPI     0
 #define LPI_OFFSET      8192

Newline here please.

+static inline bool is_lpi(unsigned int irq)
+{
+    return irq >= LPI_OFFSET;
+}

I think both INVALID_LPI and is_lpi should be moved to irq.h.


 extern enum gic_version gic_hw_version(void);

diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
index 9c5dcf3..0e6b06a 100644
--- a/xen/include/asm-arm/gic_v3_its.h
+++ b/xen/include/asm-arm/gic_v3_its.h
@@ -97,6 +97,8 @@
 #define HOST_ITS_FLUSH_CMD_QUEUE        (1U << 0)
 #define HOST_ITS_USES_PTA               (1U << 1)

+#define INVALID_DOMID ((uint16_t)~0)
+

Rather than defining your own invalid domid, it would be better to use the one defined in xen.h (see DOMID_INVALID).
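I.e. (DOMID_INVALID comes from public/xen.h and fits in the uint16_t dom_id field):

        new_chunk[i].dom_id = DOMID_INVALID;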

 /* data structure for each hardware ITS */
 struct host_its {
     struct list_head entry;
@@ -117,6 +119,7 @@ struct its_devices {
     uint32_t guest_devid;
     uint32_t host_devid;
     uint32_t eventids;
+    uint32_t *host_lpis;

I forgot to mention it earlier, but there is a general lack of comments on all the structures. Please try to address that in the next version.
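For host_lpis, for instance, something like this comment would do (one possible wording):

    /* First host LPI of each LPI_BLOCK-sized block allocated for this device's events. */
    uint32_t *host_lpis;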

 };

 extern struct list_head host_its_list;
@@ -149,6 +152,12 @@ int gicv3_its_setup_collection(int cpu);
  */
 int gicv3_its_map_guest_device(struct domain *d, int host_devid,
                                int guest_devid, int bits, bool valid);
+int gicv3_its_map_host_events(struct host_its *its,
+                              int host_devid, int eventid,
+                              int lpi, int nrevents);
+int gicv3_allocate_host_lpi_block(struct host_its *its, struct domain *d,
+                                  uint32_t host_devid, uint32_t eventid);
+int gicv3_free_host_lpi_block(struct host_its *its, uint32_t lpi);

 #else



Cheers,

--
Julien Grall
