[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 03/28] ARM: GICv3: allocate LPI pending and property table





On 27/02/17 11:34, Andre Przywara wrote:
Hi,

Hi Andre,


"Yes, will fix" to everything not explicitly mentioned below.

On 06/02/17 16:26, Julien Grall wrote:
Hi Andre,

On 30/01/17 18:31, Andre Przywara wrote:
The ARM GICv3 provides a new kind of interrupt called LPIs.
The pending bits and the configuration data (priority, enable bits) for
those LPIs are stored in tables in normal memory, which software has to
provide to the hardware.
Allocate the required memory, initialize it and hand it over to each
redistributor. The maximum number of LPIs to be used can be adjusted with
the command line option "max_lpi_bits", which defaults to a compile time
constant exposed in Kconfig.

Signed-off-by: Andre Przywara <andre.przywara@xxxxxxx>
---
 xen/arch/arm/Kconfig              |  15 +++++
 xen/arch/arm/Makefile             |   1 +
 xen/arch/arm/gic-v3-lpi.c         | 129
++++++++++++++++++++++++++++++++++++++
 xen/arch/arm/gic-v3.c             |  44 +++++++++++++
 xen/include/asm-arm/bitops.h      |   1 +
 xen/include/asm-arm/gic.h         |   2 +
 xen/include/asm-arm/gic_v3_defs.h |  52 ++++++++++++++-
 xen/include/asm-arm/gic_v3_its.h  |  22 ++++++-
 8 files changed, 264 insertions(+), 2 deletions(-)
 create mode 100644 xen/arch/arm/gic-v3-lpi.c

diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index bf64c61..71734a1 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -49,6 +49,21 @@ config HAS_ITS
         bool "GICv3 ITS MSI controller support"
         depends on HAS_GICV3

+config MAX_PHYS_LPI_BITS
+        depends on HAS_ITS
+        int "Maximum bits for GICv3 host LPIs (14-32)"
+        range 14 32
+        default "20"
+        help
+          Specifies the maximum number of LPIs (in bits) Xen should take
+          care of. The host ITS may provide support for a very large
number
+          of supported LPIs, for all of which we may not want to
allocate
+          memory, so this number here allows to limit this.

I think the description is misleading, if a user wants 8K worth of LPIs
by default, he would have to use 14 and not 13.

Furthermore, you provide both a runtime option (via command line) and
build time option (via Kconfig). You don't express what is the
differences between both and how there are supposed to co-exist.

Anyway, IHMO the command line option should be sufficient to allow
override if necessary. So I would drop the Kconfig version.

The idea is simply to let the Kconfig version specify the default value
if there is no command line option. So giving a command line option will
always override Kconfig.
Should we know a sensible default value, we can indeed get rid of this
Kconfig snippet here.

Please have in mind that distribution will ship one version of Xen for all the platforms. There is no sensible default value and I don't see how a distribution will be able to pick-up one.

So I still don't see the point of this default option.


+          Xen itself does not know how many LPIs domains will ever need
+          beforehand.
+          This can be overriden on the command line with the
max_lpi_bits

s/overriden/overridden/

+          parameter.
+
 endmenu

 menu "ARM errata workaround via the alternative framework"
diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index 5f4ff23..4ccf2eb 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -19,6 +19,7 @@ obj-y += gic.o
 obj-y += gic-v2.o
 obj-$(CONFIG_HAS_GICV3) += gic-v3.o
 obj-$(CONFIG_HAS_ITS) += gic-v3-its.o
+obj-$(CONFIG_HAS_ITS) += gic-v3-lpi.o
 obj-y += guestcopy.o
 obj-y += hvm.o
 obj-y += io.o
diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
new file mode 100644
index 0000000..e2fc901
--- /dev/null
+++ b/xen/arch/arm/gic-v3-lpi.c
@@ -0,0 +1,129 @@
+/*
+ * xen/arch/arm/gic-v3-lpi.c
+ *
+ * ARM GICv3 Locality-specific Peripheral Interrupts (LPI) support
+ *
+ * Copyright (C) 2016,2017 - ARM Ltd
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <xen/config.h>

xen/config.h is not necessary.

+#include <xen/lib.h>
+#include <xen/mm.h>
+#include <xen/sizes.h>
+#include <asm/gic.h>
+#include <asm/gic_v3_defs.h>
+#include <asm/gic_v3_its.h>
+
+/* Global state */
+static struct {
+    uint8_t *lpi_property;
+    unsigned int host_lpi_bits;

On the previous version, Stefano suggested to rename this to
phys_lpi_bits + adding a comment as you store the number of bits.

However, looking at the usage the number of bits is only required during
the initialization. Runtime code (such as gic_get_host_lpi) will use the
number of LPIs (see gic_get_host_lpi) and therefore require extra
instructions to compute the value.

So I would prefer if you store the number of LPIs here to optimize the
common case.

Well, it's a shift, not the 5th root. This is in fact the only
difference between the two approaches:
000000000024d770 <gic_get_host_lpi>:
  ...
  24d788:       9ac22421        lsr     x1, x1, x2

One instruction can make a lot of differences when a function is executed hundred times.

And I was thinking about it before, my rationale for not doing it was:
- We need both the number and the shift, and it's much easier to get the
number from the bits than the other way around.

But you only need the number of bits in the initialization. I don't care if the ITS initialization is little slower than the invert.

- The bits are the real source of the information (from TYPER).

So? It is fine to use another way to store the value in Xen if it makes easier to use in most of the places. This will also be less confusing for the users.

- Having a number would always raise the question whether we need to
make sure that there is more than one bit set in there and how to deal
with it.

It is not difficult to handle that.


Also, I find the naming "id_bits" confusing because you store the number
of bits to encode the max LPI ID and not the number of bits to encode
the number of LPI.

"IDbits" is the spec term used. It describes how many bits you need to
describe an LPI number. LPIs start at number 8192.
GICv3 spec, 8.9.24:
IDbits, bits [23:19]
       The number of interrupt identifier bits supported, minus one.

I can rename this to "phys_lpi_idbits", if that sounds reasonable.

+} lpi_data;
+
+/* Pending table for each redistributor */
+static DEFINE_PER_CPU(void *, pending_table);
+
+#define MAX_PHYS_LPIS   (BIT_ULL(lpi_data.host_lpi_bits) - LPI_OFFSET)
+
+uint64_t gicv3_lpi_allocate_pendtable(void)
+{
+    uint64_t reg;
+    void *pendtable;
+
+    reg  = GIC_BASER_CACHE_RaWaWb <<
GICR_PENDBASER_INNER_CACHEABILITY_SHIFT;
+    reg |= GIC_BASER_CACHE_SameAsInner <<
GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT;
+    reg |= GIC_BASER_InnerShareable <<
GICR_PENDBASER_SHAREABILITY_SHIFT;
+
+    if ( !this_cpu(pending_table) )
+    {
+        /*
+         * The pending table holds one bit per LPI and even covers
bits for
+         * interrupt IDs below 8192, so we allocate the full range.
+         * The GICv3 imposes a 64KB alignment requirement.
+         */
+        pendtable = _xmalloc(BIT_ULL(lpi_data.host_lpi_bits) / 8,
SZ_64K);
+        if ( !pendtable )
+            return 0;
+
+        memset(pendtable, 0, BIT_ULL(lpi_data.host_lpi_bits) / 8);

You can use _zalloc to do the allocation and then memset to 0.

+        __flush_dcache_area(pendtable,
BIT_ULL(lpi_data.host_lpi_bits) / 8);

Please use clean_and_invalidate_dcache_va_range.

+
+        this_cpu(pending_table) = pendtable;
+    }
+    else
+    {
+        pendtable = this_cpu(pending_table);
+    }

The {} are not necessary.

Sure, this was coming from the Linux rule here: if the if-clause
requires braces, the else-clause has to have those too. To me it looks
weird otherwise. Xen's coding style document doesn't explicitly say.

We tend to avoid pointless {} in Xen.


Also, on the previous version it was mentioned
this should be an error and then replace by a BUG_ON().

I don't see how this would be an actual bug. Yes, the code as it is
right now calls this only once, but it wouldn't hurt if we call this
multiple times. And I am always a bit wary of crashing the system if we
could just carry on instead, but ...

Please do the change.

... however you like.

The question is not whether it would hurt to call this code twice but if it makes sense. It does not make sense to try to allocate the pendtable twice for the same CPU. So it is ever happening it means this is a programming error, hence a BUG_ON makes sense here.

Another solution is having an ASSERT(this_cpu(pending_table) == NULL); at the beginning of the function.


+
+    reg |= GICR_PENDBASER_PTZ;
+
+    ASSERT(!(virt_to_maddr(pendtable) & ~GENMASK(51, 16)));

I don't understand the purpose of this ASSERT. The bits 15:0 should
always be zero otherwise this would be a bug in the memory allocator.
For bits 64:52, the architecture so far only support up to 52 bits.

You complained about using a mask on the address before, which I can
understand.
However we are writing to a register described in the spec here and
should observe that the address only goes into bits [51:16]. Any other
bits should not be touched or we are getting weird errors.
So somehow I have to make sure we comply with this.
This could either be a mask or an ASSERT.
If the assert never fires: great. Nothing to worry about here.
But I think this matches the ASSERT idea: we rely on this address being 4K
aligned and not exceeding 52 bits worth of address bits, so we should
check these assumptions.

ASSERT are only for debug build. However, nothing prevent the Xen allocator to differ between non-debug and debug build. So you may end up to never allocate address that are invalid for the GIC in non-debug build.


By keeping this ASSERT, you will make our life more difficult to extend
the number of physical address supported if ARM decides to bump it.

This GICR register is limited to 52 bits of physical address space
according to the spec.
If we ever upgrade the address size, the GIC spec would need to be
upgraded as well, so chances are we either need to touch that code here
anyways or we use a GICv5 or the like from the beginning.

I am quite convinced it will make our life more difficult. But I am not going to fight for an ASSERT, there are more important stuff to fix. :)


Why don't you just disable LPIs here? AFAIK, it should just be
writel_relaxed(reg & ~GICR_CTLR_ENABLE_LPIS, GICR_CTLR);

Please don't shoot the messenger, but:
GICv3 spec 8.11.2 "GICR_CTLR, Redistributor Control Register":
Enable_LPIs, bit [0]:
"... When a write changes this bit from 0 to 1, this bit becomes RES1
and the Redistributor must load the LPI Pending table from memory to
check for any pending interrupts."

Read: LPIs are so great that you can't disable them anymore once you
have enabled them.

How this would work with kexec?

Yes, this is a bit weird, even has nasty side effects.

In that case, I would prefer to have either a panic or make the CPU unusable to avoid having weird behavior afterwards.

[...]

diff --git a/xen/include/asm-arm/bitops.h b/xen/include/asm-arm/bitops.h
index bda8898..1cbfb9e 100644
--- a/xen/include/asm-arm/bitops.h
+++ b/xen/include/asm-arm/bitops.h
@@ -24,6 +24,7 @@
 #define BIT(nr)                 (1UL << (nr))
 #define BIT_MASK(nr)            (1UL << ((nr) % BITS_PER_WORD))
 #define BIT_WORD(nr)            ((nr) / BITS_PER_WORD)
+#define BIT_ULL(nr)             (1ULL << (nr))
 #define BITS_PER_BYTE           8

 #define ADDR (*(volatile int *) addr)
diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h
index 836a103..12bd155 100644
--- a/xen/include/asm-arm/gic.h
+++ b/xen/include/asm-arm/gic.h
@@ -220,6 +220,8 @@ enum gic_version {
     GIC_V3,
 };

+#define LPI_OFFSET      8192
+

It would make much sense to have this definition moved in irq.h close to
NR_IRQS.

Also, I am a bit surprised that NR_IRQS & co has not been modified. Is
there any reason for that?

It wasn't needed and really shouldn't be.
LPI are to some degree *not* first class citizen IRQs, and AFAICT
NR_IRQS relate to struct irq_decs's, which we don't have for LPIs (since
Xen itself doesn't really care about LPIs, at least as interrupts).

I am still chasing every (derived) use of NR_IRQS, but so far this looks
good to me. Let me know if you find any issues with that.

Can we have a comment on top of NR_IRQS explaining why LPIs are not taken into account?

Cheers,

--
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.