
Re: [Xen-devel] [PATCH v2 5/6] xen/arm: read cacheline size when needed


  • To: Julien Grall <julien.grall@xxxxxxx>
  • From: Stefano Stabellini <sstabellini@xxxxxxxxxx>
  • Date: Tue, 20 Feb 2018 13:03:25 -0800 (PST)
  • Cc: Stefano Stabellini <sstabellini@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxx
  • Delivery-date: Tue, 20 Feb 2018 21:03:42 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Tue, 20 Feb 2018, Julien Grall wrote:
> Hi Stefano,
> 
> On 19/02/18 21:58, Stefano Stabellini wrote:
> > On big.LITTLE systems the cacheline size might differ between big and
> > LITTLE cores. Instead of reading the cacheline size once at boot,
> > read it as needed from the system registers.
> > 
> > Suggested-by: Julien Grall <julien.grall@xxxxxxx>
> > Signed-off-by: Stefano Stabellini <sstabellini@xxxxxxxxxx>
> > ---
> >   xen/arch/arm/arm32/head.S  |  9 +++++++--
> >   xen/arch/arm/arm64/head.S  | 10 ++++++++--
> >   xen/arch/arm/setup.c       | 17 -----------------
> >   xen/include/asm-arm/page.h | 17 +++++++++++++++--
> >   4 files changed, 30 insertions(+), 23 deletions(-)
> > 
> > diff --git a/xen/arch/arm/arm32/head.S b/xen/arch/arm/arm32/head.S
> > index 43374e7..db470ad 100644
> > --- a/xen/arch/arm/arm32/head.S
> > +++ b/xen/arch/arm/arm32/head.S
> > @@ -504,8 +504,13 @@ ENTRY(relocate_xen)
> >           dsb        /* So the CPU issues all writes to the range */
> >             mov   r5, r4
> > -        ldr   r6, =cacheline_bytes /* r6 := step */
> > -        ldr   r6, [r6]
> > +        mov   r6, #0
> 
> Comments in the code would be nice, so we know exactly what you are doing.
> Also, in that case it would make sense to have a macro, as this may be
> useful in other places.

OK. This is a 1:1 translation from setup_cache.


> > +        mcr   CP32(r6, CSSELR_EL1)
> 
> Please use 32-bit naming in the 32-bit code.

I'll change.

 
> > +        mrc   CP32(r6, CSSELR_EL1)
> 
> The size of the cache is read using CCSIDR_EL1. But it looks like the way we
> get the cache line size in Xen is fragile.
> 
> We are retrieving the cache line size of Level 1 and assume it will be valid
> for all the other caches. Indeed, cache maintenance ops may propagate to other
> caches depending on the target (Point of Coherency vs Point of Unification).
> 
> Looking at the ARM ARM "Cache hierarchy abstraction for address-based
> operations" (D3-2061 DDI 0487C.a), CTR_EL0/CTR holds the minimum line
> length value for the data caches. That value is the most efficient
> address stride to use when applying a sequence of VA-based maintenance
> instructions to a range of VAs.
> 
> So it would be best and safest for Xen to use CTR/CTR_EL0.DminLine.

This is insightful, thank you. Given that this patch is a backport
candidate, I would prefer to retain the same behavior we had before in
setup_cache. I can write a separate patch on top of this one to switch
to CTR/CTR_EL0.DminLine. That way, we can decide separately for each
patch whether we want to backport it (and potentially revert it) or
not. In other words: this patch as-is is suboptimal but carries very
little risk. Changing the way we determine the cacheline size improves
the patch but significantly increases its risk.

Does it make sense?


> > +        and   r6, r6, #0x7
> > +        add   r6, r6, #4
> > +        mov   r7, #1
> > +        lsl   r6, r7, r6
> >           mov   r7, r3
> >     1:      mcr   CP32(r7, DCCMVAC)
> > diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
> > index fa0ef70..edea300 100644
> > --- a/xen/arch/arm/arm64/head.S
> > +++ b/xen/arch/arm/arm64/head.S
> > @@ -631,8 +631,14 @@ ENTRY(relocate_xen)
> >           dsb   sy        /* So the CPU issues all writes to the range */
> >             mov   x9, x3
> > -        ldr   x10, =cacheline_bytes /* x10 := step */
> > -        ldr   x10, [x10]
> > +
> > +        mov   x10, #0
> > +        msr   CSSELR_EL1, x10
> > +        mrs   x10, CSSELR_EL1
> > +        and   x10, x10, #0x7
> > +        add   x10, x10, #4
> > +        mov   x11, #1
> > +        lsl   x10, x11, x10
> 
> Please use dcache_line_size macro (see cache.S).

Similarly, I would prefer to retain the old behavior here and improve
it in a separate patch.


> >           mov   x11, x2
> >     1:      dc    cvac, x11
> > diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
> > index 032a6a8..4754c95 100644
> > --- a/xen/arch/arm/setup.c
> > +++ b/xen/arch/arm/setup.c
> > @@ -680,21 +680,6 @@ static void __init setup_mm(unsigned long dtb_paddr,
> > size_t dtb_size)
> >   }
> >   #endif
> >   -size_t __read_mostly cacheline_bytes;
> > -
> > -/* Very early check of the CPU cache properties */
> > -void __init setup_cache(void)
> > -{
> > -    uint32_t ccsid;
> > -
> > -    /* Read the cache size ID register for the level-0 data cache */
> > -    WRITE_SYSREG32(0, CSSELR_EL1);
> > -    ccsid = READ_SYSREG32(CCSIDR_EL1);
> > -
> > -    /* Low 3 bits are log2(cacheline size in words) - 2. */
> > -    cacheline_bytes = 1U << (4 + (ccsid & 0x7));
> > -}
> > -
> >   /* C entry point for boot CPU */
> >   void __init start_xen(unsigned long boot_phys_offset,
> >                         unsigned long fdt_paddr,
> > @@ -708,8 +693,6 @@ void __init start_xen(unsigned long boot_phys_offset,
> >       struct domain *dom0;
> >       struct xen_arch_domainconfig config;
> >   -    setup_cache();
> > -
> >       percpu_init_areas();
> >       set_processor_id(0); /* needed early, for smp_processor_id() */
> >   diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h
> > index d948250..791e22e 100644
> > --- a/xen/include/asm-arm/page.h
> > +++ b/xen/include/asm-arm/page.h
> > @@ -133,8 +133,6 @@
> >     /* Architectural minimum cacheline size is 4 32-bit words. */
> >   #define MIN_CACHELINE_BYTES 16
> > -/* Actual cacheline size on the boot CPU. */
> > -extern size_t cacheline_bytes;
> >     #define copy_page(dp, sp) memcpy(dp, sp, PAGE_SIZE)
> >   @@ -142,9 +140,22 @@ extern size_t cacheline_bytes;
> >    * if 'range' is large enough we might want to use model-specific
> >    * full-cache flushes. */
> >   +static inline size_t read_cacheline_size(void)
> > +{
> > +    uint32_t ccsid;
> > +
> > +    /* Read the cache size ID register for the level-0 data cache */
> > +    WRITE_SYSREG32(0, CSSELR_EL1);
> > +    ccsid = READ_SYSREG32(CCSIDR_EL1);
> > +
> > +    /* Low 3 bits are log2(cacheline size in words) - 2. */
> > +    return (size_t) (1U << (4 + (ccsid & 0x7)));
> 
> See my remark above regarding the cacheline size.
> 
> > +}
> > +
> >   static inline int invalidate_dcache_va_range(const void *p, unsigned long
> > size)
> >   {
> >       const void *end = p + size;
> > +    size_t cacheline_bytes = read_cacheline_size();
> >       size_t cacheline_mask = cacheline_bytes - 1;
> >         dsb(sy);           /* So the CPU issues all writes to the range */
> > @@ -171,6 +182,7 @@ static inline int invalidate_dcache_va_range(const void
> > *p, unsigned long size)
> >     static inline int clean_dcache_va_range(const void *p, unsigned long
> > size)
> >   {
> > +    size_t cacheline_bytes = read_cacheline_size();
> >       const void *end = p + size;
> >       dsb(sy);           /* So the CPU issues all writes to the range */
> >       p = (void *)((uintptr_t)p & ~(cacheline_bytes - 1));
> > @@ -184,6 +196,7 @@ static inline int clean_dcache_va_range(const void *p,
> > unsigned long size)
> >   static inline int clean_and_invalidate_dcache_va_range
> >       (const void *p, unsigned long size)
> >   {
> > +    size_t cacheline_bytes = read_cacheline_size();
> >       const void *end = p + size;
> >       dsb(sy);         /* So the CPU issues all writes to the range */
> >       p = (void *)((uintptr_t)p & ~(cacheline_bytes - 1));
> > 
> 
> Cheers,
> 
> -- 
> Julien Grall
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 

