Re: [PATCH v3 13/34] xen/riscv: introduce cmpxchg.h
On Mon, 2024-01-22 at 17:27 +0100, Jan Beulich wrote:
> On 22.12.2023 16:12, Oleksii Kurochko wrote:
> > --- /dev/null
> > +++ b/xen/arch/riscv/include/asm/cmpxchg.h
> > @@ -0,0 +1,496 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/* Copyright (C) 2014 Regents of the University of California */
> > +
> > +#ifndef _ASM_RISCV_CMPXCHG_H
> > +#define _ASM_RISCV_CMPXCHG_H
> > +
> > +#include <xen/compiler.h>
> > +#include <xen/lib.h>
> > +
> > +#include <asm/fence.h>
> > +#include <asm/io.h>
> > +#include <asm/system.h>
> > +
> > +#define __xchg_relaxed(ptr, new, size) \
> > +({ \
> > + __typeof__(ptr) ptr__ = (ptr); \
> > + __typeof__(new) new__ = (new); \
> > + __typeof__(*(ptr)) ret__; \
>
> I expect the types of new and *ptr want to actually match. Which
> you then want to enforce, so that issues at use sites would either
> be reported by the compiler, or be permitted by a type conversion
> of new.
I am OK with using the same type for new and ret__, but it looks like
__xchg_relaxed() is used only inside xchg_relaxed(), where 'new' is
declared with the same type as *ptr.
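If enforcing it is still preferred, a minimal sketch of what I understand
the suggestion to be (the macro name below is just for illustration, not
part of the patch):

/*
 * Sketch only: declaring new__ with the pointed-to type (instead of
 * __typeof__(new)) lets an incompatible 'new' be converted, or be
 * diagnosed by the compiler, at the use site.
 */
#define XCHG_NEW_SKETCH(ptr, new)       \
({                                      \
    __typeof__(*(ptr)) new__ = (new);   \
    new__;                              \
})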
>
> > + switch (size) \
> > + { \
>
> Nit: Hard tab left here. (Also again you want to either stick to
> Linux style or fully switch to Xen style.)
Thanks. I'll update that.
>
> > + case 4: \
> > + asm volatile( \
> > + " amoswap.w %0, %2, %1\n" \
>
> I don't think a leading tab (or leading whitespace in general) is
> needed in single-line-output asm()s. The trailing \n also isn't
> needed if I'm not mistaken.
I just wanted to align it with "=r", but for sure a leading tab or
whitespace can be dropped.
>
> > + : "=r" (ret__), "+A" (*ptr__) \
> > + : "r" (new__) \
> > + : "memory" ); \
> > + break; \
> > + case 8: \
> > + asm volatile( \
> > + " amoswap.d %0, %2, %1\n" \
> > + : "=r" (ret__), "+A" (*ptr__) \
> > + : "r" (new__) \
> > + : "memory" ); \
> > + break; \
> > + default: \
> > + ASSERT_UNREACHABLE(); \
>
> If at all possible this wants to trigger a build failure, not a
> runtime one.
I'll update that with BUILD_BUG_ON(size > 8);
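For completeness, roughly what I mean, shown on __xchg_relaxed() (only a
sketch; whether the condition stays as "size > 8" or spells out the
supported sizes as below is still open, and the other macros would get
the same treatment). Since 'size' is always sizeof(*(ptr)) here, it is a
compile-time constant and the check is resolved at build time:

#define __xchg_relaxed(ptr, new, size)                      \
({                                                          \
    __typeof__(ptr) ptr__ = (ptr);                          \
    __typeof__(*(ptr)) new__ = (new);                       \
    __typeof__(*(ptr)) ret__;                               \
    /* Unsupported sizes now fail at build time. */         \
    BUILD_BUG_ON((size) != 4 && (size) != 8);               \
    switch ( size )                                         \
    {                                                       \
    case 4:                                                 \
        asm volatile ( "amoswap.w %0, %2, %1"               \
                       : "=r" (ret__), "+A" (*ptr__)        \
                       : "r" (new__)                        \
                       : "memory" );                        \
        break;                                              \
    case 8:                                                 \
        asm volatile ( "amoswap.d %0, %2, %1"               \
                       : "=r" (ret__), "+A" (*ptr__)        \
                       : "r" (new__)                        \
                       : "memory" );                        \
        break;                                              \
    }                                                       \
    ret__;                                                  \
})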
>
> > + } \
> > + ret__; \
> > +})
> > +
> > +#define xchg_relaxed(ptr, x) \
> > +({ \
> > + __typeof__(*(ptr)) x_ = (x); \
> > + (__typeof__(*(ptr))) __xchg_relaxed((ptr), x_, sizeof(*(ptr))); \
>
> Nit: Stray blank after cast. For readability I'd also suggest to
> drop parentheses in cases like the first argument passed to
> __xchg_relaxed() here.
Thanks. I'll take that into account.
>
> > +})
>
> For both: What does "relaxed" describe? I'm asking because it's not
> really clear whether the memory clobbers are actually needed.
>
> > +#define __xchg_acquire(ptr, new, size) \
> > +({ \
> > + __typeof__(ptr) ptr__ = (ptr); \
> > + __typeof__(new) new__ = (new); \
> > + __typeof__(*(ptr)) ret__; \
> > + switch (size) \
> > + { \
> > + case 4: \
> > + asm volatile( \
> > + " amoswap.w %0, %2, %1\n" \
> > + RISCV_ACQUIRE_BARRIER \
> > + : "=r" (ret__), "+A" (*ptr__) \
> > + : "r" (new__) \
> > + : "memory" ); \
> > + break; \
> > + case 8: \
> > + asm volatile( \
> > + " amoswap.d %0, %2, %1\n" \
> > + RISCV_ACQUIRE_BARRIER \
> > + : "=r" (ret__), "+A" (*ptr__) \
> > + : "r" (new__) \
> > + : "memory" ); \
> > + break; \
> > + default: \
> > + ASSERT_UNREACHABLE(); \
> > + } \
> > + ret__; \
> > +})
>
> If I'm not mistaken this differs from __xchg_relaxed() only in the
> use of RISCV_ACQUIRE_BARRIER, and ...
>
> > +#define xchg_acquire(ptr, x) \
> > +({ \
> > + __typeof__(*(ptr)) x_ = (x); \
> > + (__typeof__(*(ptr))) __xchg_acquire((ptr), x_, sizeof(*(ptr))); \
> > +})
> > +
> > +#define __xchg_release(ptr, new, size) \
> > +({ \
> > + __typeof__(ptr) ptr__ = (ptr); \
> > + __typeof__(new) new__ = (new); \
> > + __typeof__(*(ptr)) ret__; \
> > + switch (size) \
> > + { \
> > + case 4: \
> > + asm volatile ( \
> > + RISCV_RELEASE_BARRIER \
> > + " amoswap.w %0, %2, %1\n" \
> > + : "=r" (ret__), "+A" (*ptr__) \
> > + : "r" (new__) \
> > + : "memory"); \
> > + break; \
> > + case 8: \
> > + asm volatile ( \
> > + RISCV_RELEASE_BARRIER \
> > + " amoswap.d %0, %2, %1\n" \
> > + : "=r" (ret__), "+A" (*ptr__) \
> > + : "r" (new__) \
> > + : "memory"); \
> > + break; \
> > + default: \
> > + ASSERT_UNREACHABLE(); \
> > + } \
> > + ret__; \
> > +})
>
> this only in the use of RISCV_RELEASE_BARRIER. If so they likely want
> folding, to limit redundancy and make eventual updating easier. (Same
> for the cmpxchg helper further down, as it seems.)
Yes, you are right. The only difference is in RISCV_RELEASE_BARRIER, and
its absence is the reason why we have the "relaxed" versions.
I am not sure I understand what you mean by folding here. Do you mean
that there is no sense in having separate macros, and that only one with
RISCV_RELEASE_BARRIER is needed?
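If that is what's meant, the shape I would try is something along these
lines (just a sketch, assuming RISCV_ACQUIRE_BARRIER /
RISCV_RELEASE_BARRIER are plain string literals, as they are used above;
the names are not final):

#define __xchg_generic(ptr, new, size, barrier_before, barrier_after) \
({                                                                    \
    __typeof__(ptr) ptr__ = (ptr);                                    \
    __typeof__(*(ptr)) new__ = (new);                                 \
    __typeof__(*(ptr)) ret__;                                         \
    switch ( size )                                                   \
    {                                                                 \
    case 4:                                                           \
        asm volatile ( barrier_before                                 \
                       "amoswap.w %0, %2, %1\n"                       \
                       barrier_after                                  \
                       : "=r" (ret__), "+A" (*ptr__)                  \
                       : "r" (new__)                                  \
                       : "memory" );                                  \
        break;                                                        \
    case 8:                                                           \
        asm volatile ( barrier_before                                 \
                       "amoswap.d %0, %2, %1\n"                       \
                       barrier_after                                  \
                       : "=r" (ret__), "+A" (*ptr__)                  \
                       : "r" (new__)                                  \
                       : "memory" );                                  \
        break;                                                        \
    default:                                                          \
        ASSERT_UNREACHABLE(); /* or the build-time check above */     \
    }                                                                 \
    ret__;                                                            \
})

#define __xchg_relaxed(ptr, new, size) \
    __xchg_generic(ptr, new, size, "", "")
#define __xchg_acquire(ptr, new, size) \
    __xchg_generic(ptr, new, size, "", RISCV_ACQUIRE_BARRIER)
#define __xchg_release(ptr, new, size) \
    __xchg_generic(ptr, new, size, RISCV_RELEASE_BARRIER, "")

That way the relaxed/acquire/release variants differ only in the barrier
arguments, and the cmpxchg helpers further down could be folded the same
way.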
>
> > +#define xchg_release(ptr, x) \
> > +({ \
> > + __typeof__(*(ptr)) x_ = (x); \
> > + (__typeof__(*(ptr))) __xchg_release((ptr), x_, sizeof(*(ptr))); \
> > +})
> > +
> > +static always_inline uint32_t __xchg_case_4(volatile uint32_t *ptr,
> > + uint32_t new)
> > +{
> > + __typeof__(*(ptr)) ret;
> > +
> > + asm volatile (
> > + " amoswap.w.aqrl %0, %2, %1\n"
> > + : "=r" (ret), "+A" (*ptr)
> > + : "r" (new)
> > + : "memory" );
> > +
> > + return ret;
> > +}
> > +
> > +static always_inline uint64_t __xchg_case_8(volatile uint64_t *ptr,
> > + uint64_t new)
> > +{
> > + __typeof__(*(ptr)) ret;
> > +
> > + asm volatile( \
> > + " amoswap.d.aqrl %0, %2, %1\n" \
> > + : "=r" (ret), "+A" (*ptr) \
> > + : "r" (new) \
> > + : "memory" ); \
> > +
> > + return ret;
> > +}
> > +
> > +static always_inline unsigned short __cmpxchg_case_2(volatile uint32_t *ptr,
> > + uint32_t old,
> > + uint32_t new);
>
> Don't you consistently mean uint16_t here (incl the return type) and
> ...
>
> > +static always_inline unsigned short __cmpxchg_case_1(volatile uint32_t *ptr,
> > + uint32_t old,
> > + uint32_t new);
>
> ... uint8_t here?
The idea was to emulate __cmpxchg_case_1 and __cmpxchg_case_2 using the
4-byte cmpxchg, and __cmpxchg_case_4 expects uint32_t.
>
> > +static inline unsigned long __xchg(volatile void *ptr, unsigned long x, int size)
> > +{
> > + switch (size) {
> > + case 1:
> > + return __cmpxchg_case_1(ptr, (uint32_t)-1, x);
> > + case 2:
> > + return __cmpxchg_case_2(ptr, (uint32_t)-1, x);
>
> How are these going to work? You'll compare against ~0, and if the
> value in memory isn't ~0, memory won't be updated; you will only
> (correctly) return the value found in memory.
>
> Or wait - looking at __cmpxchg_case_{1,2}() far further down, you
> ignore "old" there. Which apparently means they'll work for the use
> here, but not for the use in __cmpxchg().
Yes, the trick is that old is ignored and is read in
__emulate_cmpxchg_case1_2() before __cmpxchg_case_4 is called:
do {
    read_val = read_func(aligned_ptr);
    swapped_new = read_val & ~mask;
    swapped_new |= masked_new;
    ret = cmpxchg_func(aligned_ptr, read_val, swapped_new);
} while ( ret != read_val );
read_val here is 'old'.
But now I am not 100% sure that this is correct for __cmpxchg()...
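Something along these lines is probably what the __cmpxchg() path needs
(only a sketch; the helper name and the explicit mask/shift handling are
mine, not what the patch currently has, and the read_func indirection is
omitted for brevity). The difference is that 'old' is checked against the
word that was read, so the 4-byte compare fails when the narrow field
doesn't actually hold 'old':

static inline uint32_t emulate_cmpxchg_1_2_sketch(
    volatile uint32_t *aligned_ptr,
    uint32_t (*cmpxchg_func)(volatile uint32_t *, uint32_t, uint32_t),
    uint32_t mask, unsigned int shift, uint32_t old, uint32_t new)
{
    uint32_t read_val, swapped_new, ret;

    do {
        read_val = *aligned_ptr;

        /* Fail like a real cmpxchg if the narrow field doesn't hold 'old'. */
        if ( (read_val & mask) != ((old << shift) & mask) )
            break;

        /* Replace only the narrow field, keeping the surrounding bytes. */
        swapped_new = (read_val & ~mask) | ((new << shift) & mask);

        ret = cmpxchg_func(aligned_ptr, read_val, swapped_new);
    } while ( ret != read_val );

    /* The caller compares this against 'old' to detect success. */
    return (read_val & mask) >> shift;
}

Though if 'old' stops being ignored, the 1- and 2-byte cases of __xchg()
above would likely need rethinking, since they currently rely on it being
ignored.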
>
> > + case 4:
> > + return __xchg_case_4(ptr, x);
> > + case 8:
> > + return __xchg_case_8(ptr, x);
> > + default:
> > + ASSERT_UNREACHABLE();
> > + }
> > +
> > + return -1;
> > +}
> > +
> > +#define xchg(ptr,x) \
> > +({ \
> > + __typeof__(*(ptr)) ret__; \
> > + ret__ = (__typeof__(*(ptr))) \
> > + __xchg((ptr), (unsigned long)(x), sizeof(*(ptr))); \
> > + ret__; \
> > +})
> > +
> > +#define xchg32(ptr, x) \
> > +({ \
> > + BUILD_BUG_ON(sizeof(*(ptr)) != 4); \
> > + xchg((ptr), (x)); \
> > +})
> > +
> > +#define xchg64(ptr, x) \
> > +({ \
> > + BUILD_BUG_ON(sizeof(*(ptr)) != 8); \
> > + xchg((ptr), (x)); \
> > +})
>
> What are these two (and their cmpxchg counterparts) needed for?
It was copied from Linux, but it looks like they can really be dropped;
I can't find where they are used, so I'll drop them.
>
> > +/*
> > + * Atomic compare and exchange. Compare OLD with MEM, if
> > identical,
> > + * store NEW in MEM. Return the initial value in MEM. Success is
> > + * indicated by comparing RETURN with OLD.
> > + */
> > +#define __cmpxchg_relaxed(ptr, old, new, size) \
> > +({ \
> > + __typeof__(ptr) ptr__ = (ptr); \
> > + __typeof__(*(ptr)) __old = (old); \
>
> Leftover leading underscores?
>
> > + __typeof__(*(ptr)) new__ = (new); \
>
> Related to my earlier comment on types needing to be compatible - see
> how here you're using "ptr" throughout.
>
> > + __typeof__(*(ptr)) ret__; \
> > + register unsigned int __rc; \
>
> More leftover leading underscores?
I missed dropping the underscores. Thanks. I'll drop them in the next version.
>
> > +static always_inline unsigned short __cmpxchg_case_2(volatile uint32_t *ptr,
> > + uint32_t old,
> > + uint32_t new)
> > +{
> > + (void) old;
> > +
> > + if (((unsigned long)ptr & 3) == 3)
> > + {
> > +#ifdef CONFIG_64BIT
> > + return __emulate_cmpxchg_case1_2((uint64_t *)ptr, new,
> > + readq, __cmpxchg_case_8, 0xffffU);
>
> What if ((unsigned long)ptr & 7) == 7 (which is a sub-case of what
> the if() above checks for)? Isn't it more reasonable to require
> aligned 16-bit quantities here? Or if mis-aligned addresses are okay,
> you could as well emulate using __cmpxchg_case_4().
Yes, it will be more reasonable. I'll use IS_ALIGNED instead.
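E.g. something like this (sketch only; the helper name is made up, and
whether this becomes an assertion or feeds the emulation path is still to
be decided):

/* Sketch: require natural alignment for the 16-bit access. */
static inline bool cmpxchg_case_2_aligned(volatile const void *ptr)
{
    return IS_ALIGNED((unsigned long)ptr, sizeof(uint16_t));
}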
>
> Also you shouldn't be casting away volatile (here and below).
> Avoiding the casts (by suitably using volatile void * parameter
> types) would likely be best.
Thanks. I'll update the function prototype.
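I.e. something along these lines (sketch; only the pointer parameter
changes, the rest stays as in the current patch):

static always_inline unsigned short __cmpxchg_case_2(volatile void *ptr,
                                                     uint32_t old,
                                                     uint32_t new);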
~ Oleksii