[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] x86/atomic: Improvements and simplifications to assembly constraints



 >>> On 21.11.18 at 20:37, <andrew.cooper3@xxxxxxxxxx> wrote:
> * Some of the single-byte versions specify "=q" as the output.  AFAICT, there
>    was not a legitimate reason to restrict the use of %esi/%edi in the 32-bit
>    build.  Either way, in 64-bit, it is equivelent to "=r".

I'm confused about the 32-bit part here: Of course it was necessary
to restrict the compiler to the low 4 registers in that case. It's just
not clear to me whether you've just written it down wrongly, or
whether you indeed think the way it reads to me.

>  * Constraints in the form "=r" (x) : "0" (x) can be folded to just "+r" (x)
>  * Switch to using named parameters (mostly for legibility) which in
>    particular helps with...
>  * __xchg(), __cmpxchg() and __cmpxchg_user() modify their memory operand, so
>    must list it as an output operand.  This only works because they each have
>    a memory clobber to give the construct full compiler-barrier properties.
>  * Every memory operand has an explicit known size.  Letting the compiler see
>    the real size rather than obscuring it with __xg() allows for the removal
>    of the instruction size suffixes without introducing ambiguity.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> ---
> CC: Jan Beulich <JBeulich@xxxxxxxx>
> CC: Wei Liu <wei.liu2@xxxxxxxxxx>
> CC: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> 
> Interestingly, switching to use output memory operands has the following
> perturbance in the build:
> 
>   add/remove: 0/0 grow/shrink: 3/5 up/down: 70/-124 (-54)
>   Function                                     old     new   delta
>   do_mmu_update                               7041    7101     +60
>   mctelem_process_deferred                     234     242      +8
>   cpufreq_governor_dbs                         851     853      +2
>   _set_status                                  162     161      -1
>   create_irq                                   325     323      -2
>   do_tmem_put                                 2066    2062      -4
>   task_switch_load_seg                         892     884      -8
>   _get_page_type                              6057    5948    -109
> 
> but as far as I can tell, it is exclusively down to different register
> scheduling choices.
> ---
>  xen/include/asm-x86/system.h        | 99 
> +++++++++++++++++--------------------
>  xen/include/asm-x86/x86_64/system.h | 24 ++++-----
>  2 files changed, 57 insertions(+), 66 deletions(-)
> 
> diff --git a/xen/include/asm-x86/system.h b/xen/include/asm-x86/system.h
> index 483cd20..8764e31 100644
> --- a/xen/include/asm-x86/system.h
> +++ b/xen/include/asm-x86/system.h
> @@ -23,9 +23,6 @@
>  #define xchg(ptr,v) \
>      ((__typeof__(*(ptr)))__xchg((unsigned long)(v),(ptr),sizeof(*(ptr))))
>  
> -struct __xchg_dummy { unsigned long a[100]; };
> -#define __xg(x) ((volatile struct __xchg_dummy *)(x))

I never fully understood why we have this, so I'm happy to see it
go away. I see it has gone away in Linux back in 2.6.36.

> @@ -79,31 +72,27 @@ static always_inline unsigned long __cmpxchg(
>      switch ( size )
>      {
>      case 1:
> -        asm volatile ( "lock; cmpxchgb %b1,%2"
> -                       : "=a" (prev)
> -                       : "q" (new), "m" (*__xg(ptr)),
> -                       "0" (old)
> +        asm volatile ( "lock; cmpxchg %b[new], %[ptr]"
> +                       : "=a" (prev), [ptr] "+m" (*(uint8_t *)ptr)
> +                       : [new] "r" (new), "0" (old)

Any reason you retain the reference by number in the input
constraint here, rather than giving its corresponding output
one a name?

Also since you're playing with this anyway - is there a need to
retain the bogus ; after "lock"?

> --- a/xen/include/asm-x86/x86_64/system.h
> +++ b/xen/include/asm-x86/x86_64/system.h
> @@ -25,7 +25,7 @@ static always_inline __uint128_t __cmpxchg16b(
>  
>      /* Don't use "=A" here - clang can't deal with that. */
>      asm volatile ( "lock; cmpxchg16b %2"

Any reason not to change this to named operands as well?

> @@ -63,36 +63,38 @@ static always_inline __uint128_t cmpxchg16b_local_(
>   * If no fault occurs then _o is updated to the value we saw at _p. If this
>   * is the same as the initial value of _o then _n is written to location _p.
>   */
> -#define __cmpxchg_user(_p,_o,_n,_isuff,_oppre,_regtype)                 \
> +#define __cmpxchg_user(_p, _o, _n, _oppre)                              \
>      stac();                                                             \
>      asm volatile (                                                      \
> -        "1: lock; cmpxchg"_isuff" %"_oppre"2,%3\n"                      \
> +        "1: lock; cmpxchg %"_oppre"[new], %[ptr]\n"                     \
>          "2:\n"                                                          \
>          ".section .fixup,\"ax\"\n"                                      \
>          "3:     movl $1,%1\n"                                           \

Any what about this?

>          "       jmp 2b\n"                                               \
>          ".previous\n"                                                   \
>          _ASM_EXTABLE(1b, 3b)                                            \
> -        : "=a" (_o), "=r" (_rc)                                         \
> -        : _regtype (_n), "m" (*__xg((volatile void *)_p)), "0" (_o), "1" (0) 
> \
> +        : "+a" (_o), "=r" (_rc),                                        \
> +          [ptr] "+m" (*(volatile typeof(*(_p)) *)(_p))                  \

Does casting to add "volatile" here really make any difference,
considering the asm() itself is a volatile one and has a memory
clobber?

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.