[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 7/8] lib: move bsearch code



On 24.11.2020 01:40, Andrew Cooper wrote:
> On 23/11/2020 22:49, Julien Grall wrote:
>> On 19/11/2020 10:27, Jan Beulich wrote:
>>> On 18.11.2020 19:09, Julien Grall wrote:
>>>> On 23/10/2020 11:19, Jan Beulich wrote:
>>>>> --- a/xen/include/xen/compiler.h
>>>>> +++ b/xen/include/xen/compiler.h
>>>>> @@ -12,6 +12,7 @@
>>>>>       #define inline        __inline__
>>>>>    #define always_inline __inline__ __attribute__
>>>>> ((__always_inline__))
>>>>> +#define gnu_inline    __inline__ __attribute__ ((__gnu_inline__))
>>>>
>>>> bsearch() is only used by Arm and I haven't seen anyone so far
>>>> complaining about the perf of I/O emulation.
>>>>
>>>> Therefore, I am not convinced that there is enough justification to
>>>> introduce a GNU attribute just for this patch.
>>>
>>> Please settle this with Andrew: He had asked for the function to
>>> become inline. I don't view making it static inline in the header
>>> as an option here - if the compiler decides to not inline it, we
>>> should not end up with multiple instances in different CUs.
>>
>> That's the cons of static inline... but then why is it suddenly a
>> problem with this helper?
>>
>>> And
>>> without making it static inline the attribute needs adding; at
>>> least I'm unaware of an alternative which works with the various
>>> compiler versions.
>>
>> The question we have to answer is: What is the gain with this approach?
> 
> Substantial.
> 
>>
>> If it is not quantifiable, then introducing compiler specific
>> attribute is not an option.
>>
>> IIRC, there are only two callers (all in Arm code) of this function.
>> Even inlined, I don't believe you would drastically reduce the number
>> of instructions compare to a full blown version. To be generous, I
>> would say you may save ~20 instructions per copy.
>>
>> Therefore, so far, the compiler specific attribute doesn't look
>> justified to me. As usual, I am happy to be proven wrong
> 
> There is a very good reason why this is the classic example used for
> extern inline's in various libc's.
> 
> The gains are from the compiler being able to optimise away the function
> pointer(s) entirely.  Instead of working on opaque objects, it can see
> the accesses directly, implement compares as straight array reads, (for
> sorting, the swap() call turns into memcpy()) and because it can see all
> the memory accesses, doesn't have to assume that every call to cmp()
> modifies arbitrary data in the array (i.e. doesn't have to reload the
> objects from memory every iteration).
> 
> extern inline allows the compiler full flexibility to judge whether
> inlining is a net win, based on optimisation settings and observing what
> the practical memory access pattern would be from not inlining.
> 
> extern inline is the appropriate thing to use here, except for the big
> note in the GCC manual saying "always use gnu_inline in this case" which
> appears to be working around a change in the C99 standard which forces
> any non-static inline to emit a body even when its not called, due to
> rules about global symbols.
> 
> Therefore, Reviewed-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

Thanks Andrew.

Julien - please clarify whether you're okay with Andrew's response,
or whether you continue to object the conversion to inline.

> On a totally separate point,  I wonder if we'd be better off compiling
> with -fgnu89-inline because I can't see any case we're we'd want the C99
> inline semantics anywhere in Xen.

I'm not sure about this, i.e. I wouldn't want to exclude such a
case appearing. I think using attributes is better in general, as
it allows fine grained control.

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.