
Re: [PATCH 3/5] xen/sort: Switch to an extern inline implementation


  • To: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Fri, 12 Nov 2021 11:25:52 +0100
  • Cc: Roger Pau Monné <roger.pau@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>, Bertrand Marquis <bertrand.marquis@xxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Fri, 12 Nov 2021 10:26:15 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 11.11.2021 18:57, Andrew Cooper wrote:
> There are exactly 3 callers of sort() in the hypervisor.
> 
> Both arm callers pass in NULL for the swap function.  While this might seem
> like an attractive option at first, it causes generic_swap() to be used,
> which forces a byte-wise copy.  Provide real swap functions which the
> compiler can optimise sensibly.
> 
> Furthermore, use of function pointers in tight loops like that can be very bad
> for performance.  Implement sort() as extern inline, so the optimiser can
> judge whether to inline things or not.
> 
> On x86, the diffstat shows how much better a job the compiler can do when it
> is able to see the cmp/swap implementations.
> 
>   $ ../scripts/bloat-o-meter xen-syms-before xen-syms-after
>   add/remove: 0/5 grow/shrink: 1/1 up/down: 505/-735 (-230)
>   Function                                     old     new   delta
>   sort_exception_table                          31     536    +505
>   u32_swap                                       9       -      -9
>   generic_swap                                  34       -     -34
>   cmp_ex                                        36       -     -36
>   swap_ex                                       41       -     -41
>   sort_exception_tables                         90      38     -52
>   sort                                         563       -    -563
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
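
(For context on the mechanism: with the gnu_inline attribute, the definition
in the header is used for inlining only and no out-of-line copy is emitted
from it.  Presumably exactly one translation unit, i.e. common/sort.c, which
is not part of the quoted hunk, defines SORT_IMPLEMENTATION before including
the header, along the lines of

    #define SORT_IMPLEMENTATION
    #include <xen/sort.h>

so that a single external definition of sort() still exists for any caller
the compiler chooses not to inline.)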

Technically
Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx>

Yet again without the intention of overriding Julien's concerns in any
way. To address one of them, how about retaining generic_swap() (as an
inline function), ...

> --- a/xen/include/xen/sort.h
> +++ b/xen/include/xen/sort.h
> @@ -3,8 +3,61 @@
>  
>  #include <xen/types.h>
>  
> +/*
> + * sort - sort an array of elements
> + * @base: pointer to data to sort
> + * @num: number of elements
> + * @size: size of each element
> + * @cmp: pointer to comparison function
> + * @swap: pointer to swap function or NULL
> + *
> + * This function does a heapsort on the given array. You may provide a
> + * swap function optimized to your element type.
> + *
> + * Sorting time is O(n log n) both on average and worst-case. While
> + * qsort is about 20% faster on average, it suffers from exploitable
> + * O(n*n) worst-case behavior and extra memory requirements that make
> + * it less suitable for kernel use.
> + */
> +#ifndef SORT_IMPLEMENTATION
> +extern gnu_inline
> +#endif
>  void sort(void *base, size_t num, size_t size,
>            int (*cmp)(const void *, const void *),
> -          void (*swap)(void *, void *, size_t));
> +          void (*swap)(void *, void *, size_t))
> +{
> +    /* pre-scale counters for performance */
> +    size_t i = (num / 2) * size, n = num * size, c, r;
> +
> +    /* heapify */
> +    while ( i > 0 )
> +    {
> +        for ( r = i -= size; r * 2 + size < n; r = c )
> +        {
> +            c = r * 2 + size;
> +            if ( (c < n - size) && (cmp(base + c, base + c + size) < 0) )
> +                c += size;
> +            if ( cmp(base + r, base + c) >= 0 )
> +                break;
> +            swap(base + r, base + c, size);

... doing

            if ( swap )
                swap(base + r, base + c, size);
            else
                generic_swap(base + r, base + c, size);

here and below. The compiler would then still be able to eliminate the
indirect calls (as well as the added conditional), I think.
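
Concretely, what I have in mind is something like this (just a sketch, not
necessarily matching the current helper in common/sort.c line for line):

    static inline void generic_swap(void *a, void *b, size_t size)
    {
        char *p = a, *q = b;

        while ( size-- )
        {
            char t = *p;

            *p++ = *q;
            *q++ = t;
        }
    }

With its definition visible next to sort(), the optimiser should be able to
fold the NULL check and specialise the byte loop for the element size a
given caller uses.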

Jan
