Re: [PATCH v2] x86/bitops: Optimise arch_ffs{,l}() some more on AMD
On 01/09/2025 3:26 pm, Jan Beulich wrote:
> On 01.09.2025 16:21, Andrew Cooper wrote:
>> On 27/08/2025 8:52 am, Jan Beulich wrote:
>>> On 26.08.2025 19:41, Andrew Cooper wrote:
>>>> --- a/xen/common/bitops.c
>>>> +++ b/xen/common/bitops.c
>>>> @@ -97,14 +97,14 @@ static void __init test_for_each_set_bit(void)
>>>> if ( ui != ui_res )
>>>> panic("for_each_set_bit(uint) expected %#x, got %#x\n", ui,
>>>> ui_res);
>>>>
>>>> - ul = HIDE(1UL << (BITS_PER_LONG - 1) | 1);
>>>> + ul = HIDE(1UL << (BITS_PER_LONG - 1) | 0x11);
>>>> for_each_set_bit ( i, ul )
>>>> ul_res |= 1UL << i;
>>>>
>>>> if ( ul != ul_res )
>>>> panic("for_each_set_bit(ulong) expected %#lx, got %#lx\n", ul,
>>>> ul_res);
>>>>
>>>> - ull = HIDE(0x8000000180000001ULL);
>>>> + ull = HIDE(0x8000000180000011ULL);
>>>> for_each_set_bit ( i, ull )
>>>> ull_res |= 1ULL << i;
>>> How do these changes make a difference? Apart from ffs() using TZCNT, ...
>>>
>>>> @@ -127,6 +127,79 @@ static void __init test_for_each_set_bit(void)
>>>> panic("for_each_set_bit(break) expected 0x1008, got %#x\n",
>>>> ui_res);
>>>> }
>>>>
>>>> +/*
>>>> + * A type-generic fls() which picks the appropriate fls{,l,64}() based on
>>>> it's
>>>> + * argument.
>>>> + */
>>>> +#define fls_g(x) \
>>>> + (sizeof(x) <= sizeof(int) ? fls(x) : \
>>>> + sizeof(x) <= sizeof(long) ? flsl(x) : \
>>>> + sizeof(x) <= sizeof(uint64_t) ? fls64(x) : \
>>>> + ({ BUILD_ERROR("fls_g() Bad input type"); 0; }))
>>>> +
>>>> +/*
>>>> + * for_each_set_bit_reverse() - Iterate over all set bits in a scalar
>>>> value,
>>>> + * from MSB to LSB.
>>>> + *
>>>> + * @iter An iterator name. Scoped is within the loop only.
>>>> + * @val A scalar value to iterate over.
>>>> + *
>>>> + * A copy of @val is taken internally.
>>>> + */
>>>> +#define for_each_set_bit_reverse(iter, val) \
>>>> + for ( typeof(val) __v = (val); __v; __v = 0 ) \
>>>> + for ( unsigned int (iter); \
>>>> + __v && ((iter) = fls_g(__v) - 1, true); \
>>>> + __clear_bit(iter, &__v) )
>>>> +
>>>> +/*
>>>> + * Xen doesn't have need of for_each_set_bit_reverse() at present, but the
>>>> + * construct does exercise a case of arch_fls*() not covered anywhere
>>>> else by
>>>> + * these tests.
>>>> + */
>>>> +static void __init test_for_each_set_bit_reverse(void)
>>>> +{
>>>> + unsigned int ui, ui_res = 0, tmp;
>>>> + unsigned long ul, ul_res = 0;
>>>> + uint64_t ull, ull_res = 0;
>>>> +
>>>> + ui = HIDE(0x80008001U);
>>>> + for_each_set_bit_reverse ( i, ui )
>>>> + ui_res |= 1U << i;
>>>> +
>>>> + if ( ui != ui_res )
>>>> + panic("for_each_set_bit_reverse(uint) expected %#x, got %#x\n",
>>>> ui, ui_res);
>>>> +
>>>> + ul = HIDE(1UL << (BITS_PER_LONG - 1) | 0x11);
>>>> + for_each_set_bit_reverse ( i, ul )
>>>> + ul_res |= 1UL << i;
>>>> +
>>>> + if ( ul != ul_res )
>>>> + panic("for_each_set_bit_reverse(ulong) expected %#lx, got
>>>> %#lx\n", ul, ul_res);
>>>> +
>>>> + ull = HIDE(0x8000000180000011ULL);
>>>> + for_each_set_bit_reverse ( i, ull )
>>>> + ull_res |= 1ULL << i;
>>> ... even here the need for the extra setting of bit 4 remains unclear to
>>> me: The thing that was missing was the testing of the reverse for-each.
>>> You mention the need for an asymmetric input in the description, but isn't
>>> that covered already by the first test using 0x80008001U?
>> The first test covers {arch_,}f[fl]s() only. It happens to be safe against
>> count-from-the-wrong-end bugs, but that was by chance.
>>
>> The second test covers {arch_,}f[fl]sl() only. They are unsafe WRT
>> symmetry, and disjoint (coverage wise) from the first test.
>>
>> The third test, while the same as the second test on x86, isn't the same
>> on arm32.
>>
>>
>> Just because one test happens to be safe (symmetry wise) and passes,
>> that doesn't make the other variants tested.
> Hmm, okay, it is of course in principle possible that one flavor is screwed
> while the other isn't.
>
> Acked-by: Jan Beulich <jbeulich@xxxxxxxx>
Thanks, but I now have both R-by and A-by from you on this patch. Which
would you prefer?
~Andrew